CoAuthor - Githubissues

CoAuthor
https://coauthor.stanford.edu/

This dataset contains cursor-level interactions between human writers and GPT-3 in an agent-assisted writing session. There is no given task, so we propose a new one: given the text written thus far and the start of a text insertion/deletion, predict the rest of the text-insert/delete block. This has pragmatic value: if GPT-3 can better predict how humans will fill in the gaps, the user will use its suggestions more often, increasing productivity.

X: (text written thus far, first inserted/deleted character, whether the character was inserted or deleted)
Y: rest of the inserted/deleted text

Domains:
Author ID (each author had multiple sessions, and each session has multiple text insertions/deletions)
Prompt ID

Domain Shifts:
Covariate Shift: p(edits|author) doesn't change, but p(author) does.
Concept Shift: As an author becomes more familiar with interacting with GPT-3, p(edits|author) may change as they anticipate agent behavior.
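
A minimal sketch of how the proposed (X, Y) pairs might be built from a stream of character-level events. The event schema here is hypothetical and is not CoAuthor's actual log format: each event is an `(op, char)` tuple with `op` equal to "insert" or "delete". The sketch also assumes that all edits happen at the end of the text and that a run of consecutive events with the same `op` forms one block. The names `Example` and `build_examples` are made up for illustration.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, List, Tuple

# Hypothetical event schema: (op, char), op is "insert" or "delete".
Event = Tuple[str, str]


@dataclass
class Example:
    context: str     # X: text written before the block starts
    first_char: str  # X: first inserted/deleted character
    op: str          # X: "insert" or "delete"
    target: str      # Y: remaining characters of the block


def build_examples(events: Iterable[Event]) -> List[Example]:
    """Group consecutive same-op events into blocks and emit one example per block."""
    text = ""
    examples: List[Example] = []
    for op, group in groupby(events, key=lambda e: e[0]):
        chars = [c for _, c in group]
        examples.append(Example(text, chars[0], op, "".join(chars[1:])))
        # Replay the block so the next example sees the updated text.
        # Assumption: edits only happen at the end of the text.
        if op == "insert":
            text += "".join(chars)
        else:
            text = text[: max(0, len(text) - len(chars))]
    return examples
```

For instance, typing "helo", deleting the final "o", and then typing "lo" yields three examples: one insert block, one delete block, and one more insert block whose context is "hel". Grouping these examples by author ID or prompt ID would give the domains described above.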

shreyashankar / streams

CoAuthor #1