chep / copilot-chat.el

Chat with GitHub Copilot in Emacs!

Tolerate truncation of `data: ` prefix in curl streaming #17

Closed huonw closed 1 month ago

huonw commented 1 month ago

This fixes #16 by making `curl-analyze-copilot-response` more tolerant of the underlying data being truncated at arbitrary points. This should result in fewer (ideally no) `error in process filter: Args out of range: "data:", 6, nil` errors, and fewer dropped updates.


As you may recall, the protocol seems to send a series of lines like:

data: {"choices":[{...,"delta":{"content":"great"}}],...}

data: {"choices":[{...,"delta":{"content":"work"}}],...}

data: [DONE]

With the curl-based streaming, these lines are cut up into chunks at arbitrary boundaries and passed to `curl-analyze-copilot-response`. The boundaries are truly arbitrary. For instance, in #16 there's an example where one string value ends with `dat` and the next one starts with `a: {...}`. That is, the `data: ` prefix itself can be split across chunks, not just the JSON payload that follows it.
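To illustrate why arbitrary chunk boundaries matter, here is a minimal sketch of carrying an unterminated tail over to the next chunk. `sketch-split-chunk` is a hypothetical helper written for this explanation, not code from copilot-chat.el, and it assumes segments are separated by plain `\n`.

```elisp
;; Hypothetical helper for illustration only; not part of copilot-chat.el.
;; Assumes segments are separated by a plain newline character.
(defun sketch-split-chunk (pending chunk)
  "Prepend PENDING to CHUNK and split the result at newlines.
Return (LINES . REST): LINES are the newline-terminated segments,
REST is the unterminated tail to keep for the next chunk."
  (let* ((text (concat pending chunk))
         ;; With an explicit separator, `split-string' keeps empty
         ;; strings, so text ending in a newline yields a final "".
         (parts (split-string text "\n"))
         (rest (car (last parts)))
         (lines (butlast parts)))
    (cons lines rest)))
```

With this helper, `(sketch-split-chunk "" "data: [DONE]\n\ndat")` returns `(("data: [DONE]" "") . "dat")`, and feeding the leftover into the next call, `(sketch-split-chunk "dat" "a: [DONE]\n")`, returns `(("data: [DONE]") . "")`, so the split prefix is reassembled before it is parsed.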

This PR makes the processing more resilient by pulling the logic for parsing an individual line out into a new `copilot-chat--extract-segment` function.

This function categorises each line-delimited "segment" as one of `'empty`, `'partial`, `'done`, or the parsed JSON data. This separation makes it easy to have a catch-all (`(t 'partial)`), and allows `curl-analyze-copilot-response` to call `(setq copilot-chat-last-data segment)` in just one place.
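A rough sketch of such a classifier, under stated assumptions, might look like the following. `sketch-extract-segment` is a hypothetical stand-in and not the actual body of `copilot-chat--extract-segment`; it assumes the prefix is exactly `data: ` and uses `json-read-from-string` from the built-in `json` library, treating any parse error as a truncated segment.

```elisp
(require 'json)

;; Hypothetical stand-in for illustration; the real
;; `copilot-chat--extract-segment' may be structured differently.
(defun sketch-extract-segment (segment)
  "Classify SEGMENT as `empty', `partial', `done', or parsed JSON data."
  (cond
   ;; Blank separator line between events.
   ((string= segment "") 'empty)
   ;; Prefix missing or cut short (e.g. "dat" or "data:"): wait for more.
   ((not (string-prefix-p "data: " segment)) 'partial)
   ;; End-of-stream marker.
   ((string= segment "data: [DONE]") 'done)
   (t
    ;; Parse the payload after the 6-character prefix; a truncated
    ;; payload signals a JSON error, which is treated as partial.
    (let ((json-object-type 'alist)
          (json-key-type 'symbol))
      (condition-case nil
          (json-read-from-string (substring segment 6))
        (error 'partial))))))
```

In this sketch a caller would only need to remember the segment for later when the result is `partial`, which is the kind of single-place bookkeeping described above.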

I've tested this locally, and it seems like Copilot is no longer dropping tokens at all.

chep commented 1 month ago

Thank you, I can't test this week. I will have a look next week.