Closed: KarimAziev closed this 6 months ago
Thanks for the PR! I'll take a look at it soon.
A general suggestion (having not examined the code yet): I'd like to avoid creating temporary files unless absolutely necessary -- especially since it's only an issue when the payload is large. Would you consider doing this conditionally, based on the data size?
Sure, that's a good idea. I've made the following update to address your feedback: I introduced a new `gptel-curl-args-file-threshold` variable, defaulting to 130KB, to control when data gets written to a temporary file.

The code looks great, including the detailed docstring. Thank you!
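A threshold variable like the one described might be sketched roughly as follows (this is illustrative, not the PR's actual code; the default value comes from the thread, but the docstring wording is an assumption):

```elisp
;; Sketch of the threshold variable discussed above; the exact
;; docstring and type spec in the PR may differ.
(defcustom gptel-curl-args-file-threshold 130000
  "Size in bytes above which request data is passed to Curl via a file.
Payloads smaller than this are passed directly on the command line;
larger ones are written to a temporary file to avoid the operating
system's argument-length limit."
  :type 'integer
  :group 'gptel)
```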
I think we can find a better name for `gptel-curl-args-file-threshold`, one that's marginally shorter and, hopefully, gives a better idea of its purpose. What do you think of:

- `gptel-curl-min-file-size`
- `gptel-curl-file-size-threshold`
- `gptel-curl-max-size`

I'm open to other suggestions as well.
Oh, when using temp files we should also clean up after ourselves and delete the file. One way to do this would be to add (inside the conditional) a function to `gptel-post-response-hook`. This function should delete the temp file and remove itself from the hook.
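The self-removing hook suggested above could be sketched like this (only `gptel-post-response-hook` is gptel's own; the helper name is hypothetical, and the PR's actual implementation may differ):

```elisp
;; Sketch only: builds a hook function that deletes DATA-FILE and
;; then removes itself from `gptel-post-response-hook'.  Requires
;; lexical binding so the closure can refer to itself via `letrec'.
(defun my/gptel-curl-cleanup-fn (data-file)
  "Return a function that deletes DATA-FILE, then removes itself."
  (letrec ((cleanup
            (lambda (&rest _)
              (when (file-exists-p data-file)
                (delete-file data-file))
              (remove-hook 'gptel-post-response-hook cleanup))))
    cleanup))

;; Inside the size-conditional branch, after writing TEMP-FILE:
;; (add-hook 'gptel-post-response-hook
;;           (my/gptel-curl-cleanup-fn temp-file))
```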
Thank you for the feedback! I've made the following adjustments in response to your suggestions:

- The variable `gptel-curl-args-file-threshold` has been renamed to `gptel-curl-file-size-threshold` to better reflect its purpose.
- I have implemented a cleanup mechanism for the temporary files created when a Curl request exceeds the size threshold. A lambda function is now added to `gptel-post-response-hook` to ensure the temporary file is removed after the Curl request completes; this function removes itself from the hook after cleanup.

I hope these adjustments align with your expectations.
Looks great! Thank you @KarimAziev :+1:
Firstly, I'd like to express my gratitude for the excellent `gptel` package.

Recently, with the release of newer models boasting a 128k context (e.g., `gpt-4-1106-preview`), I attempted to utilize these capabilities with a sizable amount of data. However, I hit a stumbling block when `gptel-stream-response` failed with a Curl error stating that the "argument list [was] too long." Upon investigation, I found that the issue arose from passing a large payload directly to Curl via command-line arguments. To overcome this limitation, I have implemented a solution that writes the JSON-encoded data to a temporary file and instructs Curl to transmit this data using the `--data-binary` flag, which points to said file. This eliminates the problem of exceeding the maximum argument length that most operating systems impose on command-line inputs.

Changes Made: The method of sending data to Curl has been modified. Instead of including the data directly in the command-line arguments with the `-d` option, the data is now written to a temporary JSON file, and Curl is instructed to read from that file using the `--data-binary` option.

Technical Details:
- A temporary file is created with `make-temp-file` to hold the encoded JSON data.
- The `--data-binary` Curl option is used to send the data by reading from the temporary file, thus avoiding the system's command-line length limitations.
- The `gptel-curl--get-args` function is adjusted accordingly.

This change ensures that users can work with larger data payloads without encountering Curl's argument size constraints while sending requests to the larger and more advanced GPT-4 models.
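Putting the pieces together, the size-conditional dispatch described here might look roughly like this (a sketch under assumptions: `make-temp-file` and the `--data-binary`/`-d` flags are real, but the encoder name `my/json-encode` and the surrounding structure of `gptel-curl--get-args` are illustrative):

```elisp
;; Rough sketch of the conditional, not the PR's exact code.
;; `make-temp-file' with a TEXT argument writes the string to the file.
(let* ((json-data (my/json-encode request))  ; hypothetical encoder
       (use-file (> (string-bytes json-data)
                    gptel-curl-file-size-threshold)))
  (if use-file
      ;; Large payload: write it to a temp file and have Curl read it.
      ;; The "@" prefix tells Curl to read the data from a file.
      (let ((temp-file (make-temp-file "gptel-curl" nil ".json" json-data)))
        (list "--data-binary" (concat "@" temp-file)))
    ;; Small payload: pass it directly on the command line as before.
    (list "-d" json-data)))
```

Using `string-bytes` rather than `length` matters here, since the OS argument-length limit is measured in bytes and multibyte characters can encode to more than one byte each.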
Thank you once again for your efforts in maintaining this package.