maurobender opened this issue 1 month ago
I tried using the env variable TIKTOKEN_CACHE_DIR pointing to /tmp, since it seems it can be used to set up the temporary cache dir, but with no success: it still creates the temp file in the read-only filesystem.
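Roughly what I tried, as a minimal sketch (the handler module and function names are just for illustration, not from PR-Agent):

```python
# lambda_handler.py -- illustrative only
import os

# /tmp is the only writable path inside a Lambda function, so point the
# tiktoken cache there before anything imports tiktoken or litellm.
os.environ["TIKTOKEN_CACHE_DIR"] = "/tmp"

from litellm import completion  # noqa: E402


def handler(event, context):
    resp = completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hello"}],
    )
    return resp.choices[0].message.content
```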
We need the new tiktoken for the GPT-4o model, so this new requirement will stay.
You can always use a previous release of PR-Agent without the new tiktoken, but this seems to be a problem with AWS Lambda that should be solved one way or another. Maybe contact them, or try other workarounds.
If you do find a workaround, please share it and we will add it to the docs.
I did some debugging trying to find a workaround, and the only one I found was setting TIKTOKEN_CACHE_DIR. That didn't work either, because litellm overwrites that environment variable (see this PR: https://github.com/BerriAI/litellm/pull/1947), so I'm unable to pass a writable directory on AWS Lambda.
This is clearly related to this other litellm issue: https://github.com/BerriAI/litellm/issues/2607. I'll also report it there to see if any fix is coming. Until this is fixed, I guess AWS Lambda users will not be able to use the latest gpt-4o model with PR-Agent.
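For anyone who wants to experiment: since litellm overwrites the variable when it is imported, one thing that might be worth trying (no guarantee, it depends on when tiktoken first reads the variable) is re-pointing it after the litellm import:

```python
import os

import litellm  # on import, litellm points TIKTOKEN_CACHE_DIR at its bundled (read-only) directory

# Re-point the cache to the only writable location on Lambda *after* the import
# but before the first completion call, so tiktoken has somewhere it can write.
os.environ["TIKTOKEN_CACHE_DIR"] = "/tmp"
```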
Any update on this? I'm running into the same issue here. gpt-4o was working fine via Lambda and litellm up until today for me, without any changes on my side. I tried setting TIKTOKEN_CACHE_DIR=/tmp like @maurobender suggested, but still no luck (I see the PR is closed, so maybe I'm setting that up wrong?).
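A quick sanity check that might help here (just a sketch): log the variable before and after importing litellm, to see whether the value set on the Lambda function is being overwritten at import time.

```python
import logging
import os

logging.warning("TIKTOKEN_CACHE_DIR before litellm import: %s",
                os.environ.get("TIKTOKEN_CACHE_DIR"))

import litellm  # noqa: E402,F401

logging.warning("TIKTOKEN_CACHE_DIR after litellm import: %s",
                os.environ.get("TIKTOKEN_CACHE_DIR"))
```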
Full stacktrace and implementation below for details:
[Errno 30] Read-only file system: '/var/task/litellm/llms/tokenizers/fb374d419588a4632f3f557e76b4b70aebbca790.34e78ae1-d4b3-4aac-9531-e4d4ee7de201.tmp'
```python
import logging
from typing import List

from litellm import completion

...

    def _get_openai_response(self, messages: List[dict]):
        # Stream the completion from litellm; this is the call where the
        # "[Errno 30] Read-only file system" error surfaces.
        stream = completion(
            model="gpt-4o",
            messages=messages,
            stream=True,
            stop=None,
        )
        for chunk in stream:
            response_content = chunk.choices[0].delta.content
            if response_content:
                yield response_content

    # ... (calling code, elsewhere in the same class)
        try:
            messages = self._prepare_messages(data)
            for response_content in self._get_openai_response(messages):
                self._update_responses(data, response_content)
                self.s3_service.create_or_update_request(data)
                complete_response += response_content
            # once done, set status to completed and save to s3
            data.status = RequestStatus.COMPLETE
            self.s3_service.create_or_update_request(data)
        except Exception as e:
            logging.error(f"Error in querying OpenAI: {e}")
            raise RuntimeError(f"Error in querying OpenAI: {e}")
```
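If it helps narrow things down, my guess (an assumption, I haven't traced it in litellm's source) is that the write happens when the gpt-4o tiktoken encoding is first loaded for token counting, so it can probably be reproduced without a real completion call:

```python
# Minimal repro sketch, assuming the tokenizer load is the trigger.
from litellm import token_counter

# On a read-only filesystem this should raise the same "[Errno 30]" error,
# because loading the gpt-4o encoding makes tiktoken write a cache file.
n = token_counter(model="gpt-4o", messages=[{"role": "user", "content": "hello"}])
print(n)
```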
@mrT23 I think we can fix it by updating litellm from 1.31.10 to v1.40.9 and setting CUSTOM_TIKTOKEN_CACHE_DIR (not TIKTOKEN_CACHE_DIR) to /tmp:
https://github.com/BerriAI/litellm/issues/2607#issuecomment-2162860092
https://github.com/BerriAI/litellm/pull/4119 was released in litellm v1.40.9, so we can use the environment variable CUSTOM_TIKTOKEN_CACHE_DIR.
PR-Agent currently pins litellm==1.31.10: https://github.com/Codium-ai/pr-agent/blob/v0.22/requirements.txt#L12
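The idea, roughly (a sketch, assuming litellm >= 1.40.9 reads the variable when the module is imported, so it has to be in place first):

```python
import os

# Point litellm's tiktoken cache at the only writable path on Lambda.
# Setting this in the Lambda function's environment configuration works too,
# as long as it is visible before litellm is imported.
os.environ["CUSTOM_TIKTOKEN_CACHE_DIR"] = "/tmp"

import litellm  # noqa: E402,F401
```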
@mrT23 This was marked as fixed, but I don't see a commit changing litellm as suggested.
Is this still planned, or is it unlikely to happen soon?
@s1moe2 My bad. We had a PR to upgrade litellm last week, but it did not reach that version.
Anyway, upgrading to the latest litellm is not trivial, but I think I was able to resolve all the dependencies: https://github.com/Codium-ai/pr-agent/pull/989
We deployed a new Docker image. You can retry using this suggestion: https://github.com/Codium-ai/pr-agent/issues/909#issuecomment-2162887300 and let us know if it works now.
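For anyone retrying, one way to confirm the cache really moved (a sketch; it uses token_counter so no API key is needed, and it assumes the tiktoken cache files land directly under the configured directory):

```python
import os

os.environ["CUSTOM_TIKTOKEN_CACHE_DIR"] = "/tmp"

from litellm import token_counter  # noqa: E402

token_counter(model="gpt-4o", messages=[{"role": "user", "content": "ping"}])
# If the workaround took effect, tiktoken cache files should now show up here.
print(os.listdir("/tmp"))
```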
Using the latest version from main with model gpt-4o throws the following error when running on AWS Lambda:
Stack trace
requirements.txt