maurobender opened this issue 1 month ago
I tried using the env variable TIKTOKEN_CACHE_DIR pointing to /tmp, since it seems it can be used to set up the temporary cache dir, but with no success: it still creates the temp file in the read-only filesystem.
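Roughly what I tried, as a minimal sketch (the handler module and function names are just for illustration, not from PR-Agent):

```python
# lambda_handler.py -- illustrative only
import os

# /tmp is the only writable path inside a Lambda function, so point the
# tiktoken cache there before anything imports tiktoken or litellm.
os.environ["TIKTOKEN_CACHE_DIR"] = "/tmp"

from litellm import completion  # noqa: E402


def handler(event, context):
    resp = completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "hello"}],
    )
    return resp.choices[0].message.content
```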
We need the new tiktoken for the GPT-4o model, so this new requirement will stay.
You can always use a previous release of PR-Agent without the new tiktoken, but this seems to be a problem with AWS Lambda that should be solved one way or another. Maybe contact them, or try other workarounds.
If you do find a workaround, please share it and we will add it to the docs.
I did some debugging trying to find a workaround, and the only one I found was setting TIKTOKEN_CACHE_DIR. That didn't work either, because litellm overwrites that environment variable (see this PR: https://github.com/BerriAI/litellm/pull/1947), so I'm unable to pass a writable directory on AWS Lambda.
This is clearly related to this other litellm issue: https://github.com/BerriAI/litellm/issues/2607. I'll also report it there to see if any fix is coming. Until this is fixed, I guess AWS Lambda users will not be able to use the latest gpt-4o model with PR-Agent.
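For anyone who wants to experiment: since litellm overwrites the variable when it is imported, one thing that might be worth trying (no guarantee, it depends on when tiktoken first reads the variable) is re-pointing it after the litellm import:

```python
import os

import litellm  # on import, litellm points TIKTOKEN_CACHE_DIR at its bundled (read-only) directory

# Re-point the cache to the only writable location on Lambda *after* the import
# but before the first completion call, so tiktoken has somewhere it can write.
os.environ["TIKTOKEN_CACHE_DIR"] = "/tmp"
```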
Any update on this? I'm running into the same issue here. gpt-4o was working fine via Lambda and litellm up until today for me, without any changes on my side. I tried setting TIKTOKEN_CACHE_DIR=/tmp like @maurobender suggested, but still no luck (I see the PR is closed, so maybe I'm setting that up wrong?).
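A quick sanity check that might help here (just a sketch): log the variable before and after importing litellm, to see whether the value set on the Lambda function is being overwritten at import time.

```python
import logging
import os

logging.warning("TIKTOKEN_CACHE_DIR before litellm import: %s",
                os.environ.get("TIKTOKEN_CACHE_DIR"))

import litellm  # noqa: E402,F401

logging.warning("TIKTOKEN_CACHE_DIR after litellm import: %s",
                os.environ.get("TIKTOKEN_CACHE_DIR"))
```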
Full stacktrace and implementation below for details:
[Errno 30] Read-only file system: '/var/task/litellm/llms/tokenizers/fb374d419588a4632f3f557e76b4b70aebbca790.34e78ae1-d4b3-4aac-9531-e4d4ee7de201.tmp'
```python
import logging
from typing import List

from litellm import completion

...

    def _get_openai_response(self, messages: List[dict]):
        # Stream the completion from litellm; this is the call where the
        # "[Errno 30] Read-only file system" error surfaces.
        stream = completion(
            model="gpt-4o",
            messages=messages,
            stream=True,
            stop=None,
        )
        for chunk in stream:
            response_content = chunk.choices[0].delta.content
            if response_content:
                yield response_content

    # ... (calling code, elsewhere in the same class)
        try:
            messages = self._prepare_messages(data)
            for response_content in self._get_openai_response(messages):
                self._update_responses(data, response_content)
                self.s3_service.create_or_update_request(data)
                complete_response += response_content
            # once done, set status to completed and save to s3
            data.status = RequestStatus.COMPLETE
            self.s3_service.create_or_update_request(data)
        except Exception as e:
            logging.error(f"Error in querying OpenAI: {e}")
            raise RuntimeError(f"Error in querying OpenAI: {e}")
```
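If it helps narrow things down, my guess (an assumption, I haven't traced it in litellm's source) is that the write happens when the gpt-4o tiktoken encoding is first loaded for token counting, so it can probably be reproduced without a real completion call:

```python
# Minimal repro sketch, assuming the tokenizer load is the trigger.
from litellm import token_counter

# On a read-only filesystem this should raise the same "[Errno 30]" error,
# because loading the gpt-4o encoding makes tiktoken write a cache file.
n = token_counter(model="gpt-4o", messages=[{"role": "user", "content": "hello"}])
print(n)
```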
@mrT23 I think we can fix it by updating litellm from 1.31.10 to v1.40.9 and setting CUSTOM_TIKTOKEN_CACHE_DIR (not TIKTOKEN_CACHE_DIR) to /tmp:
https://github.com/BerriAI/litellm/issues/2607#issuecomment-2162860092
https://github.com/BerriAI/litellm/pull/4119 was released in litellm v1.40.9, so we can use the environment variable CUSTOM_TIKTOKEN_CACHE_DIR.
PR-Agent currently pins litellm==1.31.10: https://github.com/Codium-ai/pr-agent/blob/v0.22/requirements.txt#L12
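The idea, roughly (a sketch, assuming litellm >= 1.40.9 reads the variable when the module is imported, so it has to be in place first):

```python
import os

# Point litellm's tiktoken cache at the only writable path on Lambda.
# Setting this in the Lambda function's environment configuration works too,
# as long as it is visible before litellm is imported.
os.environ["CUSTOM_TIKTOKEN_CACHE_DIR"] = "/tmp"

import litellm  # noqa: E402,F401
```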
@mrT23 This was marked as fixed, but I don't see a commit changing litellm as suggested.
Is this still planned, or is it unlikely to happen soon?
@s1moe2 My bad. We had a PR to upgrade litellm last week, but it did not reach that version.
Anyway, upgrading to the latest litellm is not trivial, but I think I was able to resolve all the dependencies: https://github.com/Codium-ai/pr-agent/pull/989
We deployed a new Docker image. You can retry using this suggestion: https://github.com/Codium-ai/pr-agent/issues/909#issuecomment-2162887300 and let us know if it works now.
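For anyone retrying, one way to confirm the cache really moved (a sketch; it uses token_counter so no API key is needed, and it assumes the tiktoken cache files land directly under the configured directory):

```python
import os

os.environ["CUSTOM_TIKTOKEN_CACHE_DIR"] = "/tmp"

from litellm import token_counter  # noqa: E402

token_counter(model="gpt-4o", messages=[{"role": "user", "content": "ping"}])
# If the workaround took effect, tiktoken cache files should now show up here.
print(os.listdir("/tmp"))
```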
Using the latest version from main with model gpt-4o throws the following error when running on AWS Lambda:
Stack trace
requirements.txt