Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Bug: Support Read-Only Filesystems #479

Closed · metaskills closed this issue 2 months ago

metaskills commented 2 months ago

Contact Details

ken@unremarkable.ai

What happened?

On systems such as AWS Lambda or locked-down K8s infrastructure, the filesystem is not writable. I was able to disable logging, yet starting the server always writes to the $HOME/.llamafile directory. I can fix this on Lambda (I think) by setting $HOME in the startup script. Perhaps there should be something like a LLAMAFILE_HOME environment variable?
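
For reference, something like this in the startup script is the workaround I have in mind (the model path and flags are illustrative):

#!/bin/sh
# /tmp is the only writable path on Lambda, so point HOME there;
# the server will then create its ~/.llamafile cache under /tmp.
export HOME=/tmp
exec /opt/Phi-3-mini-4k-instruct.F16.llamafile \
  --nobrowser \
  --log-disable \
  --host 127.0.0.1 \
  --port 8080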

Version

Latest.

What operating system are you seeing the problem on?

No response

Relevant log output

No response

jart commented 2 months ago

Assuming you've installed APE Loader, llamafile should support a fully read-only filesystem on Linux. Could you please clarify what you mean? Are you talking about GPU support? Even if ~/.llamafile can't be accessed, I've confirmed locally that llamafile will fall back safely to CPU inference mode. One thing you could do is run llamafile on your Linux workstation and copy/install the .llamafile folder it creates onto your AWS servers. If it exists and matches, there'll be no need for llamafile to create it.
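
For example, a hypothetical Dockerfile sketch of that approach (base image and paths are illustrative, not a tested recipe):

FROM debian:bookworm-slim
# Bake in the model plus a ~/.llamafile directory that was generated by
# running the same llamafile once on a Linux workstation, so nothing
# needs to be created at runtime.
COPY Phi-3-mini-4k-instruct.F16.llamafile /opt/
COPY .llamafile /root/.llamafile
RUN chmod +x /opt/Phi-3-mini-4k-instruct.F16.llamafile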

metaskills commented 2 months ago

Hey @jart, thanks for the reply. Basically all I am doing is adding a .llamafile (e.g. Phi-3-mini-4k-instruct.F16.llamafile) to a container and starting the JSON API. When doing so it creates ~/.llamafile, which is the source of my issue. I solved this by setting HOME=/tmp/llamafile before my command, and that worked for me. I had no idea I could copy the generated files into the container. Both seem like acceptable solutions. Think I should close the issue?

jart commented 2 months ago

Are you passing the -ngl flag?

metaskills commented 2 months ago

Sorry for missing that; I am running something like this:

HOME=/tmp /opt/Phi-3-mini-4k-instruct.F16.llamafile \
  --nobrowser \
  --log-disable \
  --host 127.0.0.1 \
  --port 8080

No -ngl in there; I had not seen that setting before. 🤔

metaskills commented 2 months ago

OK, I just had a very cool Claude 3.5 Sonnet conversation with the -h output, and I'm now seeing tons of configs that can help with my Lambda project's cold starts and inference speed. But the configs in my API server example above, along with the HOME workaround, were the ones that felt most on topic for the read-only filesystem. Hope that helps. Thanks!

metaskills commented 2 months ago

I think I see what you're asking. If I do add the -ngl 0 flag, it will try to create ~/.llamafile, but I was not passing that. Here is the read-only error for me:

/opt/Phi-3-mini-4k-instruct.Q2_K.llamafile: 69: cannot create ./.ape-1.10.34: Read-only file system

metaskills commented 2 months ago

I was able to avoid the APE self-extraction by following the gotchas when building my container:

RUN wget -O /usr/bin/ape https://cosmo.zip/pub/cosmos/bin/ape-$(uname -m).elf && \
    chmod +x /usr/bin/ape
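
Putting both workarounds together, here is a minimal sketch of a container that should run llamafile on a read-only root filesystem (base image, model name, and port are assumptions, not a tested recipe):

FROM debian:bookworm-slim
# Install the APE loader so the llamafile header doesn't try to
# self-extract ./.ape-* at runtime (the read-only error above).
RUN apt-get update && apt-get install -y --no-install-recommends wget ca-certificates && \
    wget -O /usr/bin/ape https://cosmo.zip/pub/cosmos/bin/ape-$(uname -m).elf && \
    chmod +x /usr/bin/ape
COPY Phi-3-mini-4k-instruct.F16.llamafile /opt/
RUN chmod +x /opt/Phi-3-mini-4k-instruct.F16.llamafile
# Point HOME at writable tmpfs so the ~/.llamafile cache lands
# somewhere writable.
ENV HOME=/tmp
# Shell form so /bin/sh runs the APE header, which finds /usr/bin/ape.
CMD /opt/Phi-3-mini-4k-instruct.F16.llamafile --nobrowser --log-disable --host 0.0.0.0 --port 8080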