Open cat-state opened 11 months ago
Good point @cat-state. We may go even further by answering the question: who is the audience of this project?
I see three groups:
The first group may be mostly interested in the verification of authorship (Does this binary really come from a trusted researcher?). The second group wants to be aware that they are trading security for privacy (external REST-like API cannot execute arbitrary code). The last group may be interested in not installing ransomware while chatting with AI.
I think the easiest workaround is to put the executable inside a Docker container, but I'm unsure about the details.
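For anyone who wants to experiment with the container idea, here is a minimal sketch, not a vetted recipe. The `debian:stable-slim` image, the `/model.llamafile` mount point, the `run_sandboxed` name, and the assumption that the binary starts under `/bin/sh` inside the container are all mine, not something confirmed in this thread. The Docker flags themselves are standard ones.

```shell
#!/bin/sh
# Sketch: run an untrusted llamafile inside a locked-down container.
# Assumptions (not from this thread): the debian:stable-slim image, the
# /model.llamafile mount point, and that the file starts under /bin/sh.
run_sandboxed() {
    llamafile_path=$1   # must be an absolute host path for the bind mount
    shift
    # Set DRY_RUN=1 to print the docker command instead of executing it.
    # --network none      : no network access at all (so no web UI either)
    # --read-only         : container root filesystem is immutable
    # --tmpfs /tmp        : writable scratch space; Docker mounts tmpfs
    #                       noexec by default, use /tmp:exec if required
    # --cap-drop ALL      : drop every Linux capability
    # no-new-privileges   : block setuid-style privilege escalation
    ${DRY_RUN:+echo} docker run --rm \
        --network none \
        --read-only \
        --tmpfs /tmp \
        --cap-drop ALL \
        --security-opt no-new-privileges \
        -v "$llamafile_path:/model.llamafile:ro" \
        debian:stable-slim \
        /bin/sh /model.llamafile "$@"
}
```

Calling `DRY_RUN=1 run_sandboxed /abs/path/to/xyz.llamafile` prints the command so it can be reviewed first. With `--network none` only command-line use is possible; exposing the server would mean publishing a port and giving the container a network, which weakens the isolation.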
Thank you for this feedback! We agree this is a concern. We’ve personally validated and can vouch for the example wrapped weights we provide with this project. Our next step (soon!) is to add a tool that users can use to validate that their llamafile’s llama.cpp bits come from a source they trust. There is also functionality in Cosmopolitan (pledge(), specifically) that we are considering using to further secure things. We will update this issue soon with more details.
Just adding to this... the current llamafile-server-0.2.1 exe file is showing up as a malicious file when scanned with VirusTotal. I would like to try it out... but won't trust it until this is cleared up.
https://www.virustotal.com/gui/file/2b3c692e50d903cbf6ac3d8908f8394101b5be5f8a4573b472975fa8c9f09e68
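VirusTotal file pages are keyed by the file's SHA-256, so the hex string at the end of that URL can be compared against a local download to confirm it is the same file that was scanned. A minimal sketch follows; the function names are made up for illustration, and it assumes either `sha256sum` or `shasum` is installed. A match only shows the bytes are identical, not that the file is safe.

```shell
#!/bin/sh
# Print the SHA-256 of a file using whichever common tool is installed.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum < "$1" | cut -d ' ' -f 1
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 < "$1" | cut -d ' ' -f 1
    else
        echo "no SHA-256 tool found" >&2
        return 2
    fi
}

# Return 0 only if the file's digest equals the expected hex string.
verify_sha256() {
    file=$1
    # Normalise the expected digest to lowercase before comparing.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    actual=$(sha256_of "$file") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches $expected"
    else
        echo "MISMATCH: $file has $actual" >&2
        return 1
    fi
}
```

For example, assuming the download was saved as `llamafile-server-0.2.1`, running `verify_sha256 llamafile-server-0.2.1 2b3c692e50d903cbf6ac3d8908f8394101b5be5f8a4573b472975fa8c9f09e68` checks it against the hash in the link above.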
@geoffsmith82 Only 7 out of 70 AVs are reporting false positives. That's actually a pretty good score. Please consider that llamafile uses a novel polyglot executable format we designed in order to help more people run LLMs on their personal computers. If you need to wait for a consensus to emerge in the AV industry that polyglot file formats are good, before you can try llamafile, then you could be waiting for quite some time. We're happy to file reports with AVs that have submission forms like Microsoft, so please tell us if you ever have issues with Windows Defender (which is green in the link you provided) but I can't offer you any assurances regarding the other 69.
FYI, AVG reported llamafile-server-0.3 as infected with FileRepMalware (even though AVG did not flag the file in the report above).
I sent a false positive report to them with a link to this issue and a reference to the polyglot file format.
@JoshuaCWebDeveloper If AVG white-labels Windows Defender then try updating to the latest malware definitions. I submitted the 0.3 release to Microsoft Security Intelligence earlier today (see https://www.microsoft.com/en-us/wdsi/submission/a81e9778-a046-46c9-8221-ed18ede17850) due to a Windows Defender issue. The ticket got resolved. Therefore I believe AV issues with 0.3 to be resolved.
@cat-state how is this ticket going? Has your question been resolved?
Llamafile is a great convenience because it bundles the inference code with the weights. However, in its current form it offers users less security than untrusted safetensors/gguf weights combined with a separately downloaded (trusted) model implementation. If llamafile takes off, users will be running arbitrary executables downloaded from HF and built by unknown people, which opens a security hole: anyone could ship a backdoored llama.cpp implementation.
What would be the best way to address these security concerns?
One option would be to replace `wget huggingface.com/.../xyz.llamafile && ./xyz.llamafile` with `curl llamaup.com/up | bash` and `llamafile huggingface.com/.../xyz.gguf`, like the "separately downloaded llamafile server" example in the docs.