alexrozanski / llama.swift

Fork of llama.cpp, supporting Facebook's LLaMA model in Swift
MIT License

Update llama.cpp to master-ea60007 #6

Closed Schaltfehler closed 1 year ago

Schaltfehler commented 1 year ago

Quantization formats have changed and current GGML models can't be loaded anymore. This updates the llama.cpp code base to https://github.com/ggerganov/llama.cpp/commit/ea600071cb005267e9e8f2629c1e406dd5fde083 from May 20th.
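For context, a rough sketch of what the format change means for users (the paths and the `q4_0` type name here are illustrative, assuming llama.cpp's `quantize` tool at this revision): old quantized GGML files can't be converted in place and need to be regenerated from the original f16 model.

```shell
# Sketch, paths hypothetical: after the quantization format change,
# re-create quantized models from the original f16 GGML file.
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
```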

Not sure if useful or if you have some other plans. Feel free to close if not needed :-)

alexrozanski commented 1 year ago

hey @Schaltfehler, thanks so much for doing this, this is great!

I haven't had much time to look at llama.cpp for the past couple of weeks but I think it would be best if this were to be merged into the LLaMA plugin for CameLLM instead, as that's where the main development is going to be happening from now on, and it's also what v2.0 of LlamaChat is being based on.

If you have some time to port this to the CameLLM-Llama repo that would be awesome, otherwise I'll try to do this soon.

Schaltfehler commented 1 year ago

Sure, let me take a look and give it a try. Just curious about the future of llama.swift: do you plan to deprecate or archive this repo then? I thought it might be useful to keep this one close to llama.cpp. Btw, the upcoming Swift 5.9 should bring the first wave of Swift/C++ interoperability, which should make the whole C++ wrapping story much simpler :-)
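As a rough illustration of that interop point (a sketch under assumptions: the module name `llama` is hypothetical, and `llama_init_from_file` reflects llama.cpp's C API around this revision, not llama.swift's actual wrapper):

```swift
// Sketch only: with Swift 5.9's C++ interop mode enabled
// (.interoperabilityMode(.Cxx) in Package.swift), llama.cpp's C++
// internals could be imported directly instead of going through a
// hand-written C shim. The module name `llama` is hypothetical.
import llama

func loadContext(modelPath: String) -> OpaquePointer? {
    var params = llama_context_default_params()
    params.n_ctx = 512
    // llama_init_from_file was the loading entry point in the C API
    // around this era of llama.cpp; it returns nil on failure.
    return llama_init_from_file(modelPath, params)
}
```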

alexrozanski commented 1 year ago

@Schaltfehler my plan was to archive this repo and move over to CameLLM, but thinking about this more, it might be better to keep this as an alternative which is more closely aligned with llama.cpp, as you say. I'll think on this some more! Having it listed among the llama.cpp forks is also quite valuable for discovery, and CameLLM's setup is a little more involved since the aim is to be able to run non-LLaMA-architected models as well.

alexrozanski commented 1 year ago

Looks like running this with some of the old models results in a C++ end-of-file exception; need to dig into this a bit more. There's been a lot of churn in the llama.cpp file formats, but ideally we shouldn't crash with older models.
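One way to avoid the crash described above (a sketch, not the actual llama.swift code: `load_unchecked` and `safe_load_model` are hypothetical names) is to catch the C++ exception at the C boundary, since exceptions thrown by the loader on truncated or old-format files can't propagate safely into Swift:

```cpp
#include <stdexcept>
#include <cstdio>

struct llama_context;  // opaque, stands in for the real context type

// Hypothetical loader that throws on a bad file, as the real
// llama.cpp loader may when it hits an unexpected end of file.
static llama_context *load_unchecked(const char *path) {
    throw std::runtime_error("unexpectedly reached end of file");
}

// Wrap the load at the C boundary so a bad model file becomes a
// recoverable nullptr on the Swift side instead of a hard crash.
extern "C" llama_context *safe_load_model(const char *path) {
    try {
        return load_unchecked(path);
    } catch (const std::exception &e) {
        std::fprintf(stderr, "model load failed: %s\n", e.what());
        return nullptr;  // Swift sees nil, not an uncaught exception
    }
}
```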