unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0
18.5k stars 1.3k forks

Megalodonian models #620

Open rezzie-rich opened 5 months ago

rezzie-rich commented 5 months ago

Similar to mistral-fing LLM, it would be great if we could get Megalodonian models based on Meta's Megalodon.

https://github.com/XuezheMax/megalodon

It is said to be weak on recall. However, it should be a great fit for agent frameworks, since agents tend to work better with larger context windows (in this case, unlimited), and most of them integrate a short- and long-term memory system to help with recall.

danielhanchen commented 5 months ago

Hmm, it'll be supported once we add automatic support for all models!

rezzie-rich commented 3 months ago

https://huggingface.co/papers/2404.08801

Is there any ongoing plan to convert models with the Megalodon architecture?

This may not be very useful for a standalone LLM interface, but for AI agents this could be the biggest breakthrough, as they have separate memory management.

danielhanchen commented 3 months ago

Currently not, sorry :(