Closed — samsja closed this issue 1 year ago
Context
We use llama-7b in the current codebase to run training: https://github.com/jina-ai/jerboa/blob/6b1ea8bf14a75cfc6d7fb095f7d1b60258afb52f/alpaca-lora/finetune.py#L113

However, the 7b model is quite big and is not suited to running tests against the codebase. It would be nice to have a tiny, untrained LLaMA model that can serve debugging purposes. Ideally, we would just take the llama-7b architecture and keep only 2 attention heads, and/or reduce the embedding size. The goal is to be able to run the script (in debug mode) on CPU in a few dozen seconds with this tiny model.
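A minimal sketch of how such a tiny model could be built with the `transformers` library, assuming we keep the LLaMA architecture but shrink the config. The exact sizes below (2 layers, 2 heads, hidden size 64) are illustrative assumptions, not values from this issue:

```python
# Sketch: a tiny, randomly initialized LLaMA for debugging.
# All sizes are illustrative assumptions; only the architecture matches llama-7b.
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,       # keep the real tokenizer vocabulary
    hidden_size=64,         # down from 4096 in llama-7b
    intermediate_size=128,  # down from 11008
    num_hidden_layers=2,    # down from 32
    num_attention_heads=2,  # down from 32
)
tiny_model = LlamaForCausalLM(config)  # random weights, never trained

n_params = sum(p.numel() for p in tiny_model.parameters())
print(f"tiny model parameters: {n_params:,}")  # a few million vs ~7 billion
```

Since the weights are random, the model is useless for generation, but it exercises the same code paths (tokenization, forward pass, LoRA injection) fast enough to run in debug mode on CPU.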