I wish you could also provide Llama models in .pte format for download. It would be really convenient for anyone who is not interested in fine-tuning the model to be able to run those models locally without 64 GB of RAM. Exporting to a binary is usually a one-off activity, but it comes with a peak in memory demand. Many thanks!