Closed Dudu014 closed 4 months ago
I am currently using a Jetson Orin Nano Developer Kit 8GB to run LLaMA2_7B_chat_awq_int4.
Everything built fine. However, when running ./chat the process gets killed. By monitoring resources, I can tell this is due to running out of memory.
To improve performance, I disabled ZWAP and allocated swap memory on the NVMe drive.
The performance improved but not enough to run the model.
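For anyone hitting the same kill, here is a minimal sketch of a pre-launch check that compares free RAM plus free swap against what the model is expected to need. It assumes a Linux system exposing /proc/meminfo. The helper names `parse_meminfo`, `read_meminfo`, and `headroom_gib` are made up for illustration and are not part of TinyChatEngine, and the required size you pass in is your own estimate, not a figure from the README.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for sized fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory report (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def headroom_gib(info, required_gib):
    """Return (MemAvailable + SwapFree) minus the required size, in GiB.

    A negative result suggests the process is likely to be killed for
    lack of memory. required_gib is the caller's estimate (an assumption),
    not a measured requirement of the model.
    """
    available_kb = info.get("MemAvailable", 0) + info.get("SwapFree", 0)
    return available_kb / (1024 ** 2) - required_gib
```

Calling `headroom_gib(read_meminfo(), 4.0)` right before `./chat` gives a rough idea of whether the run can fit; the 4.0 here is only a placeholder. Memory served from swap will still be much slower than RAM, so a positive headroom does not guarantee usable performance.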
Based on the README, the Jetson Orin Nano should be able to run the lightest model, as per the news entry from 2024/01.
Is there anything I should consider to be able to run it? If it is not possible, are there any guidelines or a way to load a lighter model into TinyChatEngine?
Thanks in advance!
I was able to run it after rebooting the Jetson Orin Nano, so the problem is solved.