I am looking to run a local LLM (large language model) on an NVIDIA Jetson AGX Orin using the GPU's CUDA cores. Could anyone provide guidance or share resources on how to achieve this?
I was able to run a local LLM (a .gguf model) on the CPU, but I have been unable to get it to use the GPU.
Thank you in advance for your help!