
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

Inference model on CPU #16

Closed timurnasyrow closed 2 months ago

timurnasyrow commented 4 months ago

First of all, thank you for your article and for sharing the results. I'm trying to run inference with the model on a CPU-only device, so I modified the model loading code as below:

```python
ckpt = torch.load("lag-llama.ckpt", map_location=torch.device('cpu'))
```

But when I create the lightning module, I still get an error:

```python
lightning_module = estimator.create_lightning_module()
```

```
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
```

guanarp commented 4 months ago

It has to do with how the lightning module is loaded from the checkpoint. There's a PR with a quick fix: just change that line (see the PR).
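For reference, the kind of one-line change such a fix usually amounts to is passing `map_location` when the lightning module is restored from its checkpoint. The sketch below is illustrative only, assuming PyTorch Lightning's standard `load_from_checkpoint` API; `TinyModule` and the checkpoint path are placeholders, not Lag-Llama's actual class or the PR's exact diff.

```python
import torch
import pytorch_lightning as pl


class TinyModule(pl.LightningModule):
    """Minimal stand-in for the lightning module the estimator restores (illustration only)."""

    def __init__(self, hidden: int = 8):
        super().__init__()
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(hidden, 1)


# The essential change: pass map_location so storages saved on a GPU are
# remapped to CPU when the checkpoint is deserialized on a CPU-only machine.
module = TinyModule.load_from_checkpoint(
    "some_checkpoint.ckpt",  # hypothetical checkpoint path
    map_location=torch.device("cpu"),
)
```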

ashok-arjun commented 3 months ago

Hi @timurnasyrow, apologies, CPU support has been on the table for a long time but is yet to be implemented. We'll implement it soon and let you know here. Thanks.

ashok-arjun commented 2 months ago

Hi, you can try it now. We now support passing the device to the estimator object: https://github.com/time-series-foundation-models/lag-llama/blob/1dbe107b6933332b2fbc9a46eda411c793573492/lag_llama/gluon/estimator.py#L144

You can use device=torch.device("cpu") for CPU.
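For anyone landing here later, here is a minimal sketch of CPU inference using that `device` argument, assuming the `LagLlamaEstimator` constructor from the file linked above. The context/prediction lengths and the hyperparameter keys read from the checkpoint are illustrative assumptions, not required values.

```python
import torch
from lag_llama.gluon.estimator import LagLlamaEstimator

# Load the checkpoint on CPU to read the stored model hyperparameters.
ckpt = torch.load("lag-llama.ckpt", map_location=torch.device("cpu"))
estimator_args = ckpt["hyper_parameters"]["model_kwargs"]  # assumed checkpoint layout

estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=24,        # illustrative forecast horizon
    context_length=32,           # illustrative context window
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    device=torch.device("cpu"),  # the new device argument
)

# With device set to CPU, this should no longer raise the CUDA deserialization error.
lightning_module = estimator.create_lightning_module()
```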