time-series-foundation-models / lag-llama

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

Problem with test-set inference on a single machine with multiple GPUs #105

Open mayii2001 opened 22 hours ago

mayii2001 commented 22 hours ago

Hi, thanks for your great work! Your code is built on PyTorch Lightning. When I deployed the model on a single machine with multiple GPUs, Lightning spawned several global processes, which is necessary for training acceleration but causes a problem when testing. For example, I loaded a test set of length 1k, but the predictions returned by make_evaluation_predictions() came back with quadruple that length; it seems each DDP process runs the full test set, so with four GPUs every result is duplicated four times. I think this is the main reason my inference is very slow, which did not happen on the validation set.

The Lightning documentation recommends using Trainer(devices=1) for testing. I tried initializing a new trainer as below, but it raised TypeError: model must be a LightningModule or torch._dynamo.OptimizedModule, got LagLlamaLightningModule. I don't know how to fix this.

```python
from lightning.pytorch import Trainer  # or pytorch_lightning, depending on the install
from lag_llama.gluon.estimator import LagLlamaEstimator

model = LagLlamaEstimator()  # constructor arguments omitted
single_device_trainer = Trainer(devices=1, max_epochs=1)
# Fails with: TypeError: model must be a LightningModule or
# torch._dynamo.OptimizedModule, got LagLlamaLightningModule
pre_results = single_device_trainer.test(model=model.network, dataloaders=test_loader)
```
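One plausible explanation for this TypeError, offered here as an editorial note rather than something confirmed in the thread: Lightning 2.x ships the same classes under two namespaces, lightning.pytorch and pytorch_lightning, and a Trainer imported from one rejects a LightningModule subclassed from the other, because the isinstance check does not cross namespaces. If that is what is happening here, building the Trainer from the same package that LagLlamaLightningModule inherits from should let trainer.test() accept the module. A minimal sketch, where test_loader is the dataloader from the snippet above and create_lightning_module() is the standard GluonTS PyTorchLightningEstimator hook:

```python
import pytorch_lightning as pl  # pick the namespace the module actually subclasses
from lag_llama.gluon.estimator import LagLlamaEstimator

estimator = LagLlamaEstimator()  # constructor arguments omitted, as in the snippet above
module = estimator.create_lightning_module()

# Inspect the MRO to see whether the module derives from
# pytorch_lightning.LightningModule or lightning.pytorch.LightningModule,
# then import Trainer from that same package.
print(type(module).__mro__)

trainer = pl.Trainer(devices=1, max_epochs=1)  # single device: no DDP, no duplicated test results
results = trainer.test(model=module, dataloaders=test_loader)
```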
ashok-arjun commented 8 hours ago

Hi, I'm not sure about Lightning internals. Predictions with our model are slow in general, and become even slower as the prediction length increases.
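For completeness, the repository's demo notebook runs inference through GluonTS's predictor interface rather than a Lightning Trainer, which keeps everything in a single process and avoids the multi-GPU duplication described above. A sketch along those lines, assuming a downloaded lag-llama.ckpt and a GluonTS-format test_dataset (both assumptions here); the estimator keyword arguments mirror the demo and are read from the hyperparameters stored in the checkpoint:

```python
import torch
from gluonts.evaluation import make_evaluation_predictions
from lag_llama.gluon.estimator import LagLlamaEstimator

# Read the model hyperparameters stored in the checkpoint, as the repo's demo does.
ckpt = torch.load("lag-llama.ckpt", map_location="cuda:0")
estimator_args = ckpt["hyper_parameters"]["model_kwargs"]

estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=24,  # your forecast horizon
    context_length=32,     # your context length
    input_size=estimator_args["input_size"],
    n_layer=estimator_args["n_layer"],
    n_embd_per_head=estimator_args["n_embd_per_head"],
    n_head=estimator_args["n_head"],
    scaling=estimator_args["scaling"],
    time_feat=estimator_args["time_feat"],
)

# The predictor path stays in a single process; no Trainer involved.
predictor = estimator.create_predictor(
    estimator.create_transformation(),
    estimator.create_lightning_module(),
)

# num_samples trades forecast fidelity against inference time.
forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_dataset, predictor=predictor, num_samples=20
)
forecasts = list(forecast_it)
```

Since sampling cost grows with both the forecast horizon and the number of sample paths, lowering num_samples is the most direct way to trade probabilistic-forecast fidelity for speed, which bears on the slowness noted above.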