microsoft / BatteryML


Using the model to predict after train and evaluation #31

Open Compphsy opened 3 months ago

Compphsy commented 3 months ago

Hi,

I was wondering if anyone could give instructions for using a model after it has been trained?

I would like to test two trained models against each other on the same dataset after they have been trained.

For example, train a few models on CLO and then test them on the HUST/CALCE data, etc.

agiamason commented 3 months ago

We are sorry you ran into this issue. Our current code supports this kind of operation; you can refer to the following code to complete your experiments:

# import pipeline
from batteryml.pipeline import Pipeline

# create first pipeline to train model on CLO
pipeline1 = Pipeline(config_path='configs/baselines/sklearn/variance_model/clo.yaml', workspace='workspaces')

# train model on CLO dataset
model, dataset = pipeline1.train(device='cuda', skip_if_executed=False)

# Then, you will find a new .ckpt file under your workspaces folder

# create second pipeline to evaluate model on HUST
pipeline2 = Pipeline(config_path='configs/baselines/sklearn/variance_model/hust.yaml', workspace='workspaces')

# evaluate model on HUST dataset; replace ckpt_to_resume with the checkpoint produced by pipeline1,
# and you will get the RMSE on the HUST test data
pipeline2.evaluate(ckpt_to_resume='./workspaces/20240423132818.ckpt', skip_if_executed=False)
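
If you also want to test against CALCE, you can evaluate the same checkpoint in a loop over several configs. This is just a sketch; the calce.yaml path below is an assumption, so point the list at whichever config files you actually have:

# Sketch: evaluate one trained checkpoint against several datasets.
# The calce.yaml entry is an assumed example -- adjust to your local configs.
from batteryml.pipeline import Pipeline

ckpt = './workspaces/20240423132818.ckpt'  # checkpoint written by pipeline1.train above
eval_configs = [
    'configs/baselines/sklearn/variance_model/hust.yaml',
    'configs/baselines/sklearn/variance_model/calce.yaml',  # assumed to exist
]

for config in eval_configs:
    pipeline = Pipeline(config_path=config, workspace='workspaces')
    # reports the RMSE of the CLO-trained model on this dataset's test split
    pipeline.evaluate(ckpt_to_resume=ckpt, skip_if_executed=False)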

I hope my answer solves your problem; if you have any questions, feel free to leave a comment.

We've been working on the docs lately, and we'd love to hear more from you!

Compphsy commented 3 months ago

Hi @agiamason,

I tried your implementation but I got the following error:

Seed is set to 0.
Reading train data: 0it [00:00, ?it/s]
Reading test data: 0it [00:00, ?it/s]
Extracting features: 0it [00:00, ?it/s]

RuntimeError                              Traceback (most recent call last)
in ()
      2
      3 # evaluate model on HUST dataset, replace the ckpt_to_resume with the model checkpoint of pipeline1, and you will get the RMSE on HUST test data
----> 4 pipeline2.evaluate( ckpt_to_resume='/content/BatteryML/workspaces/rf_clo_new/20240424213213.ckpt', skip_if_executed=False)

3 frames
/content/BatteryML/batteryml/feature/base.py in __call__(self, cells)
     18         for i, cell in enumerate(pbar):
     19             features.append(self.process_cell(cell))
---> 20         features = torch.stack(features)
     21         return features.float()
     22
RuntimeError: stack expects a non-empty TensorList

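For what it's worth, the last line of the traceback reproduces with any empty list of tensors, so it looks like no cells reached the feature extractor at all:

import torch

# torch.stack on an empty list raises exactly the error above
torch.stack([])  # RuntimeError: stack expects a non-empty TensorList
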
agiamason commented 3 months ago

This looks like your HUST data has not been successfully converted to features. Could you check whether your HUST data has been downloaded and preprocessed successfully, like the CLO data?
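
As a quick sanity check, you could verify that the processed HUST cells actually exist before calling evaluate. This is only a sketch; the data/processed/HUST path and the *.pkl extension are assumptions about the local layout, so adjust them to whatever your hust.yaml config points to:

# Sanity-check sketch: confirm the preprocessed HUST cells are on disk.
# 'data/processed/HUST' and '*.pkl' are assumed paths -- use the ones from your hust.yaml.
from pathlib import Path

hust_dir = Path('data/processed/HUST')
cells = sorted(hust_dir.glob('*.pkl'))
print(f'Found {len(cells)} processed HUST cells in {hust_dir}')
# An empty list here would explain "stack expects a non-empty TensorList":
# the feature extractor has no cells to stack.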

Cb824 commented 2 months ago

Hi @agiamason,

I have tried your implementation and it seems to have worked; however, when I try to print the new prediction vs. ground truth graph, it looks identical to what was displayed before, so I don't think I have done this correctly. I have a single PLSR model that was trained on the HUST and CLO datasets.

# import pipeline
from batteryml.pipeline import Pipeline

# create first pipeline to train model on HUST
pipeline1 = Pipeline(config_path='configs/baselines/sklearn/plsr/hust.yaml', workspace='workspaces')

# train model on HUST dataset (left commented out here)
# model, dataset = pipeline1.train(device='cuda', skip_if_executed=False)

# Then, you will find a new .ckpt file under your workspaces folder

# create second pipeline to evaluate model on CLO
pipeline2 = Pipeline(config_path='configs/baselines/sklearn/plsr/clo.yaml', workspace='workspaces')

# evaluate the HUST-trained checkpoint on the CLO dataset
pipeline2.evaluate(ckpt_to_resume='/content/BatteryML/workspaces/plsr/20240501030343.ckpt', skip_if_executed=False)

# get raw data from pipeline
train_cells, test_cells = pipeline2.raw_data['train_cells'], pipeline2.raw_data['test_cells']
prediction2 = model.predict(dataset, data_type='test').to('cpu')
ground_truth2 = dataset.test_data.label.to('cpu')
plot_result(ground_truth2, prediction2,'plsr')
result.append([method, train_loss, test_loss])

[Screenshot 2024-05-01 050204: prediction vs. ground truth plot]

Ideally, I want to be able to load the pretrained models from my Google Drive and test them after the initial training so I can compare them, but the code above just gives me the same graph as the model trained on HUST, instead of the updated prediction and ground truth values for the CLO data.
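
For reference, this is roughly how I intend to load a checkpoint from Drive in Colab; the MyDrive path is just an example of my own folder layout, not a path from the repo:

# Rough sketch: load a saved checkpoint from Google Drive in Colab.
# The path under MyDrive is an example of my layout, not something from BatteryML.
from google.colab import drive
drive.mount('/content/drive')

ckpt_path = '/content/drive/MyDrive/batteryml_ckpts/plsr_hust.ckpt'  # example path
pipeline2.evaluate(ckpt_to_resume=ckpt_path, skip_if_executed=False)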

Thanks for any help you can give.

fingertap commented 1 month ago

> however when I try to print the new prediction vs. ground truth graph, it looks identical to what was displayed before

Can you provide more details?