timeseriesAI / tsai

State-of-the-art Deep Learning library for Time Series and Sequences in Pytorch / fastai
https://timeseriesai.github.io/tsai/
Apache License 2.0

Can't load learner #702

Closed Unreal9er closed 1 year ago

Unreal9er commented 1 year ago

Hello tsai-community,

I am currently trying to build a live forecasting pipeline with tsai. In Colab everything works fine, but I can't get it running locally.

I get the following error:

[screenshot]

I have already tried the fixes suggested online, used different machines, and reinstalled Python and my environments, but I couldn't solve the issue in 4 hours.

I really appreciate any help.


I tried this on two different Windows 10 PCs.

Here is the complete code:

%pip install tsai -U

import pathlib
temp = pathlib.PosixPath
pathlib.PosixPath = pathlib.WindowsPath

from tsai.all import *

PATH = Path('learner.pkl')
learn = load_learner(PATH, cpu=False)


ModuleNotFoundError                       Traceback (most recent call last)
Cell In[4], line 2
      1 PATH = Path('learner.pkl')
----> 2 learn = load_learner(PATH, cpu=False)

File ~\AppData\Roaming\Python\Python39\site-packages\fastai\learner.py:446, in load_learner(fname, cpu, pickle_module)
    444 distrib_barrier()
    445 map_loc = 'cpu' if cpu else default_device()
--> 446 try: res = torch.load(fname, map_location=map_loc, pickle_module=pickle_module)
    447 except AttributeError as e:
    448 e.args = [f"Custom classes or functions exported with your Learner not available in namespace.\Re-declare/import before loading:\n\t{e.args[0]}"]

File ~\AppData\Roaming\Python\Python39\site-packages\torch\serialization.py:809, in load(f, map_location, pickle_module, weights_only, pickle_load_args)
    807 except RuntimeError as e:
    808 raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
--> 809 return _load(opened_zipfile, map_location, pickle_module, pickle_load_args)
    810 if weights_only:
    811 try:

File ~\AppData\Roaming\Python\Python39\site-packages\torch\serialization.py:1172, in _load(zip_file, map_location, pickle_module, pickle_file, pickle_load_args)
   1170 unpickler = UnpicklerWrapper(data_file, pickle_load_args)
   1171 unpickler.persistent_load = persistent_load
-> 1172 result = unpickler.load()
   1174 torch._utils._validate_loaded_sparse_tensors()
...
   1163 pass
   1164 mod_name = load_module_mapping.get(mod_name, mod_name)
-> 1165 return super().find_class(mod_name, name)

ModuleNotFoundError: No module named 'ipykernel.codeutil'
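The traceback ends in `find_class`, so the pickle names a module that cannot be imported locally. One way to see which modules a pickle expects, without unpickling it, is to scan its opcodes with the standard-library `pickletools`. The following is a minimal sketch, not part of tsai or fastai: the function names are hypothetical, and it assumes the file is either a plain pickle or a zip archive of the kind `torch.save` writes, with the pickled object stored in an entry ending in `data.pkl`.

```python
import pickle
import pickletools
import zipfile
from collections import OrderedDict

_STRING_OPS = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}
_PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
_GET_OPS = {"GET", "BINGET", "LONG_BINGET"}


def referenced_globals(pickle_bytes):
    """Return the (module, name) pairs a pickle stream would import on load."""
    found = set()
    memo = {}          # memo index -> string constant (or None for other objects)
    prev = top = None  # the two most recent string constants on the stack
    for opcode, arg, _pos in pickletools.genops(pickle_bytes):
        name = opcode.name
        if name in ("GLOBAL", "INST"):
            # Older protocols carry "module name" directly in the opcode argument.
            module, attr = arg.split(" ", 1)
            found.add((module, attr))
            prev = top = None
        elif name == "STACK_GLOBAL":
            # Protocol 4+ pushes module and name as two strings first.
            if isinstance(prev, str) and isinstance(top, str):
                found.add((prev, top))
            prev = top = None
        elif name in _STRING_OPS:
            prev, top = top, arg
        elif name in _GET_OPS:
            prev, top = top, memo.get(arg)
        elif name == "MEMOIZE":
            memo[len(memo)] = top
        elif name in _PUT_OPS:
            memo[arg] = top
        elif name == "FRAME":
            pass  # framing does not touch the stack
        else:
            prev = top = None  # any other opcode: stop tracking strings
    return found


def referenced_globals_in_file(path):
    """Scan a plain pickle file, or every *data.pkl entry of a zip archive."""
    if zipfile.is_zipfile(path):
        found = set()
        with zipfile.ZipFile(path) as archive:
            for entry in archive.namelist():
                if entry.endswith("data.pkl"):
                    found |= referenced_globals(archive.read(entry))
        return found
    with open(path, "rb") as handle:
        return referenced_globals(handle.read())


# Small self-check on an in-memory pickle.
payload = pickle.dumps(OrderedDict(a=1), protocol=2)
print(sorted(referenced_globals(payload)))
```

Calling `referenced_globals_in_file('learner.pkl')` and looking at the module column should show whether the exported learner really refers to `ipykernel.codeutil`, which would point at the training environment rather than the local installation.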

oguiza commented 1 year ago

Hi @Unreal9er, are you using the same tsai version for training and inference?

Unreal9er commented 1 year ago

Hi @oguiza, yes, I always use the newest stable version.

oguiza commented 1 year ago

I've never seen this type of issue before. I'm not sure what's causing this. Based on the reported error, it doesn't seem to be a tsai-related error. I've checked the fastai issue tracker and there's a somewhat open issue: https://github.com/fastai/fastai/issues/3815. Have you used any custom code (function) during training? Have you tried pip installing ipykernel?

Unreal9er commented 1 year ago

@oguiza Thank you for looking it up; I hadn't found a similar error myself. No, it is just a basic InceptionTime model and SlidingWindow data. Yes, ipykernel reinstalls without any error, and the import causes no problem either.

I will test whether the issue persists if I train the model locally.

Is there another way of loading a model?

oguiza commented 1 year ago

There is a transfer_weights function with the following API:

transfer_weights(model, weights_path:Path, device:torch.device=None, exclude_head:bool=True)

You can recreate the learner and load the weights, but I'm not sure if or how you'll be able to access the weights. Also, this approach would only work if you didn't use any batch_transform that randomly selects data from a batch. For example, TSStandardize() uses the first random batch.
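The general idea of recreating a model and loading only its weights can be sketched in plain PyTorch. This is an illustration under assumptions, not tsai's `transfer_weights`: it uses `state_dict` / `load_state_dict` directly, and `build_model` is a hypothetical stand-in for whatever architecture was actually trained (for example InceptionTime). It only works if the weights were saved separately on the training side.

```python
import os
import tempfile

import torch
from torch import nn


def build_model():
    # Hypothetical stand-in for the real architecture; both sides must build
    # exactly the same layers so the parameter names and shapes match.
    return nn.Sequential(
        nn.Conv1d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
        nn.Linear(4, 2),
    )


with tempfile.TemporaryDirectory() as tmp:
    weights_path = os.path.join(tmp, "weights.pth")

    # Training side: save only the tensors, not the pickled learner object.
    trained = build_model()
    torch.save(trained.state_dict(), weights_path)

    # Inference side: rebuild the architecture, then load the tensors into it.
    restored = build_model()
    restored.load_state_dict(torch.load(weights_path, map_location="cpu"))

restored.eval()  # switch to inference mode before predicting
```

Because a state dict contains only tensors, loading it does not need to import any of the classes or functions that were attached to the learner, which is why this route can sidestep pickling problems.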

Unreal9er commented 1 year ago

Thanks a lot. I will try it on Monday. I didn't actually use any batch transforms, so I guess that would work.

Unreal9er commented 1 year ago

When trying to replicate the error locally by saving the learner and loading it directly afterwards, I get the following error from the save_learner() and save_learner_all() functions. Maybe it has the same root cause as the load_learner() problem:

[screenshot]

With pickle_method = dill, the save_learner() functions work, but load_learner() shows a new error:

[screenshot]

The issue appears even though I load the learner immediately after saving it, and I don't see how a custom function or class could have become unavailable. So the error message isn't very helpful.
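When a save fails and the message does not say which part of the object is responsible, one generic way to narrow it down is to pickle each attribute separately and collect the failures. The sketch below uses only the standard library; `find_unpicklable` and `FakeLearner` are hypothetical names for illustration and are not fastai or tsai classes.

```python
import pickle


def find_unpicklable(obj, pickle_module=pickle):
    """Return {attribute name: error text} for attributes of obj that fail to pickle."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle_module.dumps(value)
        except Exception as exc:  # pickling errors come in several exception types
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


class FakeLearner:
    """Hypothetical stand-in for a learner, used only to demonstrate the check."""

    def __init__(self):
        self.lr = 1e-3
        self.layers = [8, 16, 2]
        # A lambda cannot be pickled by the standard pickle module.
        self.loss_func = lambda pred, target: abs(pred - target)


report = find_unpicklable(FakeLearner())
for name, error in report.items():
    print(f"{name}: {error}")
```

Passing a real learner instead of `FakeLearner()` would list the attributes that cannot be serialized in the current environment. Note that this only checks the save side; an attribute that pickles fine can still fail to load elsewhere if the other environment lacks a module it refers to.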

Unreal9er commented 1 year ago

It is finally solved!

This is the imposter:

[screenshot]

The LabelSmoothingCrossEntropyFlat() loss function causes the error both when you try to save the learner locally and when you open a learner exported from Google Colab in the local environment. Possibly all flattened loss functions cause this problem, but I only tested it additionally with FocalLossFlat().

@oguiza Thanks a lot for your help and for your work on this package. Should I reopen this in the Issues tab?

Kind regards

oguiza commented 1 year ago

Hi @Unreal9er, I'm glad that worked out. As to the LabelSmoothingCrossEntropyFlat issue, I think it'd be good to log an issue with fastai. This has nothing to do with tsai. You can try to reproduce the issue using:

import torch
from fastai.losses import LabelSmoothingCrossEntropyFlat

loss_func = LabelSmoothingCrossEntropyFlat()
torch.save(loss_func, "test.pkl")

Depending on the environment you use (and I don't know on what exactly it depends), you may be able to reproduce the issue.

But there's nothing I can do from the tsai library.

Unreal9er commented 1 year ago

Yes, that actually reproduces the issue.

I installed the newest torch version and it showed a different error this time. It seems to be an installation issue: a .dll is missing.

Thank you.