ahoarfrost / fastBio

Deep learning library for biological sequences. Extension of Fastai and Pytorch.
MIT License

Error when trying fastBio/Tutorial.ipynb #1

Closed. frank-chris closed this issue 2 years ago

frank-chris commented 2 years ago

I tried running the section titled "I don't want to deal with all the databunch/training stuff. What if I really just want to make a handful of predictions on some data with a pretrained model?" from fastBio/Tutorial.ipynb and got an AttributeError: 'LSTM' object has no attribute '_flat_weights_names' on the line where I called load_learner().

Full error message:

/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.loss.CrossEntropyLoss' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.learner.MultiBatchEncoder' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.models.awd_lstm.AWD_LSTM' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.sparse.Embedding' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.models.awd_lstm.EmbeddingDropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.container.ModuleList' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.models.awd_lstm.WeightDropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.rnn.LSTM' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.models.awd_lstm.RNNDropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'fastai.text.learner.PoolingLinearClassifier' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.container.Sequential' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.dropout.Dropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.linear.Linear' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/chris/.local/lib/python3.8/site-packages/torch/serialization.py:656: SourceChangeWarning: source code of class 'torch.nn.modules.activation.ReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):                                                                        
  File "looking_glass.py", line 11, in <module>
    func_clf = load_learner(Path('./models').resolve(), 'OxidoreductaseClassifier_export.pkl')
  File "/home/chris/miniconda3/lib/python3.8/site-packages/fastai/basic_train.py", line 605, in load_learner
    res = clas_func(data, model, **state)
  File "/home/chris/miniconda3/lib/python3.8/site-packages/fastai/text/learner.py", line 52, in __init__
    super().__init__(data, model, metrics=metrics, **learn_kwargs)
  File "<string>", line 19, in __init__
  File "/home/chris/miniconda3/lib/python3.8/site-packages/fastai/basic_train.py", line 166, in __post_init__
    self.model = self.model.to(self.data.device)
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 673, in to
    return self._apply(convert)
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 184, in _apply
    self._flat_weights = [(lambda wn: getattr(self, wn) if hasattr(self, wn) else None)(wn) for wn in self._flat_weights_names]
  File "/home/chris/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 947, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LSTM' object has no attribute '_flat_weights_names'

The piece of code I ran:

# Download the pretrained oxidoreductase classifier to the 'models' folder
import urllib.request
from pathlib import Path  # needed for Path() in the load_learner call below
urllib.request.urlretrieve('https://github.com/ahoarfrost/LookingGlass/releases/download/v1.0/OxidoreductaseClassifier_export.pkl',
                           'models/OxidoreductaseClassifier_export.pkl')
# Load the model (with an empty databunch)
from fastai.text import load_learner
oxido = load_learner(Path('./models').resolve(), 'OxidoreductaseClassifier_export.pkl')

Please let me know if you know a fix for this.

Thanks, frank-chris

frank-chris commented 2 years ago

I was able to fix this by downgrading to pytorch 1.2.

Please let me know which version of pytorch should be used with fastBio.

Thanks, frank-chris
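
A note on why the downgrade helps: the traceback shows the installed torch/nn/modules/rnn.py reading self._flat_weights_names, an attribute the pickled LSTM object evidently does not carry. Unpickling normally restores an object's saved attributes without re-running __init__, so an object saved under older library code can be missing attributes that newer code expects. The sketch below reproduces that pattern with a toy class. It is only an illustration: toy_lib, Layer, weight_names, and install are hypothetical names, not PyTorch internals.

```python
import pickle
import sys
import types

# Hypothetical stand-in for an installed library module.
toy_lib = types.ModuleType("toy_lib")
sys.modules["toy_lib"] = toy_lib


class OldLayer:
    """The class as defined by the 'old' library version."""

    def __init__(self):
        self.weights = [0.1, 0.2]


class NewLayer:
    """The class as defined by the 'new' version: one extra attribute."""

    def __init__(self):
        self.weights = [0.1, 0.2]
        self.weight_names = ["w0", "w1"]

    def describe(self):
        # Newer code assumes every instance has weight_names.
        return list(self.weight_names)


def install(cls):
    """Expose cls as toy_lib.Layer, as if that library version were installed."""
    cls.__module__ = "toy_lib"
    cls.__qualname__ = cls.__name__ = "Layer"
    toy_lib.Layer = cls


install(OldLayer)
payload = pickle.dumps(OldLayer())  # saved under the old version

install(NewLayer)                   # the library is then "upgraded"
restored = pickle.loads(payload)    # __init__ is not re-run on load

try:
    restored.describe()
    message = None
except AttributeError as err:
    message = str(err)

print(message)
```

The restored object is an instance of the new class but only has the attributes that were saved, which is the same shape of failure as the '_flat_weights_names' error above. Installing the library version the model was saved with avoids the gap.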

frank-chris commented 2 years ago

When I try using 'FunctionalClassifier_export.pkl', I get a ModuleNotFoundError.

Code I used:

import urllib.request
urllib.request.urlretrieve('https://github.com/ahoarfrost/LookingGlass/releases/download/v1.0/FunctionalClassifier_export.pkl', 
                            'models/FunctionalClassifier_export.pkl')
from fastai.text import load_learner
from pathlib import Path
func_clf = load_learner(Path('./models').resolve(), 'FunctionalClassifier_export.pkl')

Error Message:

ModuleNotFoundError                       Traceback (most recent call last)

<ipython-input-6-9f3741739752> in <module>()
      4 from pathlib import Path
      5 
----> 6 func_clf = load_learner(Path('./models').resolve(), 'FunctionalClassifier_export.pkl')

2 frames

/usr/local/lib/python3.7/dist-packages/fastai/basic_train.py in load_learner(path, file, test, **db_kwargs)
    596     "Load a `Learner` object saved with `export_state` in `path/file` with empty data, optionally add `test` and load on `cpu`. `file` can be file-like (file or buffer)"
    597     source = Path(path)/file if is_pathlike(file) else file
--> 598     state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
    599     model = state.pop('model')
    600     src = LabelLists.load_state(path, state.pop('data'))

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    384         f = f.open('rb')
    385     try:
--> 386         return _load(f, map_location, pickle_module, **pickle_load_args)
    387     finally:
    388         if new_fd:

/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    571     unpickler = pickle_module.Unpickler(f, **pickle_load_args)
    572     unpickler.persistent_load = persistent_load
--> 573     result = unpickler.load()
    574 
    575     deserialized_storage_keys = pickle_module.load(f, **pickle_load_args)

ModuleNotFoundError: No module named 'data'

I suspect that it has to do with the pytorch version being used. Let me know which pytorch version fastBio is made for.

ahoarfrost commented 2 years ago

Try using pytorch version 1.1.0. I believe this is an issue with fastai dependencies, but I think that should resolve it.

ahoarfrost commented 2 years ago

Looking at this more closely, I think your first error was due to the pytorch version (changing to 1.2.0 should have resolved it), but your second error was due to a bug in the LookingGlass release 1.0: I had exported some of the models from my original repo rather than from the fastBio python package, so the paths were messed up.
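
One way to read the "paths were messed up" explanation: pickle records each class by the module path it was defined under and has to import that path again at load time. If a model is exported from code whose classes live in a local module, loading it elsewhere fails with ModuleNotFoundError. The following is a minimal sketch of that mechanism, not the actual LookingGlass export code; export_only_module and Vocab are hypothetical names standing in for a local module such as data.

```python
import pickle
import sys
import types

# Hypothetical module that exists only where the model was exported.
export_only = types.ModuleType("export_only_module")


class Vocab:
    """Hypothetical helper class saved alongside a model."""

    def __init__(self, tokens):
        self.tokens = tokens


# Make the class look as if it were defined inside that module.
Vocab.__module__ = "export_only_module"
Vocab.__qualname__ = "Vocab"
export_only.Vocab = Vocab
sys.modules["export_only_module"] = export_only

# The pickle stores the reference "export_only_module.Vocab", not the class body.
payload = pickle.dumps(Vocab(["A", "C", "G", "T"]))

# Simulate a fresh environment in which that module is not importable.
del sys.modules["export_only_module"]

try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as err:
    error = err

print(repr(error))
```

Re-exporting from the installed package, as described in this comment, makes the stored module paths resolvable on any machine that has the package installed.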

I've re-exported the models and the LookingGlass repo should have functional exported models now. Try re-downloading the FunctionalClassifier and see if load_learner works now.

I also created a newer version of fastBio (0.1.7) which explicitly requires torch version 1.2.0, so you shouldn't have to downgrade after installing fastBio. It seems to work on the systems I've tried it on, but if you have time to run your code on a fresh install of fastBio I would appreciate the feedback!
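
For anyone checking a fresh install, here is a small sketch for confirming that the environment ended up with the pinned torch version. It assumes Python 3.8+ (for importlib.metadata); release_tuple, installed_version, and matches_pin are illustrative helper names, not part of fastBio.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse '1.2.0' or '1.2.0+cu92' into (1, 2, 0), ignoring build tags."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, required="1.2.0"):
    """True if the installed release equals the required release."""
    if installed is None:
        return False
    return release_tuple(installed) == release_tuple(required)


found = installed_version("torch")
print("torch installed:", found, "| matches 1.2.0 pin:", matches_pin(found))
```

The comparison ignores local build suffixes such as "+cu92", so a CUDA-specific build of 1.2.0 still counts as matching the pin.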

frank-chris commented 2 years ago

Hi,

Yes, I am now able to use FunctionalClassifier_export.pkl without the ModuleNotFoundError.

I also noticed that installing fastBio was now installing pytorch 1.2.0 as you mentioned.

Thanks a lot for fixing it.

There was another issue I found while trying to use:

framelearn3 = LookingGlassClassifier(data=framedata).load(pretrained_name='ReadingFrameClassifier', 
                                                                  pretrained=True, 
                                                                  pretrained_dir='models')

It works correctly when pretrained_name='ReadingFrameClassifier', but not when pretrained_name is 'FunctionalClassifier', 'OxidoreductaseClassifier', or 'OptimalTempClassifier'.

The error for one of the cases (OptimalTempClassifier) is:

RuntimeError                              Traceback (most recent call last)

<ipython-input-11-3614903e034e> in <module>()
     33 rf_clf = LookingGlassClassifier(data=framedata).load(pretrained_name='OptimalTempClassifier', 
     34                                                                   pretrained=True,
---> 35                                                                   pretrained_dir='models')

2 frames

/usr/local/lib/python3.7/dist-packages/fastBio/models.py in load(self, pretrained_name, pretrained, pretrained_dir)
    173 
    174             print('loading pretrained classifier from',str(Path(model_path/Path(pretrained_name+'.pth'))))
--> 175             learn.load(Path(model_path/Path(pretrained_name)).resolve())
    176             learn.freeze()
    177 

/usr/local/lib/python3.7/dist-packages/fastai/basic_train.py in load(self, file, device, strict, with_opt, purge, remove_module)
    270             model_state = state['model']
    271             if remove_module: model_state = remove_module_load(model_state)
--> 272             get_model(self.model).load_state_dict(model_state, strict=strict)
    273             if ifnone(with_opt,True):
    274                 if not hasattr(self, 'opt'): self.create_opt(defaults.lr, self.wd)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    843         if len(error_msgs) > 0:
    844             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 845                                self.__class__.__name__, "\n\t".join(error_msgs)))
    846         return _IncompatibleKeys(missing_keys, unexpected_keys)
    847 

RuntimeError: Error(s) in loading state_dict for SequentialRNN:
    size mismatch for 1.layers.6.weight: copying a param with shape torch.Size([3, 50]) from checkpoint, the shape in current model is torch.Size([6, 50]).
    size mismatch for 1.layers.6.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([6]).

Basically, in all 3 cases, I think the shapes of the saved weights don't match the current model.

Thanks, frank-chris

ahoarfrost commented 2 years ago

Are you changing the databunch you're using when trying to load the pretrained model? It expects you to have data with the right number of classes in the labels, so if you're using framedata with the 6 translation frame categories and trying to load OptimalTempClassifier (which has 3 output categories), the matrix sizes won't match and you'll get an error.
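
To make the mismatch concrete, the sketch below compares parameter shapes by name, using the two entries from the error message above. It uses plain tuples in place of torch.Size so it runs without torch; find_size_mismatches is a hypothetical helper, not a fastai or fastBio function.

```python
def find_size_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (saved_shape, current_shape)} for parameters that differ."""
    mismatches = {}
    for name, saved in checkpoint_shapes.items():
        current = model_shapes.get(name)
        if current is not None and tuple(saved) != tuple(current):
            mismatches[name] = (tuple(saved), tuple(current))
    return mismatches


# Shapes taken from the RuntimeError in this thread:
# the checkpoint has 3 output classes, the current model has 6.
checkpoint_shapes = {
    "1.layers.6.weight": (3, 50),
    "1.layers.6.bias": (3,),
}
model_shapes = {
    "1.layers.6.weight": (6, 50),
    "1.layers.6.bias": (6,),
}

mismatches = find_size_mismatches(checkpoint_shapes, model_shapes)
for name, (saved, current) in mismatches.items():
    print(f"{name}: checkpoint {saved} vs model {current}")
```

The first dimension of the final layer is the number of output classes, so a model built from a 6-class databunch cannot load weights saved for a 3-class classifier.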

If I get a chance soon, I can upload some example_data for each of the pretrained classifiers.

frank-chris commented 2 years ago

Yeah, you're right. It was because I forgot to do that. Thanks.

I was wondering if there was a way for me to create an empty BioClasDataBunch. I wanted to use the classifiers directly without training, but the fastai exports are available only for 2 of the 4 classifiers.

ahoarfrost commented 2 years ago

Currently there's not, and creating an empty databunch with fastai isn't the easiest either. Probably the easiest solution would be for me to add exported models for the other classifiers so you can use load_learner. I'll add that to my to-do list.

frank-chris commented 2 years ago

Thanks a lot.

I'm closing this issue since the error due to which I started it is now fixed.