explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Unable to load model after training on Colab GPU #12233

Closed Anand195 closed 1 year ago

Anand195 commented 1 year ago

Hello, guys

For the last few weeks I have been facing this issue.

[Screenshot 2023-02-06 at 4:04:57 PM: error message]

While trying to load the custom model trained on a Colab GPU onto my Mac, I got this error.

Any suggestions or help to resolve this would be appreciated!

Your Environment

## Info about spaCy

- **spaCy version:** 3.5.0
- **Platform:** Linux-5.10.147+-x86_64-with-glibc2.29
- **Python version:** 3.8.10
- **Pipelines:** en_core_web_sm (3.4.1)

rmitsch commented 1 year ago

Hi @Anand195, please include your error messages - and code - as a formatted code block, not as an image. Does the error persist if you run the pipeline with the spaCy version it was trained with (see warning)?
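For reference, one way to check which spaCy version a saved pipeline was trained with (without loading it) is to read its `meta.json`. A minimal sketch, assuming the standard directory layout of a saved spaCy pipeline:

```python
import json
from pathlib import Path

def trained_spacy_version(model_dir: str) -> str:
    """Return the spaCy version range recorded in a pipeline's meta.json."""
    meta = json.loads((Path(model_dir) / "meta.json").read_text())
    return meta.get("spacy_version", "unknown")

# e.g. trained_spacy_version("/content/drive/MyDrive/relation-extraction/training/model-last")
```

Comparing this against `spacy.__version__` on the loading machine tells you whether the warning applies.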

Anand195 commented 1 year ago

Hi @rmitsch, thanks for the quick reply.

Here is the code I run to load the custom trained relation_extraction model:

import spacy
nlp = spacy.load("/content/drive/MyDrive/relation-extraction/training/model-last")

Here is the error message I get on Colab.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-6168ebbf5b24> in <module>
      1 import spacy
----> 2 spacy.load("/content/drive/MyDrive/relation-extraction/training/model-last")

6 frames
/usr/local/lib/python3.8/dist-packages/spacy/language.py in create_pipe(self, factory_name, name, config, raw_config, validate)
    658                 lang_code=self.lang,
    659             )
--> 660             raise ValueError(err)
    661         pipe_meta = self.get_factory_meta(factory_name)
    662         # This is unideal, but the alternative would mean you always need to

ValueError: [E002] Can't find factory for 'relation_extractor' for language English (en). This usually happens when spaCy calls "nlp.create_pipe" with a custom component name that's not registered on the current language class. If you're using a Transformer, make sure to install 'spacy-transformers'. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).

Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, ner, beam_ner, entity_ruler, tagger, morphologizer, senter, sentencizer, textcat, spancat, future_entity_ruler, span_ruler, textcat_multilabel, en.lemmatizer
Anand195 commented 1 year ago

Hi @rmitsch, here is the error message I got when I tried to load the same model on my M1 Mac.

Code to load the model:

import spacy
nlp = spacy.load("/Users/caypro/Downloads/rl-model-last")

Here is the error message.

/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py:865: UserWarning: [W095] Model 'en_pipeline' (0.0.0) was trained with spaCy v3.5 and may not be 100% compatible with the current version (3.4.1). If you see errors or degraded performance, download a newer compatible model or retrain your custom model with the current spaCy version. For more details and available updates, run: python -m spacy validate
  warnings.warn(warn_msg)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /var/folders/59/lgs8qld16_969qxl0fnb6db00000gn/T/ipykernel_1902/459291777.py:2 in <cell line: 2> │
│                                                                                                  │
│ [Errno 2] No such file or directory:                                                             │
│ '/var/folders/59/lgs8qld16_969qxl0fnb6db00000gn/T/ipykernel_1902/459291777.py'                   │
│                                                                                                  │
│ /Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/__init__.py:54 in load        │
│                                                                                                  │
│   51 │   │   keyed by section values in dot notation.                                            │
│   52 │   RETURNS (Language): The loaded nlp object.                                              │
│   53 │   """                                                                                     │
│ ❱ 54 │   return util.load_model(                                                                 │
│   55 │   │   name,                                                                               │
│   56 │   │   vocab=vocab,                                                                        │
│   57 │   │   disable=disable,                                                                    │
│                                                                                                  │
│ /Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py:431 in load_model     │
│                                                                                                  │
│    428 │   │   if is_package(name):  # installed as package                                      │
│    429 │   │   │   return load_model_from_package(name, **kwargs)  # type: ignore[arg-type]      │
│    430 │   │   if Path(name).exists():  # path to model data directory                           │
│ ❱  431 │   │   │   return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]   │
│    432 │   elif hasattr(name, "exists"):  # Path or Path-like to model data                      │
│    433 │   │   return load_model_from_path(name, **kwargs)  # type: ignore[arg-type]             │
│    434 │   if name in OLD_MODEL_SHORTCUTS:                                                       │
...
│   1276 │   │   │   return f.write(view)                                                          │
│   1277                                                                                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: memoryview: a bytes-like object is required, not 'dict'

I have been getting this error for a few weeks.

Please share your suggestions or solutions for this error.

Thanks in advance!

rmitsch commented 1 year ago

So this is about two different errors on two different machines with the same model, is that correct?

Does the error persist if you run the pipeline with the spaCy version it was trained with (see warning)?

Please try that and report back.

Anand195 commented 1 year ago

@rmitsch Yes. The first error was encountered while trying to load the trained model on Colab (`nlp = spacy.load("/content/drive/MyDrive/relation-extraction/training/model-last")`), and the other error was encountered while trying to load the same model on my Mac (`nlp = spacy.load("/Users/caypro/Downloads/rl-model-last")`).

rmitsch commented 1 year ago

The first issue seems to be that you aren't importing your factory function, which is why spaCy's registry doesn't know about it. Have a look at this thread and let me know if the approach suggested there works for you.
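The fix amounts to making sure the module that registers the factory is imported before calling `spacy.load`. A minimal sketch of the registration mechanism (the component below is a hypothetical no-op stand-in, not the tutorial's actual relation extractor):

```python
import spacy
from spacy.language import Language

# Hypothetical no-op component standing in for the tutorial's relation
# extractor. The point is that the decorator registers the name, so
# nlp.add_pipe / spacy.load can resolve it afterwards.
@Language.component("relation_extractor_demo")
def relation_extractor_demo(doc):
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("relation_extractor_demo")
print(nlp.pipe_names)
```

In the tutorial's setup this registration happens as a side effect of `from rel_pipe import ...` and `from rel_model import ...`, which is why those imports must run before loading the pipeline.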

Anand195 commented 1 year ago

@rmitsch The proposed solution worked for the 1st issue (`nlp = spacy.load("/content/drive/MyDrive/relation-extraction/training/model-last")`).

Thanks for the quick reply.

Do let me know what we can do about the 2nd issue.

rmitsch commented 1 year ago

Have you tried training the model with the same spaCy version you're using when loading it?

Anand195 commented 1 year ago

Yes, I was using the same spaCy version for model training as well as model testing/loading.

rmitsch commented 1 year ago

Your screenshot indicates you trained the pipeline with v3.5.0 and loaded it with v3.4.1. Which version(s) exactly are you using?

Anand195 commented 1 year ago

Oh, I just noticed that.

I have tested again after upgrading my local spaCy version to v3.5.0.

Here is the code I run:

import spacy
from rel_pipe import make_relation_extractor, score_relations
from rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors
# nlp2 = spacy.load("/Users/caypro/Anand/project/rel_component/training/model-last")
nlp2 = spacy.load("/Users/caypro/Downloads/rl-model-last")

Here is the error I encounter while trying to run the above code on my local machine.

TypeError                                 Traceback (most recent call last)
/Users/caypro/Anand/project/sparknlp.ipynb Cell 54 in <cell line: 5>()
      3 from rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors
      4 # nlp2 = spacy.load("/Users/caypro/Anand/project/rel_component/training/model-last")
----> 5 nlp2 = spacy.load("/Users/caypro/Downloads/rl-model-last")

File ~/Anand/project/venv/lib/python3.9/site-packages/spacy/__init__.py:54, in load(name, vocab, disable, enable, exclude, config)
     30 def load(
     31     name: Union[str, Path],
     32     *,
   (...)
     37     config: Union[Dict[str, Any], Config] = util.SimpleFrozenDict(),
     38 ) -> Language:
     39     """Load a spaCy model from an installed package or a local path.
     40 
     41     name (str): Package name or model path.
   (...)
     52     RETURNS (Language): The loaded nlp object.
     53     """
---> 54     return util.load_model(
     55         name,
     56         vocab=vocab,
     57         disable=disable,
     58         enable=enable,
...
-> 1274 view = memoryview(data)
   1275 with self.open(mode='wb') as f:
   1276     return f.write(view)

TypeError: memoryview: a bytes-like object is required, not 'dict'
rmitsch commented 1 year ago

Are you using the code from our REL tutorial? Have you made any adjustments to it?

Anand195 commented 1 year ago

Yes, I have followed the REL tutorial and I haven't made any adjustments to it.

Anand195 commented 1 year ago

I can successfully load a model trained with the tok2vec pipeline on CPU using the code from the REL tutorial, but when I train with the transformer pipeline on GPU I get the error described above.

rmitsch commented 1 year ago

You trained the transformer model on the same machine you're loading it on (your M1 Mac)?

Anand195 commented 1 year ago

I trained it using a GPU on Colab, and I am trying to load the model with the same configuration on my M1 Mac after downloading it from Colab.

rmitsch commented 1 year ago

Which spaCy version were you running on Colab and which one on your Mac?

Anand195 commented 1 year ago

I'm using v3.5.0 on Colab as well as on my Mac.

Anand195 commented 1 year ago

Hi, @rmitsch

Here are the spaCy version details on my Mac:

Name: spacy
Version: 3.5.0
Summary: Industrial-strength Natural Language Processing (NLP) in Python
Home-page: https://spacy.io
Author: Explosion
Author-email: contact@explosion.ai
License: MIT
Location: /Users/caypro/Anand/project/venv/lib/python3.9/site-packages
Requires: catalogue, cymem, jinja2, langcodes, murmurhash, numpy, packaging, pathy, preshed, pydantic, requests, setuptools, smart-open, spacy-legacy, spacy-loggers, srsly, thinc, tqdm, typer, wasabi
Required-by: en-core-web-lg, en-core-web-trf, spacy-conll, spacy-transformers, textacy

Here is the code I use to load the custom trained model:

import spacy
from scripts.rel_pipe import make_relation_extractor, score_relations
from scripts.rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors
relNlp = spacy.load("/Users/caypro/Downloads/rel_model-last")

Here is the error log I get while trying to load the model that was trained using a GPU on Colab.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/caypro/Anand/project/rel_component/notebook.ipynb Cell 1 in <cell line: 4>()
      2 from scripts.rel_pipe import make_relation_extractor, score_relations
      3 from scripts.rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors
----> 4 relNlp = spacy.load("/Users/caypro/Downloads/rel_model-last")

File ~/Anand/project/venv/lib/python3.9/site-packages/spacy/__init__.py:54, in load(name, vocab, disable, enable, exclude, config)
     30 def load(
     31     name: Union[str, Path],
     32     *,
   (...)
     37     config: Union[Dict[str, Any], Config] = util.SimpleFrozenDict(),
     38 ) -> Language:
     39     """Load a spaCy model from an installed package or a local path.
     40 
     41     name (str): Package name or model path.
   (...)
     52     RETURNS (Language): The loaded nlp object.
     53     """
---> 54     return util.load_model(
     55         name,
     56         vocab=vocab,
     57         disable=disable,
     58         enable=enable,
...
-> 1274 view = memoryview(data)
   1275 with self.open(mode='wb') as f:
   1276     return f.write(view)

TypeError: memoryview: a bytes-like object is required, not 'dict'

Here are the spaCy details on Colab:

Name: spacy
Version: 3.5.0
Summary: Industrial-strength Natural Language Processing (NLP) in Python
Home-page: https://spacy.io/
Author: Explosion
Author-email: contact@explosion.ai
License: MIT
Location: /usr/local/lib/python3.8/dist-packages
Requires: catalogue, cymem, jinja2, langcodes, murmurhash, numpy, packaging, pathy, preshed, pydantic, requests, setuptools, smart-open, spacy-legacy, spacy-loggers, srsly, thinc, tqdm, typer, wasabi
Required-by: en-core-web-sm, fastai, spacy-transformers

Here is the code I use to train the model on Colab:

%cd /content/drive/MyDrive/relation-extraction
!python3 -m spacy project run train_gpu

Here are the model training logs:

/content/drive/MyDrive/relation-extraction
2023-02-10 12:35:42.448169: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

================================= train_gpu =================================
Running command: /usr/bin/python3 -m spacy train configs/rel_trf.cfg --output training --paths.train data/train.spacy --paths.dev data/dev.spacy -c ./scripts/custom_functions.py --gpu-id 0
ℹ Saving to output directory: training
ℹ Using GPU: 0

=========================== Initializing pipeline ===========================
[2023-02-10 12:36:09,984] [INFO] Set up nlp object from config
INFO:spacy:Set up nlp object from config
[2023-02-10 12:36:09,997] [INFO] Pipeline: ['transformer', 'relation_extractor']
INFO:spacy:Pipeline: ['transformer', 'relation_extractor']
[2023-02-10 12:36:10,001] [INFO] Created vocabulary
INFO:spacy:Created vocabulary
[2023-02-10 12:36:10,002] [INFO] Finished initializing nlp object
INFO:spacy:Finished initializing nlp object
Downloading (…)lve/main/config.json: 100% 481/481 [00:00<00:00, 77.0kB/s]
Downloading (…)olve/main/vocab.json: 100% 899k/899k [00:00<00:00, 6.35MB/s]
Downloading (…)olve/main/merges.txt: 100% 456k/456k [00:00<00:00, 3.37MB/s]
Downloading (…)/main/tokenizer.json: 100% 1.36M/1.36M [00:00<00:00, 7.71MB/s]
Downloading (…)"pytorch_model.bin";: 100% 501M/501M [00:02<00:00, 199MB/s]
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.bias', 'lm_head.layer_norm.weight', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2023-02-10 12:36:42,061] [INFO] Initialized pipeline components: ['transformer', 'relation_extractor']
INFO:spacy:Initialized pipeline components: ['transformer', 'relation_extractor']
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['transformer', 'relation_extractor']
ℹ Initial learn rate: 0.0
E    #       LOSS TRANS...  LOSS RELAT...  REL_MICRO_P  REL_MICRO_R  REL_MICRO_F  SCORE 
---  ------  -------------  -------------  -----------  -----------  -----------  ------
  0       0           1.07           1.90         0.32        95.58         0.63    0.01
  6     100          18.96          44.20         0.00         0.00         0.00    0.00
 13     200           0.05           2.39        52.17        21.24        30.19    0.30
 20     300           0.01           2.19         0.00         0.00         0.00    0.00
 26     400           0.03           1.85        44.13        69.91        54.11    0.54
 33     500           0.01           1.26        60.87        24.78        35.22    0.35
 40     600           0.01           1.11        62.96        30.09        40.72    0.41
 46     700           0.00           0.97        53.08        61.06        56.79    0.57
 53     800           0.00           0.91        58.82        44.25        50.51    0.51
 60     900           0.00           0.87        56.76        37.17        44.92    0.45
 66    1000           0.00           0.86        52.42        57.52        54.85    0.55
✔ Saved pipeline to output directory
training/model-last

Here is the code I use to load the model on Colab:

import spacy
from rel_pipe import make_relation_extractor, score_relations
from rel_model import create_relation_model, create_classification_layer, create_instances, create_tensors
relNlp = spacy.load("/content/drive/MyDrive/relation-extraction/training/model-last")

I can successfully load the model on Colab but I am unable to load the same model on my Mac.

Due to this issue I'm unable to use the model trained on GPU.

Do let me know if anything else is required in order to resolve this issue.

rmitsch commented 1 year ago

> Do let me know if anything else is required in order to resolve this issue.

Could you provide the complete stack trace? Feel free to attach it as a .txt file, if you prefer.

Anand195 commented 1 year ago

Here is the complete stack trace.

Traceback (most recent call last):
  File "/var/folders/59/lgs8qld16_969qxl0fnb6db00000gn/T/ipykernel_26918/1833266400.py", line 5, in <cell line: 4>
    relNlp = spacy.load("/Users/caypro/Downloads/rel_model-last")
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/__init__.py", line 54, in load
    return util.load_model(
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py", line 434, in load_model
    return load_model_from_path(Path(name), **kwargs)  # type: ignore[arg-type]
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py", line 514, in load_model_from_path
    return nlp.from_disk(model_path, exclude=exclude, overrides=overrides)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/language.py", line 2125, in from_disk
    util.from_disk(path, deserializers, exclude)  # type: ignore[arg-type]
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py", line 1352, in from_disk
    reader(path / key)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/language.py", line 2119, in <lambda>
    deserializers[name] = lambda p, proc=proc: proc.from_disk(  # type: ignore[misc]
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy_transformers/pipeline_component.py", line 419, in from_disk
    util.from_disk(path, deserialize, exclude)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy/util.py", line 1352, in from_disk
    reader(path / key)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy_transformers/pipeline_component.py", line 393, in load_model
    self.model.from_bytes(mfile.read())
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/thinc/model.py", line 619, in from_bytes
    return self.from_dict(msg)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/thinc/model.py", line 657, in from_dict
    node.shims[i].from_bytes(shim_bytes)
  File "/Users/caypro/Anand/project/venv/lib/python3.9/site-packages/spacy_transformers/layers/hf_shim.py", line 97, in from_bytes
    Path(temp_dir / x).write_bytes(x_bytes)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1274, in write_bytes
    view = memoryview(data)
TypeError: memoryview: a bytes-like object is required, not 'dict'

If you require anything more than this, do let me know.

adrianeboyd commented 1 year ago

I'm pretty sure this error is due to training a model with spacy-transformers v1.2 on colab and then trying to load it with spacy-transformers v1.1 on a different machine.

spacy-transformers v1.2 can load v1.1 and v1.0 models, but not the other way around. This is similar to trying to load a spacy v3.5 model with spacy v3.4, but in that case the warnings and error messages are much clearer.

If you use `spacy package` to package your model directory, the current spacy-transformers requirement is included (as `spacy-transformers>=1.2.0,<1.3.0`). But if you just copy a model directory, only the spaCy version is recorded in the pipeline metadata, not the requirements or exact versions of any other packages that provided additional registered functions.
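When debugging mismatches like this, it can help to dump the relevant package versions on both machines and compare them side by side. A small sketch using only the standard library (the package list is just a suggestion):

```python
from importlib.metadata import version, PackageNotFoundError

def installed_versions(packages):
    """Map each package name to its installed version string, or None if absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = version(pkg)
        except PackageNotFoundError:
            versions[pkg] = None
    return versions

# Run this on both machines and diff the output.
print(installed_versions(["spacy", "spacy-transformers", "thinc"]))
```

Any package that appears with a lower version on the loading machine than on the training machine is a candidate for this kind of deserialization failure.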

Anand195 commented 1 year ago

Hi @adrianeboyd,

Thanks for the solution. I have checked the spacy-transformers version, and it's lower on my machine than the one the model was trained with on the Colab GPU.

I updated the version and it worked.

Once again, thanks!

github-actions[bot] commented 1 year ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.