polm opened 3 years ago
Related issue; cf. the code in the IO tests of https://github.com/explosion/spacy-transformers/pull/277:
```python
# nlp, orig_config, train_examples, and file_path are set up earlier in the test;
# util is spacy.util
nlp.to_disk(file_path)
nlp2 = util.load_model_from_config(orig_config, auto_fill=True, validate=True)
nlp2.initialize(lambda: train_examples)
nlp2.from_disk(file_path)
```
--> It shouldn't be necessary to call `initialize`?
Can this be fixed by implementing an "empty shim" when constructing the Transformer, to avoid the shape mismatch?
Update: the spacy-transformers bug, at least, is fixed by explosion/spacy-transformers#285
Additional repro steps from #9317, for regression-testing the fix:
## How to reproduce the behaviour
Add a PyTorchWrapper component to a language pipeline and then do this:
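The exact snippet from #9317 isn't reproduced in this excerpt; below is a minimal sketch of the same save/load flow as the IO test above, assuming `nlp` already contains a component whose model is a `PyTorchWrapper`, and that `orig_config` and `train_examples` match that pipeline.

```python
# Sketch only: `nlp`, `orig_config`, and `train_examples` are assumed to come
# from a pipeline containing a PyTorchWrapper-based component.
from spacy import util

nlp.to_disk("/tmp/torch-pipeline")
nlp2 = util.load_model_from_config(orig_config, auto_fill=True, validate=True)
# Workaround: call nlp2.initialize(lambda: train_examples) before from_disk().
# Without it, from_disk() raises a Thinc deserialization error, because the
# uninitialized model has fewer layers than the serialized one.
nlp2.from_disk("/tmp/torch-pipeline")
```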
The motivating case is covered in https://github.com/explosion/spaCy/discussions/8291. This issue touches code in both Thinc and spacy-transformers.
The issue is that the model at construction time has fewer layers than after `initialize` is called. When deserializing, Thinc detects this mismatch and throws an error.
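A Thinc-level sketch of that structural mismatch (the lazy-init model and all names here are hypothetical illustrations, not the actual spaCy or spacy-transformers code): a model whose `init` adds a child layer serializes with more nodes than a freshly constructed copy, so deserializing into the uninitialized copy fails.

```python
from thinc.api import Model, Linear

def lazy_init(model, X=None, Y=None):
    # Hypothetical stand-in for work the real models do during initialize(),
    # e.g. loading transformer weights: the child layer only exists after init.
    sublayer = Linear(nO=2, nI=2)
    sublayer.initialize()
    model.layers.append(sublayer)

def forward(model, X, is_train):
    return model.layers[0](X, is_train)

def make_model():
    return Model("lazy", forward, init=lazy_init)

trained = make_model()
trained.initialize()       # the child layer is created here
data = trained.to_bytes()  # serializes two nodes: parent + child

fresh = make_model()       # constructed but not initialized: only one node
fresh.from_bytes(data)     # Thinc detects the mismatched structure and errors
```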