How to reproduce the problem

I am trying to use the spaCy BERT model with spacy pretrain. I downloaded the model with:
pip install spacy-pytorch-transformers
spacy download en_pytt_bertbaseuncased_lg
I then used the following command:
spacy pretrain data.jsonl en_pytt_bertbaseuncased_lg -o temp
I got the following trace:
✔ Saved settings to config.json
✔ Loaded input texts
✔ Loaded model 'en_pytt_bertbaseuncased_lg'
============== Pre-training tok2vec layer - starting at epoch 0 ==============
#      # Words    Total Loss    Loss    w/s
/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/spacy/cli/pretrain.py:317: RuntimeWarning: invalid value encountered in true_divide
cosine = (yh * y).sum(axis=1, keepdims=True) / mul_norms
Traceback (most recent call last):
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/spacy/__main__.py", line 35, in <module>
plac.call(commands[command], sys.argv[1:])
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/spacy/cli/pretrain.py", line 218, in pretrain
model, docs, optimizer, objective=loss_func, drop=dropout
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/spacy/cli/pretrain.py", line 247, in make_update
backprop(gradients, sgd=optimizer)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/spacy/_ml.py", line 759, in mlm_backward
return backprop(d_output, sgd=sgd)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/thinc/neural/_classes/feed_forward.py", line 53, in continue_update
gradient = callback(gradient, sgd)
File "/home/haroon/miniconda3/envs/ssl/lib/python3.6/site-packages/thinc/neural/_classes/affine.py", line 67, in finish_update
self.ops.gemm(grad__BO, input__BI, trans1=True, out=self.d_W)
File "ops.pyx", line 422, in thinc.neural.ops.NumpyOps.gemm
ValueError: Buffer and memoryview are not contiguous in the same dimension.
Help would be much appreciated.
Your Environment

Operating System: Ubuntu
Python Version Used: 3.6.7
spaCy Version Used: 2.1.8
Environment Information: conda
I'm not sure what exactly you're trying to do or achieve, but the problem here is that spacy-pytorch-transformers and spacy pretrain are two very different things.

spacy-pytorch-transformers lets you use pre-trained embeddings like the various BERT models in spaCy, either to train downstream models (we currently have a custom implementation for text classification) or for similarity comparisons.
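For example, here's a minimal sketch of that intended usage (the sentences are placeholders; the extension attribute name follows the spacy-pytorch-transformers README):

```python
# Requires: pip install spacy-pytorch-transformers
#           spacy download en_pytt_bertbaseuncased_lg
import spacy

nlp = spacy.load("en_pytt_bertbaseuncased_lg")
doc1 = nlp("The food was delicious.")
doc2 = nlp("The meal tasted great.")

# The pytt models hook similarity up to the transformer's output,
# so this compares BERT representations rather than word vectors.
print(doc1.similarity(doc2))

# The raw activations are also exposed, e.g. the last hidden state
# per wordpiece token as a (n_wordpieces, hidden_width) array.
print(doc1._.pytt_last_hidden_state.shape)
```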
spacy pretrain lets you create similar pre-trained embeddings using word vectors and raw text. The mechanism is similar to the BERT/ELMo/ULMFiT approach, but instead of predicting the next word etc., it predicts the word's vector. At the end of it, you get pretrained weights that you can use to initialise the model you then train.
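By contrast, here's a sketch of the workflow spacy pretrain is designed for, assuming a model with real word vectors such as en_core_web_lg and the data.jsonl file from the question; the argument order follows the spaCy 2.1 CLI (texts_loc, vectors_model, output_dir), so the shell equivalent would be spacy pretrain data.jsonl en_core_web_lg pretrained_weights:

```python
# The spaCy CLI commands are plain functions, so pretraining can
# also be driven from Python.
from spacy.cli import pretrain

# texts_loc: JSONL file with one {"text": "..."} object per line.
# vectors_model: a package whose word vectors the tok2vec layer
#                learns to predict (not a transformer package).
# output_dir: receives one modelN.bin weights file per epoch.
pretrain("data.jsonl", "en_core_web_lg", "pretrained_weights")
```

The resulting weights are then passed to spacy train via --init-tok2vec, e.g. --init-tok2vec pretrained_weights/model999.bin.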
So it currently doesn't make sense to use the BERT embeddings for pretraining. In theory, it could be possible to use those embeddings instead of the word vectors, but that's an open research question.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.