EricLe-dev opened this issue 4 years ago
Hello,
So I am currently training a German coref model. While the trained model does not yet work well (I assume the problem lies in the differences between English and German POS tags), I am able to load it.
The static/tuned_vectors and key2row files are not the ones you use for training. Those are serialized spacy vectors (key2row is spacy's mapping from hash keys to rows in the vector table). You can create them by saving the spacy vectors:
nlp = spacy.load(model)
nlp.vocab.vectors.to_disk(path)
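For reference, a minimal sketch (an assumption on my part: spacy 2.x and a model that ships with vectors) of what this produces on disk:

import os
import spacy

nlp = spacy.load("de_core_news_md")  # any model with static vectors
nlp.vocab.vectors.to_disk("vectors_out")
# in spacy 2.x this should write the raw vector table plus key2row,
# the mapping from lexeme hash keys to rows in that table
print(os.listdir("vectors_out"))  # expected: ['vectors', 'key2row']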
The vocabulary there should be the same as the one used to create the static and tuned vocabulary files you use for training. I actually wrote a script to create the static and tuned vocabulary from a German spacy model.
The bigger problem is that, as of now, you also need the trained model as a thinc model, while training returns a torch model. I simply created a new thinc model of the same size and manually copied all the weights. I do not think this is the right method, but offhand nothing else worked.
import torch
from thinc.v2v import Model, ReLu, Affine
from thinc.api import chain, clone
with Model.define_operators({'**': clone, '>>': chain}):
    single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
    pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
tm = torch.load("de_best_modelallpairs")
# repeat for all layers of both the single and the pairs model
pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()
pairs_model._layers[0].b = tm["pair_top.0.bias"].cpu().numpy()
You can then save these models to disk (perhaps simply replacing the ones in the model zip) and load them.
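A sketch of the save/reload round trip, assuming thinc v7's Model.to_disk / from_disk (the to_disk calls also appear further down in this thread):

# save the hand-filled models
pairs_model.to_disk("custom_model/pairs_model")
single_model.to_disk("custom_model/single_model")

# later, rebuild a model of the same shape and load the weights back
with Model.define_operators({'**': clone, '>>': chain}):
    pairs_model_loaded = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
pairs_model_loaded.from_disk("custom_model/pairs_model")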
Hello, I have tried to load the model after creating it with thinc. Which version of thinc are you using? I am not finding ReLu in thinc; I am using thinc 7.0.8.
Hello, I am using thinc 7.4
My imports are as follows:
from thinc.v2v import Model, ReLu, Affine
from thinc.api import chain, clone
Ok, thank you @FabianKaiser for your response. Have you tried to load the model? I am getting an error while adding the model to the pipeline: "ExtraData: unpack(b) received extra data."
Hi, yes, sorry. Use the following to add the model to your pipeline:
import spacy
from neuralcoref import NeuralCoref

nlp = spacy.load("de_core_news_md")
nc = NeuralCoref(nlp.vocab)
nc.model = single_model, pairs_model
nlp.add_pipe(nc, "neuralcoref")
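Once added, a minimal smoke test (the sentence mirrors the German example used later in this thread; has_coref and coref_clusters are neuralcoref's standard extension attributes):

doc = nlp("Meine Schwester hat einen Hund. Sie liebt ihn.")
print(doc._.has_coref)       # True once the model actually resolves something
print(doc._.coref_clusters)  # list of coreference clusters, empty if none found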
Hello, when I execute this code:
pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()
I get this error:
TypeError                                 Traceback (most recent call last)
<ipython-input-13-6be9d2127022> in <module>
----> 1 pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/describe.py in __set__(self, obj, val)
     47     def __set__(self, obj, val):
     48         data = obj._mem.get((obj.id, self.name))
---> 49         data[:] = val

TypeError: 'NoneType' object does not support item assignment
Hey.
I do not really know why this happens, but try reading the attributes once before assigning (thinc 7 appears to allocate parameter memory lazily on first attribute access, so reading l.W forces the allocation that the setter expects):
for l in pairs_model._layers + single_model._layers:
    l.W
    l.b
Thank you @FabianKaiser for your time, I appreciate your help. I want to ask whether you have tried to load your self-trained model and test it?
Thank you @FabianKaiser. Indeed, when I execute this code
for l in pairs_model._layers + single_model._layers:
    l.W
    l.b
I don't get that error anymore, and I don't know why.
Yeah no idea. I only found out by chance.
Thank you @FabianKaiser. I just want to know whether you have tried to load your own model and test it?
Yes, I did, but it didn't really work. I noticed the problem already during training (I also opened another issue, https://github.com/huggingface/neuralcoref/issues/264, and while I have made some progress, in general the training still doesn't work). I am currently trying to resolve the training issues, but I feel my linguistic knowledge is somewhat lacking.
I see. In my case, I have tried to train a model for French and then add it to the pipeline, but I keep getting errors. I don't know whether anyone has succeeded in training this model and testing it in another language.
After adding the model to the nlp pipeline, I get this error:
---------------------------------------------------------------------------
ShapeMismatchError                        Traceback (most recent call last)
<ipython-input-24-ca95ba91260e> in <module>
----> 1 doc = nlp('je')

~/anaconda3/envs/coref1/lib/python3.8/site-packages/spacy/language.py in __call__(self, text, disable, component_cfg)
    400     if not hasattr(proc, "__call__"):
    401         raise ValueError(Errors.E003.format(component=type(proc), name=name))
--> 402     doc = proc(doc, **component_cfg.get(name, {}))
    403     if doc is None:
    404         raise ValueError(Errors.E005.format(name=name))

neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.__call__()
neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.predict()

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
    167     Must match expected shape
    168     """
--> 169     return self.predict(x)

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/neural/_classes/feed_forward.py in predict(self, X)
     38     def predict(self, X):
     39         for layer in self._layers:
---> 40             X = layer(X)
     41         return X

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/check.py in checked_function(wrapped, instance, args, kwargs)
    148     if not isinstance(check, Callable):
    149         raise ExpectedTypeError(check, ["Callable"])
--> 150     check(arg_id, fix_args, kwargs)
    151     return wrapped(*args, **kwargs)

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/check.py in has_shape_inner(arg_id, args, kwargs)
     67     # Allow underspecified dimensions
     68     if dim is not None and arg.shape[i] != dim:
---> 69         raise ShapeMismatchError(arg.shape, shape_values, shape)

ShapeMismatchError: Shape mismatch: input (0, 668) not compatible with [None, 3924].

Traceback:
├─ __call__ in /home/equalios/anaconda3/envs/coref1/lib/python3.8/site-packages/spacy/language.py:402
├─── __call__ in neural/_classes/model.py:169
└───── predict in neural/_classes/feed_forward.py:40
       >>> X = layer(X)
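One way to see the two sides of this mismatch (input_shape is used the same way in the full code later in the thread): [None, 3924] is what the rebuilt thinc model expects, while (0, 668) is the feature matrix the NeuralCoref pipeline actually computed for the doc.

print(single_model.input_shape)  # (None, 3924): expected by the rebuilt model
print(pairs_model.input_shape)   # (None, 7870)
# the (0, 668) side comes from the pipeline's own feature extraction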
Yes. This is because your embeddings have a different size than the one you set in the trained model. What did you set
SIZE_EMBEDDING, SIZE_WORD, SIZE_PAIR_FEATS, SIZE_SNGL_FEATS, SIZE_GENRE
to during your training? And what did you set them to when using them right now?
Thank you @FabianKaiser for your time. When training the model, I set them as follows:
SIZE_SPAN = 1500
SIZE_EMBEDDING = 300
SIZE_WORD = 8
SIZE_MENTION_EMBEDDING = (SIZE_SPAN + SIZE_WORD * SIZE_EMBEDDING)
SIZE_SINGLE_IN = (SIZE_MENTION_EMBEDDING + SIZE_FS)
SIZE_PAIR_IN = (2 * SIZE_MENTION_EMBEDDING + SIZE_FP)
and I haven't changed any of the other constants.
When using them now, I haven't changed them, because I didn't know where they need to be changed, so I think I have left them at their defaults.
Then what embeddings did you use to train your model (static/tuned_word_embeddings), and what kind of spacy model do you use them with? Those embeddings need to have the same dimensions (which is why I created the static/tuned_word_embeddings anew, based on the spacy model I am using).
I downloaded word embeddings from fasttext to train my model, and I used the "fr_core_news_md" spacy model with them. It may be that these embeddings don't have the same shape; fasttext uses word vectors of length 300, but I don't know the length of the vectors spacy is using.
To find out the vector length of the spacy model, try
nlp.vocab.vectors.shape
The second value is the vector length.
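A minimal check, assuming the French md model mentioned above:

import spacy

nlp = spacy.load("fr_core_news_md")
# shape is (number of rows in the vector table, vector length)
print(nlp.vocab.vectors.shape)  # e.g. (20000, 300)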
The embedding vectors I am using have the shape (23600, 300); the embedding vectors of spacy have the shape (20000, 300). Should I use spacy's embedding vectors when training the model?
The problem is that I don't understand the meaning of the shapes 668 and 3924 shown in the error message.
3924 is the value of SIZE_SINGLE_IN for my training.
Sorry @FabianKaiser for my many questions, but have you tried to add your model to the pipeline and use it? That way I can tell whether this error is related to my work or not. I have successfully added my model to the pipeline, but when I type this command, for example:
doc = nlp("je m'appelle Firas")
I get the size-mismatch error mentioned before.
Yes, sure. I added it and tried the example sentence (translated to German).
doc = nlp("Meine Schwester hat einen Hund. Sie liebt ihn.")
[Meine Schwester, einen Hund, Sie, ihn]
[0, 1, 2, 3]
[([], {Meine Schwester: {Meine Schwester: 19.200767517089844}, einen Hund: {einen Hund: 4.233463764190674, Meine Schwester: -112.22234344482422}, Sie: {Sie: 14.66589641571045, Meine Schwester: -97.18143463134766, einen Hund: -74.71046447753906}, ihn: {ihn: 17.27436065673828, Meine Schwester: -123.94184875488281, einen Hund: -96.95196533203125, Sie: -93.78512573242188}})]
doc._.has_coref
False
So basically I added some debug outputs, and you can see that none of the words refers to any other one (which is wrong), and the final output is no corefs (which is also wrong). So while it runs in general, there is still a training issue (it might be a bit better now; this was an older model).
So now I am not sure where the 668 comes from. What did you set
SIZE_SINGLE_IN
SIZE_PAIR_IN
h1, h2, h3
in
with Model.define_operators({'**': clone, '>>': chain}):
    single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
    pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
I have
h1 = 1000
h2 = h3 = 500
SIZE_PAIR_IN = 7870
SIZE_SINGLE_IN = 3924
Because the mismatch happens when the pipeline computes the embeddings and tries to feed them to your model.
Actually, can you send me the whole code where you add the model to your pipeline?
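For reference, these sizes follow directly from the constants quoted earlier in the thread (a quick check, assuming the SIZE_FS = 24 and SIZE_FP = 70 defaults that appear in the full code further down):

SIZE_MENTION_EMBEDDING = 1500 + 8 * 300       # SIZE_SPAN + SIZE_WORD * SIZE_EMBEDDING = 3900
SIZE_SINGLE_IN = SIZE_MENTION_EMBEDDING + 24  # + SIZE_FS -> 3924
SIZE_PAIR_IN = 2 * SIZE_MENTION_EMBEDDING + 70  # + SIZE_FP -> 7870

The 668 on the input side is what a pipeline still configured for the original 50-dimensional embeddings produces, as the thread concludes further down.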
Thank you very much @FabianKaiser. So the model loads correctly; you just have a problem with precision and accuracy. I will try to modify the parameters to make mine work. As for the word embedding vectors, I think you used the spacy ones?
Ok, I will send you the code.
Can you please tell me which version of spacy you are using?
Actually the precision is pretty good (99% or so), but the recall is more or less 5%. I use spacy 2.2.4. I did use the spacy vectors for training (I wrote an extra function to build the vector and vocabulary files, converting spacy vectors to the required format).
I see. I will try to redo my work from scratch. Can you do me a favor and tell me how I can transform the spacy vectors to the required format?
Are you training your model on CPU or GPU?
Yeah, so this is the code I use to transform the spacy vectors.
import os

import numpy as np
import spacy

directory = "data/ParCorFull_DE/"
sub_dirs = ["dev", "train", "test"]
nlp = spacy.load('de_core_news_md')
if not os.path.exists(directory + "train/numpy/"):
    os.mkdir(directory + "train/numpy/")
save_directory = "dev/neuralcoref-master/neuralcoref/train/weights/"
vocab_filename = "_word_vocabulary.txt"
word_embeddings = []
# write one vocabulary line per vector key and collect the vectors in the
# same order, so that row i of the .npy matches line i of the .txt
with open(save_directory + "tuned" + vocab_filename, "w", encoding="utf-8", errors="strict") as tuned_vocab_file:
    with open(save_directory + "static" + vocab_filename, "w", encoding="utf-8", errors="strict") as static_vocab_file:
        for key, vector in nlp.vocab.vectors.items():
            try:
                word_string = nlp.vocab.strings[key]
                tuned_vocab_file.write(word_string + '\n')
                static_vocab_file.write(word_string + '\n')
                word_embeddings.append(vector)
            except KeyError:
                continue
np.save(save_directory + "tuned_word_embeddings.npy", np.array(word_embeddings))
np.save(save_directory + "static_word_embeddings.npy", np.array(word_embeddings))
This conversion is also more of a hack.
I am training on GPU.
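As a quick sanity check of the converted files (a sketch, reusing the paths from the script above):

import numpy as np

emb = np.load(save_directory + "tuned_word_embeddings.npy")
with open(save_directory + "tuned" + vocab_filename, encoding="utf-8") as f:
    n_words = sum(1 for _ in f)
# row i of the embedding matrix must correspond to line i of the vocabulary file
assert emb.shape[0] == n_words
print(emb.shape)  # e.g. (20000, 300) for a md model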
I am also training on GPU. I am trying to train a coreference model for my end-of-studies internship, which is why I am asking so much :p Here is the full code:
SIZE_SPAN = 1500
SIZE_EMBEDDING = 300
SIZE_WORD = 8
SIZE_FS = 24
SIZE_FP = 70
SIZE_MENTION_EMBEDDING = (SIZE_SPAN + SIZE_WORD * SIZE_EMBEDDING)
SIZE_SINGLE_IN = (SIZE_MENTION_EMBEDDING + SIZE_FS)
SIZE_PAIR_IN = (2 * SIZE_MENTION_EMBEDDING + SIZE_FP)
h1 = 1000
h2 = 500
h3 = 500

with Model.define_operators({'**': clone, '>>': chain}):
    single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
    pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

pairs_model.input_shape
# (None, 7870)

single_model._layers[0].W = tm["single_top.0.weight"].cpu().numpy()
single_model._layers[0].b = tm["single_top.0.bias"].cpu().numpy()

single_model.input_shape
# (None, 3924)

nlp = spacy.load("fr_core_news_md")
import neuralcoref
nc = neuralcoref.NeuralCoref(nlp.vocab)
nc.model = single_model, pairs_model
nlp.add_pipe(nc, "neuralcoref")
doc = nlp("je suis firas")
When I execute the final line, I get the mismatch error.
Ok, sorry. I didn't post the whole code up top (because it is quite a lot).
You need to load the model from disk and manually set all layers.
with Model.define_operators({'**': clone, '>>': chain}):
    single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
    pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

tm = torch.load("models/de_best_modelallpairs")

# read each parameter once so thinc allocates the memory before assignment
for l in pairs_model._layers + single_model._layers:
    l.W
    l.b

# the torch state dict keys pair_top/single_top .0, .3, .6, .9, .10 map to thinc layers 0-4
pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()
pairs_model._layers[0].b = tm["pair_top.0.bias"].cpu().numpy()
pairs_model._layers[1].W = tm["pair_top.3.weight"].cpu().numpy()
pairs_model._layers[1].b = tm["pair_top.3.bias"].cpu().numpy()
pairs_model._layers[2].W = tm["pair_top.6.weight"].cpu().numpy()
pairs_model._layers[2].b = tm["pair_top.6.bias"].cpu().numpy()
pairs_model._layers[3].W = tm["pair_top.9.weight"].cpu().numpy()
pairs_model._layers[3].b = tm["pair_top.9.bias"].cpu().numpy()
pairs_model._layers[4].W = tm["pair_top.10.weight"].cpu().numpy()
pairs_model._layers[4].b = tm["pair_top.10.bias"].cpu().numpy()
single_model._layers[0].W = tm["single_top.0.weight"].cpu().numpy()
single_model._layers[0].b = tm["single_top.0.bias"].cpu().numpy()
single_model._layers[1].W = tm["single_top.3.weight"].cpu().numpy()
single_model._layers[1].b = tm["single_top.3.bias"].cpu().numpy()
single_model._layers[2].W = tm["single_top.6.weight"].cpu().numpy()
single_model._layers[2].b = tm["single_top.6.bias"].cpu().numpy()
single_model._layers[3].W = tm["single_top.9.weight"].cpu().numpy()
single_model._layers[3].b = tm["single_top.9.bias"].cpu().numpy()
single_model._layers[4].W = tm["single_top.10.weight"].cpu().numpy()
single_model._layers[4].b = tm["single_top.10.bias"].cpu().numpy()
Try this.
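The same assignments written as a loop, just a more compact restatement of the mapping above (torch Sequential indices 0, 3, 6, 9, 10 correspond to thinc layers 0 through 4):

for thinc_idx, torch_idx in enumerate([0, 3, 6, 9, 10]):
    for prefix, model in (("pair_top", pairs_model), ("single_top", single_model)):
        model._layers[thinc_idx].W = tm["%s.%d.weight" % (prefix, torch_idx)].cpu().numpy()
        model._layers[thinc_idx].b = tm["%s.%d.bias" % (prefix, torch_idx)].cpu().numpy()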
Ok, I will try it.
Thank you very much for your help, but I keep getting the same shape mismatch error. I will look at all my code again from the beginning and try to find the cause of this mismatch error.
After looking at my code again, I think there are some constants that I haven't changed, because the shapes of single_model and pairs_model are correct (they should be 3924 and 7870), but the input of neuralcoref is still stuck at 668 (which would be correct if the embedding size were 50). I have changed the constants SIZE_SPAN and SIZE_EMBEDDING in the utils.py file to 1500 and 300, but nothing changes.
@FabianKaiser, are you working with embedding vectors of size 300 or 50? Should I modify something in the .neuralcoref_cache folder? Should I modify the files in the weights folder? I really appreciate your help.
Hey. I must admit I am not sure anymore. I have changed the model in the .neuralcoref_cache. With
pairs_model.to_disk("custom_model/pairs_model")
single_model.to_disk("custom_model/single_model")
you can save the created models to disk. Then, you can save the spacy vectors with
nlp = spacy.load(model)
nlp.vocab.vectors.to_disk(path)
Then you can simply overwrite the components of the neuralcoref_cache (keep the cfg). I am not sure, however, whether that is necessary or just an artifact of me trying to get it running. I did modify all sorts of things (neuralcoref.pyx etc.), but I can't say anymore which changes are needed. Sorry :/
I am also working with 300.
You have to modify the static/tuned vectors/vocabulary files in the weights folder for training. The rest should be created automatically.
@FabianKaiser, thank you very much for your time and your help. The error persists, so I will try to train the model again with the word embeddings provided by spacy. Thank you very much, and I hope I didn't bother you.
@FabianKaiser hello, after searching, I think the spacy version is what is causing the problem. Can you please tell me which version of spacy you are using?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Any update on this?
Hello, I am very sorry. I did not look into it anymore, because even though I got it running, the results were catastrophic - probably because I would have needed to adapt it to the correct German tags, of which I had zero knowledge. I have not looked into any neuralcoref issues for a year, so I unfortunately cannot help you here. I can only say that I got there through very lengthy and thorough debugging, and in the end the model was not worth it :/
Dear guys,
Thank you so much for your interesting work. I was able to train a new model based on this instruction and this blog post. However, I could not find a manual anywhere on how to load the trained model.
To understand how the model is loaded using the
add_to_pipe
function, I downloaded the model from this URL and unzipped it. Inside, I saw the static_vectors and tuned_vectors. I guess those are exactly like the ones I used to train the model. However, I also see a new file, key2row, and I don't know what it is or how to construct it. Can someone please give me a short instruction on how to do inference with the trained model? Thank you so much!