huggingface / neuralcoref

✨Fast Coreference Resolution in spaCy with Neural Networks
https://huggingface.co/coref/
MIT License

Load new trained model #257

Open EricLe-dev opened 4 years ago

EricLe-dev commented 4 years ago

Dear guys,

Thank you so much for your interesting work. I was able to train a new model based on this instruction and this blog post. However, I could not find anywhere a manual on how to load the trained model.

To understand how the model is loaded using the add_to_pipe function, I downloaded the model from this URL and unzipped it. Inside, I could see the static_vectors and tuned_vectors. I guess those are exactly like the ones I used to train the model. However, I also see a new file, key2row, and I don't know what it is or how to construct it.

Can someone please give me a short instruction on how to run inference with the trained model? Thank you so much!

FabianKaiser commented 4 years ago

Hello,

So I am currently training a German coref model. While the trained model does not work yet (I assume it is a problem with the differences between English and German POS tags), I am able to load it.

The static and tuned_vectors and key2row are not the ones you use for training. Those are spacy vectors. You can create them by saving the spacy vectors:

nlp = spacy.load(model)
nlp.vocab.vectors.to_disk(path)
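
Incidentally, that also covers the key2row file from the original question: spacy saves key2row together with the vector table, so you never build it by hand. A minimal sketch (model name and path are just examples):

    import spacy

    nlp = spacy.load("de_core_news_md")      # any model with vectors
    nlp.vocab.vectors.to_disk("my_vectors")  # writes the vector table plus key2row

    # key2row maps lexeme key hashes to rows of the vector table;
    # it can be loaded back the same way
    nlp.vocab.vectors.from_disk("my_vectors")
    print(nlp.vocab.vectors.shape)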

The vocabulary there should be the same as the one used to create the static and tuned vocabulary you use for training. I actually wrote a script to create the static and tuned vocabulary from a German spacy model.

The bigger problem is that, as of now, you also need the trained model as a thinc model, while it is returned as a torch model. I simply created a new thinc model with the same sizes and manually adjusted all weights. I do not think this is the right method, but nothing else worked on the fly.

    from thinc.v2v import Model, ReLu, Affine
    from thinc.api import chain, clone
    import torch

    # h1, h2, h3 and the SIZE_*_IN constants must match the training setup
    with Model.define_operators({'**': clone, '>>': chain}):
        single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
        pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

    tm = torch.load("de_best_modelallpairs")

    # repeat for all layers, and for both the single and the pairs model
    pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()
    pairs_model._layers[0].b = tm["pair_top.0.bias"].cpu().numpy()

You can then save this model to disk (maybe simply replace the ones in the zip) and load it.
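
For the saving step, thinc models have a to_disk method; a minimal sketch (output paths are arbitrary):

    single_model.to_disk("custom_model/single_model")
    pairs_model.to_disk("custom_model/pairs_model")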

firasfrikha commented 4 years ago

Hello, I have tried to load the model after creating it with thinc. Which version of thinc are you using? I can't find ReLu in thinc; I'm using thinc 7.0.8.

FabianKaiser commented 4 years ago

Hello, I am using thinc 7.4

My imports are as follows:

from thinc.v2v import Model, ReLu, Affine
from thinc.api import chain, clone
firasfrikha commented 4 years ago

OK, thank you @FabianKaiser for your response. Have you tried to load the model? I'm getting an error while adding the model to the pipeline; it says "ExtraData: unpack(b) received extra data."

FabianKaiser commented 4 years ago

Hi, yes, sorry. Use the following to add the model to your pipeline:

    import spacy
    from neuralcoref import NeuralCoref

    nlp = spacy.load("de_core_news_md")

    nc = NeuralCoref(nlp.vocab)
    nc.model = single_model, pairs_model

    nlp.add_pipe(nc, "neuralcoref")
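
Once the pipe is added, a quick smoke test (the sentence is just an example):

    doc = nlp("Meine Schwester hat einen Hund. Sie liebt ihn.")
    print(doc._.has_coref)
    print(doc._.coref_clusters)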
firasfrikha commented 4 years ago

Hello, when I execute this code:

    pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()

I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-6be9d2127022> in <module>
----> 1 pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/describe.py in __set__(self, obj, val)
     47     def __set__(self, obj, val):
     48         data = obj._mem.get((obj.id, self.name))
---> 49         data[:] = val
     50
     51

TypeError: 'NoneType' object does not support item assignment
FabianKaiser commented 4 years ago

Hey.

I do not really know why this happens, but try accessing them once before:

    # touching W and b once seems to force thinc to allocate the parameter
    # memory, so the later assignments have a buffer to write into
    for l in pairs_model._layers + single_model._layers:
        l.W
        l.b
firasfrikha commented 4 years ago

Thank you @FabianKaiser for your time, I appreciate your help. I want to ask whether you have tried to load your self-trained model and test it?

firasfrikha commented 4 years ago

Thank you @FabianKaiser. Indeed, when I execute this code

    for l in pairs_model._layers + single_model._layers:
        l.W
        l.b

I don't get that error anymore, and I don't know why.

FabianKaiser commented 4 years ago

Yeah no idea. I only found out by chance.

firasfrikha commented 4 years ago

Thank you @FabianKaiser. I just want to know whether you have tried to load your own model and test it?

FabianKaiser commented 4 years ago

Yes, I did, but it didn't really work. I noticed the problem already during training (I also opened another issue, https://github.com/huggingface/neuralcoref/issues/264 , and while I make some progress in general, the training still doesn't work). I am currently trying to resolve the training issues, but I feel my linguistic knowledge is somewhat lacking.

firasfrikha commented 4 years ago

I see. Well, in my case I have tried to train a model for French, and then tried to add it to the pipeline, but I keep getting errors. I don't know if anyone has succeeded in training this model and testing it in another language.

firasfrikha commented 4 years ago

After adding the model to the nlp pipeline, I get this error:

---------------------------------------------------------------------------
ShapeMismatchError                        Traceback (most recent call last)
<ipython-input-24-ca95ba91260e> in <module>
----> 1 doc = nlp('je')

~/anaconda3/envs/coref1/lib/python3.8/site-packages/spacy/language.py in __call__(self, text, disable, component_cfg)
   400             if not hasattr(proc, "__call__"):
   401                 raise ValueError(Errors.E003.format(component=type(proc), name=name))
--> 402             doc = proc(doc, **component_cfg.get(name, {}))
   403             if doc is None:
   404                 raise ValueError(Errors.E005.format(name=name))

neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.__call__()

neuralcoref.pyx in neuralcoref.neuralcoref.NeuralCoref.predict()

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
   167             Must match expected shape
  168         """
--> 169         return self.predict(x)
   170 
   171     def pipe(self, stream, batch_size=128):

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/neural/_classes/feed_forward.py in predict(self, X)
    38     def predict(self, X):
    39         for layer in self._layers:
---> 40             X = layer(X)
    41         return X
    42 

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/neural/_classes/model.py in __call__(self, x)
   167             Must match expected shape
   168         """
--> 169         return self.predict(x)
   170 
   171     def pipe(self, stream, batch_size=128):

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/check.py in checked_function(wrapped, instance, args, kwargs)
   148                 if not isinstance(check, Callable):
   149                     raise ExpectedTypeError(check, ["Callable"])
--> 150                 check(arg_id, fix_args, kwargs)
   151         return wrapped(*args, **kwargs)
   152 

~/anaconda3/envs/coref1/lib/python3.8/site-packages/thinc/check.py in has_shape_inner(arg_id, args, kwargs)
    67             # Allow underspecified dimensions
    68             if dim is not None and arg.shape[i] != dim:
---> 69                 raise ShapeMismatchError(arg.shape, shape_values, shape)
    70 
    71     return has_shape_inner

ShapeMismatchError: 

 Shape mismatch: input (0, 668) not compatible with [None, 3924].

 Traceback:
 ├─ __call__ in /home/equalios/anaconda3/envs/coref1/lib/python3.8/site-packages/spacy/language.py:402
 ├─── __call__ in neural/_classes/model.py:169
 └───── predict in neural/_classes/feed_forward.py:40
        >>> X = layer(X)
FabianKaiser commented 4 years ago

Yes. This is because your embeddings have a different size compared to the size set in the trained model. What did you set

SIZE_EMBEDDING
SIZE_WORD
SIZE_PAIR_FEATS
SIZE_SNGL_FEATS
SIZE_GENRE

to during your training? And what are they set to right now when using them?

firasfrikha commented 4 years ago

Thank you @FabianKaiser for your time. When training the model I set them as follows:

SIZE_SPAN = 1500
SIZE_EMBEDDING = 300
SIZE_WORD = 8
SIZE_FS = 24   # left at the default
SIZE_FP = 70   # left at the default
SIZE_MENTION_EMBEDDING = (SIZE_SPAN + SIZE_WORD * SIZE_EMBEDDING)
SIZE_SINGLE_IN = (SIZE_MENTION_EMBEDDING + SIZE_FS)
SIZE_PAIR_IN = (2 * SIZE_MENTION_EMBEDDING + SIZE_FP)

and I haven't changed any of the other constants.

When using them now, I haven't changed them, because I didn't know where to change them, so I left them at their defaults.
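
For reference, here is how those constants combine (plain arithmetic on the values above), which is where the 3924 and 7870 dimensions in this thread come from:

    SIZE_MENTION_EMBEDDING = 1500 + 8 * 300   # = 3900
    SIZE_SINGLE_IN = 3900 + 24                # = 3924
    SIZE_PAIR_IN = 2 * 3900 + 70              # = 7870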

FabianKaiser commented 4 years ago

Then what embeddings did you use to train your model (static/tuned_word_embeddings), and what kind of spacy model do you use them with? Those embeddings need to have the same dimensions (which is why I created the static/tuned_word_embeddings anew, based on the spacy model I am using).

firasfrikha commented 4 years ago

I downloaded word embeddings from fastText to train my model, and I used the "fr_core_news_md" spacy model with them. It may be that these embedding vectors don't have the same shape, because fastText uses word embedding vectors of length 300, but I don't know the length of the vectors spacy is using.

FabianKaiser commented 4 years ago

To find out the length of the spacy vectors, try

nlp.vocab.vectors.shape

The second value is the length.
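
For example, a quick way to compare both widths (a sketch; the .npy file name is whatever you saved your training embeddings as):

    import numpy as np
    import spacy

    nlp = spacy.load("fr_core_news_md")
    print(nlp.vocab.vectors.shape)   # (rows, width), e.g. (20000, 300)

    emb = np.load("static_word_embeddings.npy")
    print(emb.shape)                 # the widths (second value) must match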

firasfrikha commented 4 years ago

The embedding vectors I'm using have the shape (23600, 300); the embedding vectors of spacy have the shape (20000, 300). Should I use the same embedding vectors as spacy when I'm training the model?

firasfrikha commented 4 years ago

The problem is that I'm not getting the meaning of the shapes 668 and 3924 that the error message shows.

firasfrikha commented 4 years ago

3924 is the value of SIZE_SINGLE_IN in my training.

firasfrikha commented 4 years ago

Sorry @FabianKaiser for so many questions, but have you tried to add your model to the pipeline and use it? Then I can tell whether this error is related to my work or not. I have successfully added my model to the pipeline, but when I type this line, for example:

doc = nlp("je m'appelle Firas")

I get the size mismatch error I mentioned before.

FabianKaiser commented 4 years ago

Yes, sure. So I added it and tried the example sentence (translated to German).

doc = nlp("Meine Schwester hat einen Hund. Sie liebt ihn.")
[Meine Schwester, einen Hund, Sie, ihn]
[0, 1, 2, 3]
[([], {Meine Schwester: {Meine Schwester: 19.200767517089844}, einen Hund: {einen Hund: 4.233463764190674, Meine Schwester: -112.22234344482422}, Sie: {Sie: 14.66589641571045, Meine Schwester: -97.18143463134766, einen Hund: -74.71046447753906}, ihn: {ihn: 17.27436065673828, Meine Schwester: -123.94184875488281, einen Hund: -96.95196533203125, Sie: -93.78512573242188}})]

doc._.has_coref
False

So basically I added some outputs, and you can see that none of the words is referred to another one (which is wrong), and the output says there are no corefs (which is also wrong). So while it works in general, there is still a training issue (it might be a bit better now; this is an older model).

FabianKaiser commented 4 years ago

The problem is that I'm not getting the meaning of the shapes 668 and 3924 that the error message shows.

So now I am not sure. What did you set

SIZE_SINGLE_IN
SIZE_PAIR_IN
h1, h2, h3

in

    with Model.define_operators({'**': clone, '>>': chain}):
        single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
        pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

I have

h1 = 1000
h2 = h3 = 500
SIZE_PAIR_IN = 7870
SIZE_SINGLE_IN = 3924

Because the mismatch happens when the pipeline computes the embeddings and tries to feed them to your model.

Actually, can you send me the whole code where you add the model to your pipeline?

firasfrikha commented 4 years ago

Thank you very much @FabianKaiser. So the model loads correctly; you just have a problem with precision and accuracy. I will try to modify the parameters to make mine work. And for the word embedding vectors, I think you used the ones from spacy?

firasfrikha commented 4 years ago

OK, I will send you the code.

firasfrikha commented 4 years ago

Can you please tell me which version of spacy you are using?

FabianKaiser commented 4 years ago

Actually the precision is pretty good (99% or so), but the recall is more or less 5%. I use spacy 2.2.4. I did use the spacy vectors for training (I wrote an extra function to build the vector and vocabulary files, to convert the spacy vectors to the required format).

firasfrikha commented 4 years ago

I see. I will try to repeat my work from scratch. Can you do me a favor and tell me how I can transform the spacy vectors to the required format?

firasfrikha commented 4 years ago

Are you training your model on CPU or GPU?

FabianKaiser commented 4 years ago

Yeah, so this is the code I use to transform the spacy vectors.

    import os

    import numpy as np
    import spacy

    directory = "data/ParCorFull_DE/"
    sub_dirs = ["dev", "train", "test"]

    nlp = spacy.load('de_core_news_md')

    if not os.path.exists(directory + "train/numpy/"):
        os.mkdir(directory + "train/numpy/")

    save_directory = "dev/neuralcoref-master/neuralcoref/train/weights/"

    vocab_filename = "_word_vocabulary.txt"
    word_embeddings = []
    # write every word that has a vector into both vocabulary files,
    # collecting the vectors in the same order as the words
    with open(save_directory + "tuned" + vocab_filename, "w", encoding="utf-8", errors="strict") as tuned_vocab_file:
        with open(save_directory + "static" + vocab_filename, "w", encoding="utf-8", errors="strict") as static_vocab_file:
            for key, vector in nlp.vocab.vectors.items():
                try:
                    word_string = nlp.vocab.strings[key]
                except KeyError:
                    continue
                tuned_vocab_file.write(word_string + '\n')
                static_vocab_file.write(word_string + '\n')
                word_embeddings.append(vector)

    np.save(save_directory + "tuned_word_embeddings.npy", np.array(word_embeddings))
    np.save(save_directory + "static_word_embeddings.npy", np.array(word_embeddings))

This conversion is also more of a hack.

I am training on GPU.

firasfrikha commented 4 years ago

I'm also training on GPU. I'm trying to train a coreference model for my end-of-studies internship, that's why I'm asking so much :p Here is the full code:

import spacy
import torch
import neuralcoref
from thinc.v2v import Model, ReLu, Affine
from thinc.api import chain, clone

SIZE_SPAN = 1500
SIZE_EMBEDDING = 300
SIZE_WORD = 8
SIZE_FS = 24
SIZE_FP = 70

SIZE_MENTION_EMBEDDING = (SIZE_SPAN + SIZE_WORD * SIZE_EMBEDDING)
SIZE_SINGLE_IN = (SIZE_MENTION_EMBEDDING + SIZE_FS)
SIZE_PAIR_IN = (2 * SIZE_MENTION_EMBEDDING + SIZE_FP)

h1 = 1000
h2 = 500
h3 = 500

with Model.define_operators({'**': clone, '>>': chain}):
    single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
    pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

pairs_model.input_shape
(None, 7870)

# tm is the torch model loaded earlier with torch.load
single_model._layers[0].W = tm["single_top.0.weight"].cpu().numpy()
single_model._layers[0].b = tm["single_top.0.bias"].cpu().numpy()

single_model.input_shape
(None, 3924)

nlp = spacy.load("fr_core_news_md")
nc = neuralcoref.NeuralCoref(nlp.vocab)
nc.model = single_model, pairs_model
nlp.add_pipe(nc, "neuralcoref")
doc = nlp("je suis firas")

When I execute the final line, I get the mismatch error.

FabianKaiser commented 4 years ago

Ok, sorry. I didn't post the whole code up top (because it is quite a lot).

You need to load the model from disk and manually set all layers.

    with Model.define_operators({'**': clone, '>>': chain}):
        single_model = ReLu(h1, SIZE_SINGLE_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)
        pairs_model = ReLu(h1, SIZE_PAIR_IN) >> ReLu(h2, h1) >> ReLu(h3, h2) >> Affine(1, h3) >> Affine(1, 1)

    tm = torch.load("models/de_best_modelallpairs")

    for l in pairs_model._layers + single_model._layers:
        l.W
        l.b

    pairs_model._layers[0].W = tm["pair_top.0.weight"].cpu().numpy()
    pairs_model._layers[0].b = tm["pair_top.0.bias"].cpu().numpy()

    pairs_model._layers[1].W = tm["pair_top.3.weight"].cpu().numpy()
    pairs_model._layers[1].b = tm["pair_top.3.bias"].cpu().numpy()

    pairs_model._layers[2].W = tm["pair_top.6.weight"].cpu().numpy()
    pairs_model._layers[2].b = tm["pair_top.6.bias"].cpu().numpy()

    pairs_model._layers[3].W = tm["pair_top.9.weight"].cpu().numpy()
    pairs_model._layers[3].b = tm["pair_top.9.bias"].cpu().numpy()

    pairs_model._layers[4].W = tm["pair_top.10.weight"].cpu().numpy()
    pairs_model._layers[4].b = tm["pair_top.10.bias"].cpu().numpy()

    single_model._layers[0].W = tm["single_top.0.weight"].cpu().numpy()
    single_model._layers[0].b = tm["single_top.0.bias"].cpu().numpy()

    single_model._layers[1].W = tm["single_top.3.weight"].cpu().numpy()
    single_model._layers[1].b = tm["single_top.3.bias"].cpu().numpy()

    single_model._layers[2].W = tm["single_top.6.weight"].cpu().numpy()
    single_model._layers[2].b = tm["single_top.6.bias"].cpu().numpy()

    single_model._layers[3].W = tm["single_top.9.weight"].cpu().numpy()
    single_model._layers[3].b = tm["single_top.9.bias"].cpu().numpy()

    single_model._layers[4].W = tm["single_top.10.weight"].cpu().numpy()
    single_model._layers[4].b = tm["single_top.10.bias"].cpu().numpy()

Try this.
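
By the way, the repetitive assignments above can be collapsed into a loop (a sketch doing the same thing; the indices 0, 3, 6, 9, 10 are the torch Sequential positions visible in the key names above):

    layer_map = [0, 3, 6, 9, 10]
    for model, prefix in [(pairs_model, "pair_top"), (single_model, "single_top")]:
        for thinc_idx, torch_idx in enumerate(layer_map):
            model._layers[thinc_idx].W = tm[f"{prefix}.{torch_idx}.weight"].cpu().numpy()
            model._layers[thinc_idx].b = tm[f"{prefix}.{torch_idx}.bias"].cpu().numpy()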

firasfrikha commented 4 years ago

OK, I will try it.

firasfrikha commented 4 years ago

Thank you very much for your help, but I keep getting the same shape mismatch error. I will look at all my code again from the beginning and try to find the cause of this mismatch error.

firasfrikha commented 4 years ago

After looking at my code again, I think there are some constants that I haven't changed, because the shapes of single_model and pairs_model are correct (they should be 3924 and 7870), but the input of neuralcoref is still stuck at 668 (which would be correct if the embedding size were 50). I have changed the constants SIZE_SPAN and SIZE_EMBEDDING in the utils.py file to 1500 and 300, but nothing changes.

firasfrikha commented 4 years ago

@FabianKaiser are you working with embedding vectors of size 300 or 50? Should I modify something in the .neuralcoref_cache folder? Should I modify the files in the weights folder? I really appreciate your help.

FabianKaiser commented 4 years ago

Hey. I must admit I am not sure anymore. I have changed the model in the .neuralcoref_cache. With

    pairs_model.to_disk("custom_model/pairs_model")
    single_model.to_disk("custom_model/single_model")

you can save the created models to disk. Then, you can save the spacy vectors with

nlp = spacy.load(model)
nlp.vocab.vectors.to_disk(path)

Then you can simply overwrite the components of the neuralcoref_cache (keep the cfg). I am not sure, however, whether that is necessary or just an artifact of me trying to get it running. I did modify all sorts of things (neuralcoref.pyx etc.), but I can't say anymore which changes are needed. Sorry :/

I am also working with 300.

You have to modify the static/tuned vectors and vocabulary files in the weights folder for training. The rest should be created automatically.
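
Putting these pieces together, the overwrite step might look roughly like this (a sketch under the assumptions above; the exact cache location and file names come from the unzipped pretrained model, so verify them on your machine):

    import spacy

    # save the hand-built thinc models (paths are arbitrary)
    pairs_model.to_disk("custom_model/pairs_model")
    single_model.to_disk("custom_model/single_model")

    # save the spacy vectors; this writes the vector table plus key2row
    nlp = spacy.load("de_core_news_md")
    nlp.vocab.vectors.to_disk("custom_model")

    # then replace static_vectors, tuned_vectors, key2row, single_model and
    # pairs_model inside ~/.neuralcoref_cache with these files, keeping cfg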

firasfrikha commented 4 years ago

@FabianKaiser, thank you very much for your time and your help. The error persists; I will try to train the model another time with the word embeddings provided by spacy. Thank you very much, and I hope I didn't bother you.

firasfrikha commented 4 years ago

@FabianKaiser hello, after searching, I think the spacy version is what is causing the problem. Can you please tell me which version of spacy you are using?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

aditya624 commented 3 years ago

Any update on this?

Thank you very much for your help, but I keep getting the same shape mismatch error. I will look at all my code again from the beginning and try to find the cause of this mismatch error.

FabianKaiser commented 3 years ago

Hello, I am very sorry. I did not look into it anymore because, even though I got it running, the results were catastrophic - probably because I would have needed to adapt to the correct German tags, where I had zero knowledge. I have not looked into any neuralcoref issues for a year, so unfortunately I cannot help you here. I can only say that I got there with very lengthy and thorough debugging, and in the end the model was not worth it :/