shon-otmazgin / fastcoref

MIT License

Predicting on trained model: errors #32

Closed ianbstewart closed 1 year ago

ianbstewart commented 1 year ago

When I train a model on a toy dataset and then try to predict coreference, I get a strange error related to the char_map of the CorefResult object. Full example below:

prepare data

from fastcoref import FCoref, LingMessCoref
model = FCoref(device='cpu')
texts = ['We are so happy to see you using our coref package. This package is very fast!']
preds = model.predict(texts=texts, output_file='train_file_with_clusters.jsonlines')

training

from fastcoref import TrainingArgs, CorefTrainer
args = TrainingArgs(
    output_dir='fast_coref_fine_tune',
    overwrite_output_dir=True,
    model_name_or_path='distilroberta-base',
    device='cpu',
    epochs=1,
    logging_steps=1,
    eval_steps=1,
)  # you can control other arguments such as learning head and others.

trainer = CorefTrainer(
    args=args,
    train_file='train_file_with_clusters.jsonlines',
    dev_file='train_file_with_clusters.jsonlines',
)
trainer.train()
trainer.evaluate(test=True)

test model

out_dir = 'fast_coref_fine_tune/model/'
trained_model = FCoref(model_name_or_path=out_dir, device='cpu')
preds = trained_model.predict(
    texts=['I went to the store and saw my friend, then I saw her later at the cinema'],
)
print(preds[0])

Error:

TypeError                                 Traceback (most recent call last)
File :1
----> 1 print(preds[0])

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fastcoref/modeling.py:63, in CorefResult.__str__(self)
     61 else:
     62     text_to_print = self.text
---> 63 return f'CorefResult(text="{text_to_print}", clusters={self.get_clusters()})'

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fastcoref/modeling.py:41, in CorefResult.get_clusters(self, as_strings)
     38 if not as_strings:
     39     return [[self.char_map[mention][1] for mention in cluster] for cluster in self.clusters]
---> 41 return [[self.text[self.char_map[mention][1][0]:self.char_map[mention][1][1]] for mention in cluster]
     42         for cluster in self.clusters]

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fastcoref/modeling.py:41, in <listcomp>(.0)
     38 if not as_strings:
     39     return [[self.char_map[mention][1] for mention in cluster] for cluster in self.clusters]
---> 41 return [[self.text[self.char_map[mention][1][0]:self.char_map[mention][1][1]] for mention in cluster]
     42         for cluster in self.clusters]

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.9/site-packages/fastcoref/modeling.py:41, in <listcomp>(.0)
     38 if not as_strings:
     39     return [[self.char_map[mention][1] for mention in cluster] for cluster in self.clusters]
---> 41 return [[self.text[self.char_map[mention][1][0]:self.char_map[mention][1][1]] for mention in cluster]
     42         for cluster in self.clusters]

TypeError: 'NoneType' object is not subscriptable

shon-otmazgin commented 1 year ago

Hello @ianbstewart, I was able to reproduce it. I will debug it today or tomorrow. Thanks!

shon-otmazgin commented 1 year ago

OK I think I fixed it.

Since the model was trained on just 1 example, it failed to produce spans that are "really" coreferring. Some of the spans the model suggested covered the entire text, and some even included a special token, and those raised an exception.
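The failure mode described above can be sketched without fastcoref. This is only an illustration under assumptions, not the library's code and not necessarily how the fix works: the `char_map` layout (token span mapped to a pair whose second item is a character span, or `None` when the span cannot be mapped back to the text), the token indices, and the helper names `clusters_as_strings` and `drop_unmappable` are all hypothetical. Only the indexing pattern is taken from the traceback.

```python
# Hypothetical stand-in for the data behind CorefResult; not fastcoref's actual code.
text = 'I went to the store and saw my friend, then I saw her later at the cinema'

# Assumed layout: token span -> (token span, char span).
# The char span is None when the token span cannot be mapped back to the text,
# for example because it includes a special token.
char_map = {
    (1, 1): ((1, 1), (0, 1)),      # "I"
    (8, 9): ((8, 9), (28, 37)),    # "my friend"
    (14, 14): ((14, 14), (50, 53)),  # "her"
    (0, 19): ((0, 19), None),      # degenerate span covering special tokens
}

# Token-level clusters as an undertrained model might emit them (made up).
clusters = [[(8, 9), (14, 14)], [(1, 1), (0, 19)]]


def clusters_as_strings(text, char_map, clusters):
    """Same indexing pattern as in the traceback; raises TypeError on a None char span."""
    return [[text[char_map[m][1][0]:char_map[m][1][1]] for m in cluster]
            for cluster in clusters]


def drop_unmappable(char_map, clusters):
    """Keep mentions that have a char span, then drop clusters left with < 2 mentions."""
    kept = []
    for cluster in clusters:
        mentions = [m for m in cluster
                    if char_map.get(m) is not None and char_map[m][1] is not None]
        if len(mentions) > 1:
            kept.append(mentions)
    return kept


if __name__ == '__main__':
    try:
        clusters_as_strings(text, char_map, clusters)
    except TypeError as err:
        print('unfiltered clusters fail:', err)
    print(clusters_as_strings(text, char_map, drop_unmappable(char_map, clusters)))
```

Filtering before converting to strings is one possible guard. A model trained on a single example may still return no usable clusters at all, in which case the filtered result is simply an empty list.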

I released a new version, 2.1.5, to PyPI.

You can install it with pip install fastcoref==2.1.5

Let me know if it helped
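To confirm that an environment actually picked up the fixed release, a small check along these lines may help. It is a sketch using only the standard library; the helper names `parse_version`, `has_fix`, and `installed_fastcoref_version` are made up for illustration, and the version parsing is deliberately simplistic (it only looks at the first three numeric components).

```python
import re
from importlib.metadata import PackageNotFoundError, version

FIXED_VERSION = (2, 1, 5)  # the release mentioned above as containing the fix


def parse_version(text):
    """Turn a string such as '2.1.5' into a tuple of up to three integers."""
    return tuple(int(n) for n in re.findall(r'\d+', text)[:3])


def has_fix(installed):
    """True if the given version string is at least FIXED_VERSION."""
    return parse_version(installed) >= FIXED_VERSION


def installed_fastcoref_version():
    """Return the installed fastcoref version string, or None if it is not installed."""
    try:
        return version('fastcoref')
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    found = installed_fastcoref_version()
    if found is None:
        print('fastcoref is not installed')
    elif has_fix(found):
        print(f'fastcoref {found} includes the fix')
    else:
        print(f'fastcoref {found} predates 2.1.5; consider upgrading')
```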

ianbstewart commented 1 year ago

Yup, it works now thanks! New output from the same code above:

pred = CorefResult(text="I went to the store and saw my friend, then I saw ...", clusters=[])