shon-otmazgin / fastcoref

MIT License

Training Error: `RuntimeError: Could not infer dtype of NoneType` #39

Open davidberenstein1957 opened 1 year ago

davidberenstein1957 commented 1 year ago

Hi, I was trying to distill a model, but training failed with the error below.

import os
from fastcoref import TrainingArgs, CorefTrainer, LingMessCoref

texts = ["My sister has a dog. She loves him. Some like to play football, others like basketball.", "Paul Allen was born on January 21, 1953, in Seattle, Washington, to Kenneth Sam Allen and Edna Faye Allen. Allen attended Lakeside School, a private school in Seattle, where he befriended Bill Gates, two years younger, with whom he shared an enthusiasm for computers. Paul and Bill used a teletype terminal at their high school, Lakeside, to develop their programming skills on several time-sharing computer systems."]

model = LingMessCoref()
model.predict(texts=texts[0], output_file='train_file_with_clusters.jsonlines')

args = TrainingArgs(
    output_dir='test-trainer',
    overwrite_output_dir=True,
    model_name_or_path='roberta-base',
    device="mps",
    # device='cuda:2',
    # epochs=129,
    # logging_steps=100,
    # eval_steps=100
)   # you can control other arguments such as learning head and others.

trainer = CorefTrainer(
    args=args,
    train_file='train_file_with_clusters_save.jsonlines', 
    # nlp=nlp # optional, for custom nlp class from spacy
)
trainer.train()
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 21
      5 args = TrainingArgs(
      6     output_dir='test-trainer',
      7     overwrite_output_dir=True,
   (...)
     13     # eval_steps=100
     14 )   # you can control other arguments such as learning head and others.
     16 trainer = CorefTrainer(
     17     args=args,
     18     train_file='train_file_with_clusters_save.jsonlines', 
     19     # nlp=nlp # optional, for custom nlp class from spacy
     20 )
---> 21 trainer.train()
     22 # trainer.evaluate(test=True)
     24 trainer.push_to_hub('f-coref-xlm-roberta-base-ontonotes5')

File ~/Documents/programming/open-source/coref-test/.venv/lib/python3.10/site-packages/fastcoref/trainer.py:197, in CorefTrainer.train(self)
    195 batch['input_ids'] = torch.tensor(batch['input_ids'], device=self.device)
    196 batch['attention_mask'] = torch.tensor(batch['attention_mask'], device=self.device)
--> 197 batch['gold_clusters'] = torch.tensor(batch['gold_clusters'], device=self.device)
    198 if 'leftovers' in batch:
    199     batch['leftovers']['input_ids'] = torch.tensor(batch['leftovers']['input_ids'], device=self.device)

RuntimeError: Could not infer dtype of NoneType
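
The message itself is what `torch.tensor(None)` raises, so the traceback suggests `batch['gold_clusters']` was `None` for at least one batch. One thing that may be worth checking is whether every record in the training file carries cluster annotations. The snippet below is only a rough sketch for that check: the helper `records_missing_clusters` is hypothetical (not part of fastcoref), and the field name `clusters` is an assumption about the jsonlines layout, so adjust it to whatever your file actually contains.

```python
import json


def records_missing_clusters(path, key="clusters"):
    """Return 1-based line numbers of records whose `key` is absent or null.

    `key` defaults to "clusters", which is an assumed field name.
    """
    missing = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines between records
            record = json.loads(line)
            if record.get(key) is None:
                missing.append(lineno)
    return missing
```

Calling `records_missing_clusters('train_file_with_clusters_save.jsonlines')` returns an empty list if every record has the field. An `open()` failure here would also show whether the file named in `train_file` exists at all, since the snippet above writes predictions to `train_file_with_clusters.jsonlines` but trains from `train_file_with_clusters_save.jsonlines`.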

shon-otmazgin commented 1 year ago

Which version are you using? By the way, I haven't tested it with mps. Can you upgrade to the latest fastcoref version and run it on cpu?
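
A quick way to answer the version question is to ask the Python packaging metadata directly. This is a minimal sketch using only the standard library. The helper name `installed_version` is made up for illustration, and it reports `None` when a package is not installed in the current environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions that are relevant to this report.
print("fastcoref:", installed_version("fastcoref"))
print("torch:", installed_version("torch"))
```

After upgrading (for example with `pip install -U fastcoref`), the CPU run suggested above should only need `device="cpu"` in place of `device="mps"` in the `TrainingArgs` call from the original snippet.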