richardpaulhudson / coreferee

Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further languages
MIT License
102 stars, 16 forks

Finetuning on my own data #19

Closed: Tanmay98 closed this issue 1 year ago

Tanmay98 commented 1 year ago

Hi @richardpaulhudson,

Earlier I trained my custom spaCy NER model on the LitBank dataset, and that worked. But when I tried training on my own data, the coref_chains attribute doesn't seem to mark any text as true. Can you help me? How should I proceed? I have also attached the self-annotated sample dataset; can you check whether it looks all right?

Thanks in advance! Link to custom dataset: https://drive.google.com/drive/folders/1WzRogtvg81TMCHmVR0Kw4iqrbVWCFgO7?usp=sharing

Tanmay98 commented 1 year ago

[Image: Screenshot 2022-12-20 at 11:39:55 AM]

The screenshot shows the mentioned_labels_to_span_sets dictionary that is being parsed by the RulesAnalyzerFactory class, but I think it doesn't set any attributes to true for training. Does this mean I have to add new rules for the ACT entity, since I have already added a noun phrase dictionary as you mentioned before? Am I doing something wrong?

Thanks for your help!