allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0

Implementation of Weighted CRF Tagger (handling unbalanced datasets) #5676

Closed eraldoluis closed 2 years ago

eraldoluis commented 2 years ago

Closes #4619.

Dependency of allennlp-models PR #341

Changes proposed in this pull request:

epwalsh commented 2 years ago

Hi @eraldoluis, thanks for this! I may not have time for a thorough review this week but this will be a priority next week.

eraldoluis commented 2 years ago

> Hi @eraldoluis, thanks for this! I may not have time for a thorough review this week but this will be a priority next week.

Thank you, @epwalsh!

eraldoluis commented 2 years ago

Thanks a lot, @epwalsh, for the effort you put into this.

I tried to address your initial concerns. Let me know what you think of my changes.

I am looking forward to your feedback regarding the whole thing. Let me know if you have any questions. I will be happy to discuss this further if necessary.

epwalsh commented 2 years ago

Thanks for the quick responses/fixes! Changes look good. I should clarify what I meant by:

> create a new folder allennlp/modules/conditional_random_field/ and move the 3 implementations into there.

Looks like you left allennlp/modules/conditional_random_field.py where it is, and then moved the weighted CRFs into allennlp/modules/conditional_random_field_weighted/. I'd rather have a single submodule (folder) called allennlp/modules/conditional_random_field/ with all of the CRFs (including the non-weighted base class).

epwalsh commented 2 years ago

I liked your blog post a lot by the way!

eraldoluis commented 2 years ago

> Looks like you left allennlp/modules/conditional_random_field.py where it is, and then moved the weighted CRFs into allennlp/modules/conditional_random_field_weighted/. I'd rather have a single submodule (folder) called allennlp/modules/conditional_random_field/ with all of the CRFs (including the non-weighted base class).

Yes, I was unsure at first, but I have now renamed the module to conditional_random_field and moved the original class into it. I also updated the changelog, which I had forgotten.

I also updated allennlp-models to reflect the new module organization. Unfortunately, I pushed to the allennlp repository first, so the Model Tests failed because allennlp-models was outdated. These tests should pass now.

Let me know what you think.

eraldoluis commented 2 years ago

Thank you very much, @epwalsh and @dirkgr! This was my first contribution to an open-source project, and it was quite fun. I will definitely try it again soon. :)