Closed jayant-yadav closed 11 months ago
Hello!
This warning is only thrown when document-level context is present during evaluation though, i.e. when:
So I think the old warning was correct? If the model was trained without document-level context and the evaluation is also without document-level context, then we don't need a warning I think. But please let me know if I'm overlooking something!
I guess so. What confused me were lines 243 and 255. One says performance would decrease if document-level context was provided, and the other says the opposite. My understanding was that the absence of document-level context would decrease performance in any case. But maybe it's not like that. Please close this issue if you think that's the case.
Thank you for the quick response though! I got interested in your Master's thesis work since mine was along the same lines, but with biomedical/clinical data, so I could not release the trained models in open domains.
> I guess so. What confused me were lines 243 and 255.
I see! That's an easy mistake. The two warnings cover the two cases where the model sees a different type of data during evaluation than it saw during training:

| | Document-level context during evaluation | No document-level context during evaluation |
|---|---|---|
| Document-level context during training | All good! No warnings | Warning from 255! |
| No document-level context during training | Warning from 243! | All good! No warnings |
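
For anyone reading along, the table above can be sketched as a small check. This is only an illustration of the logic, not the library's actual code: the function names, the `DOCUMENT_COLUMNS` constant, and the warning texts are all hypothetical, and it assumes document-level context simply means that both `document_id` and `sentence_id` columns are present.

```python
from typing import Iterable, Optional

# Hypothetical constant: assumes document-level context is signalled by these two columns.
DOCUMENT_COLUMNS = {"document_id", "sentence_id"}


def has_document_context(column_names: Iterable[str]) -> bool:
    """Return True only if both document-level metadata columns are present."""
    return DOCUMENT_COLUMNS.issubset(column_names)


def context_mismatch_warning(trained_with_context: bool, eval_with_context: bool) -> Optional[str]:
    """Return an illustrative warning when training and evaluation disagree, else None."""
    if trained_with_context == eval_with_context:
        # Both "All good!" cells of the table: the settings match.
        return None
    if eval_with_context:
        # Trained without context, evaluated with it (the "243" cell).
        return "Model was trained without document-level context, but the evaluation data has it."
    # Trained with context, evaluated without it (the "255" cell).
    return "Model was trained with document-level context, but the evaluation data lacks it."
```

So a warning only appears on a mismatch; when neither training nor evaluation uses document-level context, the function returns `None`, which matches the bottom-right cell.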
And I'm glad I'm not the only one who fancies NER enough to write a thesis about it! It's a fascinating task in my opinion. And biomedical/clinical data uses are very important! I trained a few example models on public data here, in case you're curious.
@tomaarsen Thank you for the clarification. I will close this issue since this is not a valid one.
Without document-level context, i.e., in the absence of `document_id` and `sentence_id`, evaluation should throw the warning that evaluating without this metadata will decrease performance.