Living-with-machines / TargetedSenseDisambiguation

Repository for the work on Targeted Sense Disambiguation
MIT License

check if there is a bug in binarize/quotation filtering related to the time filter #112

Closed (kasparvonbeelen closed 3 years ago)

kasparvonbeelen commented 3 years ago

In binarize for sense 'machine_nn01-38475835':

start=1700, end=1920 results in no quotations
start=1700, end=1950 results in 144 quotations
start=1760, end=1950 results in 120 quotations?

kasparvonbeelen commented 3 years ago

Maybe the time filter for the train set should apply only to the lifespan of the senses, and the filter for the test set to when the quotations were written?
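A minimal sketch of how that split could look, purely for illustration. The names `Sense`, `Quotation`, `filter_train` and `filter_test`, and the `first_year`/`last_year` fields, are assumptions made up here and are not the repository's actual data model or the real `binarize` code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sense:
    sense_id: str
    first_year: int  # hypothetical: year of the earliest attestation of the sense
    last_year: int   # hypothetical: year of the latest attestation of the sense


@dataclass
class Quotation:
    sense_id: str
    year: int  # year the quotation was written


def sense_alive_in(sense: Sense, start: int, end: int) -> bool:
    """Return True if the sense's lifespan overlaps the period [start, end]."""
    return sense.first_year <= end and sense.last_year >= start


def filter_train(quotations: List[Quotation], senses: List[Sense],
                 start: int, end: int) -> List[Quotation]:
    """Train set: keep every quotation of a sense that was alive in the period,
    regardless of the year the quotation itself was written."""
    alive = {s.sense_id for s in senses if sense_alive_in(s, start, end)}
    return [q for q in quotations if q.sense_id in alive]


def filter_test(quotations: List[Quotation], start: int, end: int) -> List[Quotation]:
    """Test set: keep only quotations written inside the period."""
    return [q for q in quotations if start <= q.year <= end]


# Toy data, invented for the example.
senses = [Sense("a", 1700, 1950), Sense("b", 1930, 2000)]
quotations = [Quotation("a", 1710), Quotation("a", 1940), Quotation("b", 1935)]

train = filter_train(quotations, senses, 1700, 1920)
test = filter_test(quotations, 1700, 1920)
```

With this toy data, the train filter keeps both quotations of sense "a" (including the one from 1940), while the test filter keeps only the 1710 quotation.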

mcollardanuy commented 3 years ago

Do you want me to have a look at this @kasparvonbeelen?

kasparvonbeelen commented 3 years ago

Yes, I was planning to. I think I know what the bug is, but I need to have a look. The solution may be trickier, but maybe we can have a chat about this once I find out what is going wrong?

mcollardanuy commented 3 years ago

Ah, great, sure! Let me know if I can help.

kasparvonbeelen commented 3 years ago

No bug as far as I can see. The number of negative examples just varies a lot depending on the selected date range. I did change the binarize function and will report this in #125.