afshinrahimi / mmner

Massively Multilingual Transfer for NER
Apache License 2.0
I already have target sentences and annotated labels. #2

huzechuan closed this issue 4 years ago

huzechuan commented 4 years ago

Hi, I already have the target sentences and annotated labels. I just want to convert my file to a format that BEA can run on, which means I can't simply run:

python main.py -m uaggexport -dir_input mydata -dir_output outputdata

So my question is: what format should my data be in to run the command above? Or is there another way to convert my file to the format of the raw data you provide in the bea_code folder?

BTW, my file format is:

WORD SOURCE1 SOURCE2 SOURCE3 GOLD
Bill O B-PER B-PER B-PER
Gates O I-PER I-PER I-PER
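For reference, the column layout described above (a word, one tag per source, then the gold tag) can be read with a short script. The following is only a sketch under stated assumptions: the names `Token`, `parse_columns`, and `majority_vote` are hypothetical, the header row is assumed to be optional, and sentences are assumed to be separated by blank lines. It does not produce the input format that the bea_code folder expects, which this thread never specifies. The token-level majority vote is a simple baseline for sanity-checking the parsed sources, not BEA.

```python
from collections import Counter
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    sources: List[str]  # one predicted tag per source, in column order
    gold: str           # last column


def parse_columns(text: str) -> List[List[Token]]:
    """Parse 'WORD SOURCE1 ... SOURCEn GOLD' lines into a list of sentences."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            # Assumption: a blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if parts[0] == "WORD" and parts[-1] == "GOLD":
            # Assumption: an optional header row, skipped if present.
            continue
        if len(parts) < 3:
            raise ValueError(f"expected word, source tags and gold tag: {line!r}")
        current.append(Token(parts[0], parts[1:-1], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def majority_vote(token: Token) -> str:
    """Most frequent source tag for one token (a baseline, not BEA)."""
    return Counter(token.sources).most_common(1)[0][0]


sample = (
    "WORD SOURCE1 SOURCE2 SOURCE3 GOLD\n"
    "Bill O B-PER B-PER B-PER\n"
    "Gates O I-PER I-PER I-PER\n"
)
sentences = parse_columns(sample)
for token in sentences[0]:
    print(token.word, token.sources, majority_vote(token), token.gold)
```

Because the source tags are taken as everything between the first and last column, the same function works when there are more than three sources.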

yuan-li commented 4 years ago

Hi,

Do you want BEA to infer the true labels in a fully unsupervised way (no gold labels used), or should it be allowed to use a few gold labels to estimate the reliability of the different sources? Are there exactly 3 sources, or possibly more?

I can prepare a notebook next week to run BEA on your file format, so that you don't need to change it :)

huzechuan commented 4 years ago

That's very kind of you. I've already fixed it. Thanks a lot.