I have a problem with the BBN dataset.

herbertchen1 commented 5 years ago

The original test.json in AFET uses '/FAC' where train.json uses '/FACILITY'.

The 12845 version actually dropped all '/FAC' mentions (including '/FAC/highway_street', ...).

I'm not sure whether other people also use this setting, which changes the final result by about 2-3% (higher or lower).

Also, there is a slight difference in the type Contact_info, which has a small effect.
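For anyone checking their own copy of the data, here is a minimal sketch of how such a label mismatch could be detected and normalized. It assumes a layout that is not confirmed in this thread: one JSON object per line, each with a "mentions" list whose items carry a "labels" list. The helper names (`load_jsonl`, `label_set`, `rename_prefix`, `normalize`) and the toy records are hypothetical, made up for illustration only.

```python
import json


def load_jsonl(path):
    """Read one JSON object per line (assumed file layout)."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def label_set(records):
    """Collect every label attached to any mention."""
    return {
        label
        for record in records
        for mention in record.get("mentions", [])
        for label in mention.get("labels", [])
    }


def rename_prefix(label, old="/FAC", new="/FACILITY"):
    """Rename whole path components only: '/FAC' and '/FAC/...' change, '/FACILITY' does not."""
    if label == old or label.startswith(old + "/"):
        return new + label[len(old):]
    return label


def normalize(records, old="/FAC", new="/FACILITY"):
    """Return copies of the records with the old prefix rewritten."""
    return [
        {
            **record,
            "mentions": [
                {**mention, "labels": [rename_prefix(l, old, new) for l in mention.get("labels", [])]}
                for mention in record.get("mentions", [])
            ],
        }
        for record in records
    ]


# Toy records, not taken from the real dataset.
train = [{"mentions": [{"labels": ["/FACILITY", "/FACILITY/highway_street"]}]}]
test = [{"mentions": [{"labels": ["/FAC", "/FAC/highway_street"]}]}]

unseen = label_set(test) - label_set(train)
print(sorted(unseen))  # ['/FAC', '/FAC/highway_street']

fixed = normalize(test)
print(sorted(label_set(fixed) - label_set(train)))  # []
```

With real files, `load_jsonl("train.json")` and `load_jsonl("test.json")` would replace the toy lists, provided the files follow the assumed layout. Whether to rename the test labels or drop those mentions is a separate choice, and as noted above it changes the reported numbers.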

abhipec commented 5 years ago

@herbertchen1 Thanks for sharing a valuable insight. I just checked, and the original training and test sets used in the AFET paper do assign different label names to "facilities" entity mentions.

One thing I can assure you of is that the results reported in the Abhishek et al. EACL 2017 paper (Table 2, marked with a dagger sign) are on exactly the same train, dev and test sets. In other words, the three models compared (AFET (baseline), Attentive (baseline), and Our) use the same sanitized train/dev/test files. I will share the modified code to run those baselines on the same train/dev/test split in a few days.

Can you list the other papers that have reported numbers on the BBN dataset? I can track down two papers that have also made their code public:

From a quick glance at 1804.08000, it seems to me that the results are reported on a different dev/test split, different even from the setting reported in the AFET paper. I will have to ask the authors or check the shared datasets. So I am also not sure how to compare results across papers.

In a week or two, I will try to do a thorough analysis of the results reported across these papers and prepare a page listing these details. Maybe it will help in reproducing the results in exactly the same setting across different papers. Please comment if you have other observations about the results reported in related papers; that will surely help in setting up a common benchmark setting :)

herbertchen1 commented 5 years ago

Thanks for your reply. As far as I can tell, the ZOE method (D18-1231) uses the original test.json file; their method doesn't need training data and builds a mapping from Freebase types to the "/FAC" types. As for 1804.08000, it seems to use a 93-type BBN; I'm not sure if that is the common dataset. As far as I know, only these papers and your paper (A Unified Labeling Approach by Pooling Diverse Datasets for Entity Typing, which is a subtle work) have used the BBN dataset recently.

herbertchen1 commented 5 years ago

If the AFET (baseline) and Attentive (baseline) results reported in your paper use the same dataset, I think it might be unnecessary to run them again; the comparison is already fair.

abhipec commented 5 years ago

> Thanks for your reply. As far as I can tell, the ZOE method (D18-1231) uses the original test.json file; their method doesn't need training data and builds a mapping from Freebase types to the "/FAC" types. As for 1804.08000, it seems to use a 93-type BBN; I'm not sure if that is the common dataset. As far as I know, only these papers and your paper (A Unified Labeling Approach by Pooling Diverse Datasets for Entity Typing, which is a subtle work) have used the BBN dataset recently.

Thanks for the info. I wanted to add one point here: in the Unified Labeling work, the BBN dataset used was obtained directly from LDC (i.e. completely manually annotated), so it does not have any of these issues. On the other hand, as far as I know, the BBN version used in AFET and other follow-up work is partially annotated using the DBpedia Spotlight tool.

abhipec commented 5 years ago

> If the AFET (baseline) and Attentive (baseline) results reported in your paper use the same dataset, I think it might be unnecessary to run them again; the comparison is already fair.

The AFET baseline code for reporting results on the same train/dev/test split is available at: https://github.com/abhipec/AFET

abhipec / fnet
