ShannonAI / dice_loss_for_NLP

The repo contains the code of the ACL2020 paper `Dice Loss for Data-imbalanced NLP Tasks`
Apache License 2.0

Link for Preprocessed OntoNotes 5.0 #10

Open Dimiftb opened 3 years ago

Dimiftb commented 3 years ago

Hi,

Thank you very much for your paper and model. I've been trying to replicate your best experimental results on OntoNotes 5.0, but I cannot find the dataset at the link you provided. Could you please share a working link? Thanks.

xiaoya-li commented 3 years ago

Hi, thanks for asking. Please use this link:

https://drive.google.com/file/d/1qxvccKqkpDJkRJU0OF-i8wNEiiWY5Tm1/view?usp=sharing

I will update the link to mrc-for-flat-nested-ner.

DevBey commented 2 years ago

Hi @xiaoya-li,

Thanks for the link, but this data does not seem to be preprocessed. Running dice_loss_for_NLP/scripts/ner_enontonotes5/bert_dice.sh throws an error because the JSON cannot be loaded; it is not in the expected format. Can we use dice_loss_for_NLP/tasks/mrc_ner/generate_mrc_dataset.py to convert it to the correct format?

Or is there another way of doing this?
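For anyone hitting the same loading error, a quick way to narrow it down is to check what the downloaded file actually contains before running the training script. The sketch below is only a generic diagnostic, not part of the repo: `describe_json_file` is a hypothetical helper name, and it assumes nothing about the schema the script expects. It just reports whether a file parses as a single JSON document, as JSON Lines (one object per line), or not at all, and lists the keys of the first record.

```python
import json
from pathlib import Path


def describe_json_file(path):
    """Report how a file parses: single JSON document, JSON Lines, or invalid.

    Hypothetical helper for diagnosis only; it does not know the schema
    that the training script expects.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # Not a single JSON document; try one JSON value per non-empty line.
        try:
            rows = [json.loads(line) for line in text.splitlines() if line.strip()]
        except json.JSONDecodeError:
            return {"format": "invalid", "error": str(err)}
        if not rows:
            return {"format": "invalid", "error": str(err)}
        return {"format": "jsonl", "records": len(rows), "keys": _first_keys(rows)}
    if isinstance(data, list):
        return {"format": "json-list", "records": len(data), "keys": _first_keys(data)}
    return {
        "format": "json-" + type(data).__name__,
        "records": 1,
        "keys": _first_keys([data]),
    }


def _first_keys(rows):
    """Return the sorted keys of the first record if it is a dict, else []."""
    first = rows[0] if rows else None
    return sorted(first) if isinstance(first, dict) else []
```

Calling `describe_json_file` on each downloaded split and comparing the reported format and keys with a file that the script does load would show whether a conversion step is needed.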