uf-hobi-informatics-lab / ClinicalTransformerNER

A library for named entity recognition developed by the UF HOBI NLP lab, featuring SOTA algorithms
MIT License

Performance of XLNet and Longformer? #17

Open drussellmrichie opened 3 years ago

drussellmrichie commented 3 years ago

Thanks again for providing this repository and actively maintaining it. Do you have performance results for XLNet and Longformer on the 2010 i2b2, 2012 i2b2, and/or 2018 n2c2 test sets that are readily available and shareable?

bugface commented 3 years ago

We have not had a chance to test Longformer on 2010 i2b2, but its performance should be close to RoBERTa's, since Longformer is based on RoBERTa.

We ran XLNet on 2010 i2b2, but its performance was relatively low (worse than the LSTM-CRF baseline). We are debugging to determine whether this is related to our implementation.

You can access the datasets via https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp. Due to the data use agreement (DUA), we cannot host the data in this repo.

drussellmrichie commented 3 years ago

Thanks for the update. I look forward to hearing the results of the XLNet debugging.