Open mnishant2 opened 8 months ago
Can you post the actual errors?
There is no error; it just doesn't seem to work. A lot of drugs/reasons go undetected after a certain point. Also, please confirm that `sent_offset += (len(line.strip())+1)`, not `+2`, works for n2c2.
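For anyone hitting the same symptom: a fixed `+1` or `+2` increment is only correct if every line ends with exactly that many newline characters and has no leading or trailing whitespace, so entities can silently stop lining up part-way through a note. Below is a minimal sketch of a way to check offsets without a hard-coded increment. It is not the code from `brat2bio.ipynb`; the function name `line_offsets` is made up for illustration.

```python
def line_offsets(text):
    """Return (start, end, stripped_line) for each non-empty line.

    Offsets index into the original `text`, so they can be compared
    directly with the character spans in a BRAT .ann file.
    Hypothetical helper, not part of ClinicalTransformerNER.
    """
    spans = []
    pos = 0  # offset of the current raw line within `text`
    # keepends=True keeps "\n" or "\r\n" attached, so len(raw) is exact.
    # Note: str.splitlines also splits on a few other separators
    # (e.g. form feed); offsets stay correct because nothing is dropped.
    for raw in text.splitlines(keepends=True):
        stripped = raw.strip()
        if stripped:
            leading = len(raw) - len(raw.lstrip())
            start = pos + leading
            spans.append((start, start + len(stripped), stripped))
        pos += len(raw)
    return spans


if __name__ == "__main__":
    sample = "Aspirin 81 mg\r\n  for pain \n\nEnd"
    for start, end, line in line_offsets(sample):
        # Each span should reproduce the stripped line exactly.
        print(start, end, repr(line), sample[start:end] == line)
```

If the spans produced this way disagree with the running `sent_offset`, the increment is likely the cause of the missed entities.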
Can you try this tutorial: https://github.com/uf-hobi-informatics-lab/ClinicalTransformerNER/blob/master/tutorial/pipeline_preprocessing_model_training_prediction.ipynb? For preprocessing we use https://github.com/uf-hobi-informatics-lab/NLPreprocessing, which we used for all of our previous work.
Also, in the n2c2 2018 dataset, some ADE and Reason annotations overlap. What we did before was to keep three separate copies, one each for Drug, ADE, and Reason.
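One possible reading of the "three copies" approach, sketched below, is to split each BRAT `.ann` file into one set of entity lines per type, so that no single copy contains overlapping spans. This is an assumption about what the copies contain, not the maintainers' actual script, and `split_ann_by_type` is a hypothetical name. The annotation lines in the example are toy data.

```python
def split_ann_by_type(ann_text, types=("Drug", "ADE", "Reason")):
    """Group BRAT entity lines by entity type.

    Only text-bound annotations (IDs starting with "T") are kept;
    relation lines and other annotation kinds are skipped.
    Hypothetical helper, not part of ClinicalTransformerNER.
    """
    copies = {t: [] for t in types}
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # malformed line, ignore
        # Second field looks like "Drug 0 7" or "Drug 0 7;9 12".
        entity_type = fields[1].split(" ", 1)[0]
        if entity_type in copies:
            copies[entity_type].append(line)
    return copies


if __name__ == "__main__":
    toy_ann = (
        "T1\tDrug 0 7\taspirin\n"
        "T2\tReason 12 16\tpain\n"
        "T3\tADE 12 16\tpain\n"
        "R1\tReason-Drug Arg1:T2 Arg2:T1\n"
    )
    for entity_type, lines in split_ann_by_type(toy_ann).items():
        print(entity_type, lines)
```

Each per-type list could then be written to its own `.ann` file next to a copy of the `.txt` file and converted to BIO separately.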
Lastly, I recommend checking our project that can handle overlap: https://github.com/uf-hobi-informatics-lab/ClinicalTransformerMRC
Thanks, that worked. I have questions about the hyperparameter tuning/values needed to reproduce the BERT and RoBERTa general results on all the datasets from Table 2 in the paper. I am unable to reproduce the exact results. It would be really nice if you could point me to those values. I have also emailed the corresponding author.
How far away? If it is within 0.002, then it should be OK. We use random seed = 42, batch size = 4, and learning rate = 1e-5.
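To make the comparison concrete, here is a small sketch that records the three values above and checks a reproduced score against a reported one with the 0.002 tolerance. The names `RunSettings` and `close_enough` are made up for illustration, scores are assumed to be on a 0 to 1 scale, and the numbers in the example are placeholders, not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSettings:
    """Values quoted in this thread; the class itself is hypothetical."""
    seed: int = 42
    batch_size: int = 4
    learning_rate: float = 1e-5


def close_enough(reported_f1, reproduced_f1, tolerance=0.002):
    """True if the two scores differ by no more than `tolerance`."""
    return abs(reported_f1 - reproduced_f1) <= tolerance


if __name__ == "__main__":
    settings = RunSettings()
    print(settings)
    # Placeholder scores, only to show how the check is used.
    print(close_enough(0.900, 0.899))
    print(close_enough(0.900, 0.890))
```

Differences larger than the tolerance may also come from library versions or hardware, which this check cannot distinguish.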
Hello, the brat2bio.ipynb notebook does not work for the n2c2 2018 dataset. Do you know if any changes are needed for it to be useful for n2c2?