Closed. Zephyr1022 closed this issue 1 year ago.
We enumerate all candidate spans whose lengths are no greater than --max_mention_ori_length. You can adjust it to fit the mention lengths in your data.
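A minimal sketch of what that enumeration means (this is an illustration, not PL-Marker's actual code): every contiguous token span of length at most `max_mention_ori_length` becomes a candidate mention.

```python
# Illustrative sketch: enumerate all candidate spans of length
# <= max_mention_ori_length over a tokenized sentence.
def enumerate_spans(tokens, max_mention_ori_length=8):
    spans = []
    for start in range(len(tokens)):
        # end is inclusive; cap the span length at max_mention_ori_length
        for end in range(start, min(start + max_mention_ori_length, len(tokens))):
            spans.append((start, end))
    return spans

tokens = ["Lives", "with", "his", "parents"]
print(len(enumerate_spans(tokens, max_mention_ori_length=2)))  # 7 spans
```

So the flag caps candidate-span length, not dataset size: if your gold mentions can be longer than 8 tokens, raise it; if they are all short, a smaller value shrinks the candidate set.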
Thank you so much for your reply. I have one more question: I want to extract overlapping NER. Do you have any suggestions on which model would be better to use, run_acener.py or run_ner.py?
run_acener.py is used for overlapping NER; run_ner.py is for non-overlapping NER.
Thanks a lot. I used run_acener.py to train on my clinical NER data, but the results are not very good: I got high recall and very low precision. I was curious whether any of the default hyperparameters could affect the result?
"dev_bestf1": 0.0746615905245347,
"f1": 0.07554758410783194,
"f1overlap": 0.017645128284329813,
"precision": 0.04264850270004909,
"recall": 0.3304802662862577
Can you reproduce our result on the SciERC dataset on your machine?
Yep, the SciERC dataset works well on my server. But when I apply the same code to my clinical NER data, the F1 is always below 0.3. I tried different learning rates and models, and increased the epochs to 50; the performance is very stable, around 0.3. Here is a sample of my data. I was curious whether I did anything wrong when preprocessing the JSON data:
{"doc_key": "./mimic/03.txt", "sentences": [["SOCIAL", "HISTORY", ":", "Lives", "with", "his", "caring", "and", "devoted", "parents", "at", "home", "."], ["Enjoys", "movies", "and", "computers", "."], ["No", "history", "of", "alcohol", ",", "tobacco", "or", "drug", "use", "."]], "ner": [[[3, 3, "StatusTime"], [4, 9, "TypeLiving"]], [], [[18, 19, "StatusTime"]]], "relations": [[[3, 3, 3, 3, "LivingStatus-Status"], [3, 3, 4, 9, "LivingStatus-Type"]], [], [[23, 23, 18, 19, "Tobacco-Status"], [25, 26, 18, 19, "Drug-Status"], [21, 21, 18, 19, "Alcohol-Status"]]]}
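For data in this format, a common preprocessing bug is using sentence-local token offsets where document-level offsets are expected. A hedged sanity check, assuming (as the sample suggests) that each `ner` span `[start, end, label]` uses document-level, end-inclusive token indices and must stay inside its own sentence:

```python
import json

# Sanity-check one JSON line in the format shown above (assumed, based on
# the sample): ner spans are [start, end, label] with document-level,
# end-inclusive token offsets, grouped per sentence.
def check_doc(doc):
    bounds, pos = [], 0
    for sent in doc["sentences"]:
        bounds.append((pos, pos + len(sent) - 1))  # (first, last) offset of the sentence
        pos += len(sent)
    flat = [tok for sent in doc["sentences"] for tok in sent]
    for (lo, hi), spans in zip(bounds, doc["ner"]):
        for start, end, label in spans:
            assert lo <= start <= end <= hi, f"span {start}-{end} outside sentence {lo}-{hi}"
            print(label, "->", " ".join(flat[start:end + 1]))

line = '{"doc_key": "d", "sentences": [["No", "history", "of", "alcohol"]], "ner": [[[0, 1, "StatusTime"]]]}'
check_doc(json.loads(line))  # prints: StatusTime -> No history
```

Printing the recovered surface strings this way makes it obvious whether the indices line up with the intended mentions.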
Did you modify the number of labels? https://github.com/thunlp/PL-Marker/blob/07fde08d868134ced1d861d17d263d6c782bb420/run_acener.py#L939-L946
Are the entities in this example the following: "Lives": StatusTime?
Yeah, I changed the labels and num_labels in the code.
The entities are as follows:
StatusTime: Lives
StatusTime: No history
TypeLiving: with his caring and devoted parents
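A hypothetical sketch of the kind of label change being discussed (this is not the repository's exact code; the "clinical" entry and its helper are assumptions, with label names taken from the JSON sample): the dataset's label list determines num_labels, which must match the data, with one extra class reserved for non-entity spans.

```python
# Hypothetical sketch of adapting a per-dataset NER label list; the
# "clinical" entry is an assumed custom addition, not repository code.
task_ner_labels = {
    "scierc": ["Method", "Task", "Metric", "Material", "OtherScientificTerm", "Generic"],
    "clinical": ["StatusTime", "TypeLiving"],  # assumed, from the JSON sample
}

def get_labelmap(labels):
    # id 0 is reserved for "not an entity"
    label2id = {lab: i + 1 for i, lab in enumerate(labels)}
    id2label = {i: lab for lab, i in label2id.items()}
    return label2id, id2label

label2id, id2label = get_labelmap(task_ner_labels["clinical"])
num_labels = len(task_ner_labels["clinical"]) + 1  # +1 for the non-entity class
```

If num_labels and the label map are out of sync with the JSON annotations, the classifier head trains against meaningless targets, which would be consistent with near-random precision.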
Sorry, I have no idea how to solve your problem.
Thank you for your time and effort in helping me look into it. Something strange may be happening in the code on my server.
Hello, I was curious: in the Quick Start section, what does "--max_mention_ori_length: 8" mean? If I run on a different dataset, should I change it based on my data? Thanks.