Essay questions - Githubissues

zzz-python commented 5 months ago

To our knowledge, the COLING 2022 paper "Automatic ICD Coding Exploiting Discourse Structure and Reconciled Code Embeddings" was the first to propose exploiting the structure of clinical text. How do you deal with that? In addition, can KEPT not be run on a 3090?

LuChang-CS commented 5 months ago

Hi, thank you for pointing this out. For baselines, we mainly follow the MSMN paper. DiscNet, from the paper you mentioned, did use structural information. However, as stated in their paper, they used regular expressions to locate the headings. Their released code relies on a pre-defined set of 100 headings.
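To make the contrast concrete, here is a minimal sketch of what regex-based heading location against a fixed heading set could look like. It is not DiscNet's actual code: the `HEADINGS` list, the `split_by_headings` function, and the assumption that a heading starts a line and ends with a colon are all hypothetical choices made for illustration.

```python
import re

# Hypothetical heading set for illustration only; the set described
# above contains 100 headings and is not reproduced here.
HEADINGS = [
    "chief complaint",
    "history of present illness",
    "past medical history",
    "discharge diagnosis",
    "discharge medications",
]

# Assumption: a heading starts a line and is followed by a colon.
# Longer headings are tried first so they win over any shorter prefix.
_HEADING_RE = re.compile(
    r"^[ \t]*("
    + "|".join(re.escape(h) for h in sorted(HEADINGS, key=len, reverse=True))
    + r")[ \t]*:",
    re.IGNORECASE | re.MULTILINE,
)


def split_by_headings(note):
    """Split a note into (heading, text) pairs.

    Text that appears before the first recognised heading is returned
    with heading None.
    """
    sections = []
    current = None  # heading of the section being collected
    start = 0       # offset where the current section's text begins
    for match in _HEADING_RE.finditer(note):
        body = note[start:match.start()].strip()
        if body or current is not None:
            sections.append((current, body))
        current = match.group(1).lower()
        start = match.end()
    sections.append((current, note[start:].strip()))
    return sections
```

Any heading that is missing from the fixed list is silently merged into the preceding section, which is the kind of manual maintenance cost that a pre-defined heading set carries.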

In our paper, the difference is that we extract headings automatically, with no human effort spent on parsing. As we mention in the paper, our work is one of the first to investigate automatic semi-structured segmentation for clinical notes. Additionally, for flexibility and generalizability, we incorporate the structural information during pre-training instead of adding an extra embedding. I believe these are the main differences. But you are right, it is a good baseline.

For KEPT, we did run into problems with high GPU memory consumption and long running times.

I hope these can answer your questions.

zzz-python commented 5 months ago

Oh, thank you very much. I hope your work gets better and better.

LuChang-CS / semi-structured-icd-coding

Essay questions #1