3mcloud / MDACE

The project was to build and release the first publicly available code evidence dataset called MDACE on a subset of the MIMIC-III clinical records. We believe that the release of MDACE will greatly improve the understanding and application of deep learning technologies for medical coding, and benefit machine learning models for classification.
MIT License
23 stars 2 forks

Some spans are outside the text #7

Closed by JoakimEdin 10 months ago

JoakimEdin commented 10 months ago

Hi again! I've noticed that some spans are outside the text in the discharge summary. For instance, there is a discharge summary with 13 367 characters, but the evidence spans are between 24 740 and 24 787. Have you concatenated the addendums to the discharge summaries or something?

JoakimEdin commented 10 months ago

I truncate the text, which is why some of the evidence spans fell outside the note.
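For anyone hitting the same thing, here is a minimal sketch of how out-of-range spans can be detected after truncation. It assumes spans are `(begin, end)` character offsets with an exclusive end. The helper name `split_spans_by_bounds` and the 30 000-character note are hypothetical and not part of MDACE; only the 13 367 length and the 24 740 to 24 787 span come from the report above.

```python
from typing import List, Tuple

# Assumption: a span is (begin, end) in character offsets, end exclusive.
Span = Tuple[int, int]


def split_spans_by_bounds(text: str, spans: List[Span]) -> Tuple[List[Span], List[Span]]:
    """Separate spans that fit inside `text` from spans that do not."""
    inside: List[Span] = []
    outside: List[Span] = []
    for begin, end in spans:
        if 0 <= begin < end <= len(text):
            inside.append((begin, end))
        else:
            outside.append((begin, end))
    return inside, outside


# Hypothetical full note; only its length matters for this illustration.
full_note = "x" * 30_000
# Truncating to 13 367 characters, as in the case reported above.
truncated_note = full_note[:13_367]

spans = [(100, 120), (24_740, 24_787)]
inside, outside = split_spans_by_bounds(truncated_note, spans)
```

Running the same check against the untruncated note puts both spans in `inside`, which is a quick way to tell a truncation problem apart from an annotation problem.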