Reproducing LOS result

Hi,

I am trying to reproduce the results for the length-of-stay (LOS) task using MIMIC-III v1.4, but I can only reach around 45% accuracy. I have tried both your FARM training code and the Hugging Face Trainer. The number of instances in my generated data also does not match any of the sizes mentioned in this issue:

https://github.com/bvanaken/clinical-outcome-prediction/issues/11

My train / val / test dataset sizes are 30421 / 4391 / 8797.
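
For reference, here is a minimal sketch of how such split sizes can be counted. It assumes the splits are CSV files in one directory. The file prefix `LOS_WEEKS_adm` and the helper names `count_rows` and `split_sizes` are illustrative assumptions, not names confirmed by the repository. Rows are parsed with the `csv` module instead of counting raw lines, because clinical notes usually contain newlines inside quoted fields.

```python
import csv
from pathlib import Path

# Clinical notes can be long; raise the per-field limit above the default.
csv.field_size_limit(2**31 - 1)


def count_rows(csv_path):
    """Count data rows (header excluded), respecting quoted multi-line fields."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row if present
        return sum(1 for _ in reader)


def split_sizes(data_dir, prefix="LOS_WEEKS_adm"):
    """Return {split: row count}; the file naming scheme is an assumption."""
    return {
        split: count_rows(Path(data_dir) / f"{prefix}_{split}.csv")
        for split in ("train", "val", "test")
    }
```

Calling `split_sizes("path/to/output_dir")` returns a dictionary with one count per split, which can be compared directly against the sizes reported in the linked issue.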

I have already pinned numpy==1.21.0, pandas==1.3.2, and nltk==3.6.2.

What is the correct size of data for the LOS task?

Best, Jun

bvanaken / clinical-outcome-prediction

Reproducing LOS result #17