sanja7s / MedRed

Data prep procedures #1

Open OleksF opened 2 years ago

OleksF commented 2 years ago

Hi! Thanks for the interesting read, Sanja and team! Including the code is very handy as well.

Still, would it be possible to include the data preprocessing script(s) in the repo, in the interest of reproducibility? It seems the expected inputs here (in particular in create_flair_corpus.py) are a bit different from the original datasets as distributed by their respective sources. It'd be helpful to have the preprocessing files to minimize potential mistakes in the transformations.

Thanks again.

sanja7s commented 2 years ago

Hi, Oleks! Thanks for the feedback. I have added two files under prep that were used for creating the labels. Note that this code is tricky, as it had to deal with messy text, and it has been 3 years since I last looked at it. Hopefully it can still help you.