stanfordmlgroup / chexpert-labeler

CheXpert NLP tool to extract observations from radiology reports.
MIT License

Out of memory #37

Open catfish132 opened 2 years ago

catfish132 commented 2 years ago

My CSV is so large that the labeler runs out of memory. Is there a way to fix this?

jirvin16 commented 2 years ago

Maybe try reading the CSV in chunks, running the labeler on each chunk, then writing the output iteratively.
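
The suggestion above could be sketched roughly as follows. This is a minimal illustration rather than code from the labeler itself: it assumes the input is a one-column CSV with one report per row, and `label_chunk` is a hypothetical callable (not part of chexpert-labeler) that takes a list of report strings and returns one output row per report.

```python
import csv
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence


def read_chunks(path: str, chunk_size: int) -> Iterator[List[str]]:
    """Yield lists of at most `chunk_size` reports from a one-column CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        while True:
            # islice pulls only the next `chunk_size` rows into memory.
            rows = list(islice(reader, chunk_size))
            if not rows:
                break
            # Blank lines come back as empty rows; skip them.
            yield [row[0] for row in rows if row]


def label_in_chunks(
    input_path: str,
    output_path: str,
    label_chunk: Callable[[List[str]], Iterable[Sequence[object]]],
    chunk_size: int = 10_000,
) -> int:
    """Label the input chunk by chunk, writing each chunk's rows as they arrive.

    `label_chunk` is a hypothetical stand-in for however the labeler is invoked.
    Returns the number of reports processed.
    """
    total = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for reports in read_chunks(input_path, chunk_size):
            writer.writerows(label_chunk(reports))
            total += len(reports)
    return total
```

Only one chunk of reports and its labels are held in memory at a time, so peak usage depends on `chunk_size` rather than on the size of the whole file. One way to implement `label_chunk` would be to write the chunk to a temporary file and run `label.py` on it; if the labeler's output starts with a header row, keep that row from the first chunk only.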

BardiaKh commented 1 year ago

I ran into the same issue while processing 1.5M reports. I wrote a set of scripts that split a large file into chunks and process the chunks in parallel automatically. They might be useful to others who find this thread, so they don't have to reinvent the wheel:

https://github.com/BardiaKh/ParallelCheXpertLabeler
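
For readers who only want the general idea, here is a rough sketch of the split-then-run-in-parallel pattern. It is not the code from the repository linked above, and every name in it is an assumption made for illustration: `build_cmd` is a hypothetical caller-supplied function that returns the command line for labeling one chunk file (for example, an invocation of `label.py`), and the chunk file naming is arbitrary.

```python
import csv
import subprocess
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from pathlib import Path
from typing import Callable, List, Sequence


def split_csv(input_path: str, work_dir: str, chunk_size: int) -> List[Path]:
    """Write consecutive groups of `chunk_size` rows to numbered chunk files."""
    out_dir = Path(work_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths: List[Path] = []
    with open(input_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        while True:
            rows = list(islice(reader, chunk_size))
            if not rows:
                break
            path = out_dir / f"chunk_{len(chunk_paths):05d}.csv"
            with open(path, "w", newline="", encoding="utf-8") as out:
                csv.writer(out).writerows(rows)
            chunk_paths.append(path)
    return chunk_paths


def run_parallel(
    chunk_paths: Sequence[Path],
    build_cmd: Callable[[Path, Path], List[str]],
    max_workers: int = 4,
) -> List[Path]:
    """Run one external process per chunk; return output paths in chunk order."""

    def run_one(chunk_path: Path) -> Path:
        out_path = chunk_path.with_name(chunk_path.stem + "_labeled.csv")
        # `build_cmd` is hypothetical: it maps (input, output) to an argv list.
        subprocess.run(build_cmd(chunk_path, out_path), check=True)
        return out_path

    # Threads are enough here because the heavy work happens in child processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, chunk_paths))
```

Each worker starts its own labeler process, so peak memory grows with roughly `max_workers` times the footprint of one chunk; both values would need tuning for the machine at hand. The per-chunk outputs can then be concatenated in the order returned.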