Reducing the memory footprint of a scikit-learn text classifier - Max Halford
Context: This week at Alan I've been working on parsing French medical prescriptions. There are three types of prescriptions: lenses, glasses, and pharmaceutical prescriptions. The information to extract depends on the prescription type, so the first step is to classify the prescription. The prescriptions we receive are pictures taken by users with their phones. We run each image through an OCR engine to obtain a text transcription of the image.
Awesome, Max!
About the importance of '50' and '25', I may have an idea: since the sight correction needed by the patient is written in steps of 0.25 D, you find a lot of '50' and '25' in these prescriptions 🙂
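The comment's explanation can be illustrated with a small sketch. It assumes the text is tokenized with the same regular expression as scikit-learn's default `token_pattern`, which only keeps runs of two or more word characters; the OCR string below is made up for illustration and does not come from the post.

```python
import re
from collections import Counter

# Assumption: same pattern as scikit-learn's default `token_pattern`,
# which keeps runs of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

# Hypothetical OCR output for a glasses prescription (invented example).
ocr_text = "OD : +1.25 (-0.50) 90°  OG : +2.50 (-0.25) 85°"

# The decimal point splits each dioptre value in two; the single-digit
# integer part is too short to match, so only the fractional part survives.
tokens = TOKEN_PATTERN.findall(ocr_text.lower())
counts = Counter(tokens)

print(counts.most_common())
```

Under this assumption, a value such as `+1.25` contributes only the token `25`, which would make `25` and `50` frequent in lens and glasses prescriptions, as the comment suggests.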
https://maxhalford.github.io/blog/sklearn-text-classifier-memory-footprint-reduction/