Reducing the memory footprint of a scikit-learn text classifier - Max Halford

Context: This week at Alan, I've been working on parsing French medical prescriptions. There are three types of prescriptions: lenses, glasses, and pharmaceutical. The information to extract depends on the prescription type, so the first step is to classify the prescription. The prescriptions we receive are pictures taken by users with their phones. We run each image through OCR to obtain a text transcription.
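
As a rough illustration of the classification step, here is a minimal sketch of a scikit-learn text classifier that assigns an OCR transcription to one of the three prescription types. It is not the model from the post: the TF-IDF features, the logistic regression, the label names, and the sample transcriptions are all assumptions made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical OCR transcriptions and labels, invented for illustration only.
texts = [
    "lentilles de contact souples rayon 8.6 diamètre 14.2",
    "lentilles journalières œil droit œil gauche rayon diamètre",
    "verres correcteurs monture sphère cylindre axe addition",
    "lunettes verres progressifs sphère cylindre axe",
    "paracétamol 1 g comprimé trois fois par jour pendant 5 jours",
    "amoxicilline 500 mg gélule matin et soir pendant 7 jours",
]
labels = ["lenses", "lenses", "glasses", "glasses", "pharmacy", "pharmacy"]

# Assumed model choice: TF-IDF word features fed to a logistic regression.
# strip_accents makes the features robust to OCR dropping French accents.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, strip_accents="unicode"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Classify a new (also hypothetical) transcription.
prediction = model.predict(["ibuprofène 400 mg comprimé matin et soir"])[0]
print(prediction)
```

With real data, the vocabulary learned by the vectorizer is typically what dominates the size of the fitted pipeline, which is why the choice of text features matters for the memory footprint.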

https://maxhalford.github.io/blog/sklearn-text-classifier-memory-footprint-reduction/

MaxHalford / maxhalford.github.io

blog/sklearn-text-classifier-memory-footprint-reduction/ #13
