Closed: drzraf closed this issue 1 year ago
Just claiming that memory use is excessive based on the size of the encoded JPEG and the model weights is odd, to say the least. You're not going to get the segmenter to work on a GPU with much less than 10GB of memory, no matter what kind of optimization (or framework) you're running. LSTMs use memory; that's just how it is.
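For what it's worth, peak allocation can be measured directly rather than inferred from file sizes. A minimal sketch with a stand-in LSTM and input (not kraken's actual architecture or shapes) shows that activation memory, not the on-disk size of the weights or the JPEG, dominates the footprint:

```python
import torch

# Stand-in model and input -- not kraken's architecture, just an LSTM to show
# that activation memory scales with sequence length, independently of the
# on-disk size of the weights or the input image.
model = torch.nn.LSTM(input_size=256, hidden_size=512, num_layers=2, batch_first=True).cuda()
x = torch.randn(1, 4000, 256, device="cuda")  # one long line-sized sequence

torch.cuda.reset_peak_memory_stats()
with torch.inference_mode():
    out, _ = model(x)
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")
print(f"peak reserved:  {torch.cuda.max_memory_reserved() / 2**20:.1f} MiB")
```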
Luckily, ~50% of the segmentation time is actually spent in post-processing, so the speedup provided by GPUs is rather limited in any case. The same is true for recognition inference. You might just be able to fit recognition training on your small GPU, but it will be tight.
10GB is unrealistic for most humans. At the very least, this limitation/prerequisite should be clearly mentioned in the README/docs. Let's hope PyTorch 2.0 will bring this figure down. On a common laptop, recognition takes up to 40 seconds per image, which is workable for a dozen images at most. Put differently, batch OCR-ing thousands of images is still out of the question.
I'd expect that using a GPU (even an older NVIDIA one, like a 2GB GM107) would in all cases be superior to plain CPU usage. Sadly, it seems that the current implementation doesn't support delegating jobs to the GPU when less than 2GB of memory is available.
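For illustration (plain PyTorch, not kraken's API), checking how much memory the card actually has free before dispatching a job would at least make that decision explicit. The 2 GiB threshold below is an arbitrary placeholder:

```python
import torch

def pick_device(min_free_bytes: int = 2 * 2**30) -> torch.device:
    """Use the GPU only when it reports enough free memory, otherwise fall back to CPU.

    The 2 GiB threshold is an arbitrary placeholder, not a value taken from kraken.
    """
    if torch.cuda.is_available():
        free, _total = torch.cuda.mem_get_info()  # bytes free on the current device
        if free >= min_free_bytes:
            return torch.device("cuda")
    return torch.device("cpu")

print(pick_device())
```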
I tried multiple settings for the (under-documented) `PYTORCH_CUDA_ALLOC_CONF` (like `max_split_size_mb:64,roundup_bypass_threshold_mb:64`), but even the simplest run always leads to an `OutOfMemoryError`, with memory use increasing following the same pattern each time.
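As far as I can tell, that variable is only read when the CUDA caching allocator initialises, so it has to be set before the first CUDA allocation, and it only tunes fragmentation behaviour rather than shrinking the activations themselves. A minimal sketch of how it has to be applied:

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is parsed when the CUDA caching allocator is
# initialised, so it must be set before the first CUDA allocation --
# easiest is before importing anything that touches CUDA.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:64"

import torch  # noqa: E402

_ = torch.zeros(1, device="cuda")  # allocator initialised here, with the config above
print(torch.cuda.memory_summary())
```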
This seems excessive (and prohibitive) for segmenting/OCR-ing a 1.7MB JPEG with a 16MB model. In the PyTorch integration, isn't there room for tweaking the job/batch/memory size so that common GPUs could be used?
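To make the question concrete, here is the kind of knob I have in mind, as a rough plain-PyTorch sketch (the `model`/`samples` names are placeholders, not kraken's API): process inputs in small batches and fall back to the CPU when an allocation fails.

```python
import torch

def run_batched(model, samples, device="cuda", batch_size=4):
    """Placeholder helper, not kraken's API: small-batch inference with CPU fallback."""
    model = model.to(device).eval()
    results = []
    with torch.inference_mode():
        for i in range(0, len(samples), batch_size):
            batch = torch.stack(samples[i:i + batch_size]).to(device)
            try:
                results.append(model(batch).cpu())
            except torch.cuda.OutOfMemoryError:  # plain RuntimeError on older PyTorch
                torch.cuda.empty_cache()
                results.append(model.cpu()(batch.cpu()))
                model = model.to(device)  # give the GPU another chance on the next batch
    return torch.cat(results)
```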