mittagessen / kraken

OCR engine for all the languages
http://kraken.re
Apache License 2.0

Suppress warnings from coremltools? #536

Closed: stweil closed this issue 1 year ago

stweil commented 1 year ago

In most cases each run of kraken produces two warnings from coremltools. Example:

kraken --alto -o .xml --batch-input "max/*.jpg" segment --model ubma_segmentation.mlmodel --baseline ocr --model desbillons.mlmodel
scikit-learn version 1.2.2 is not supported. Minimum required version: 0.17. Maximum required version: 1.1.2. Disabling scikit-learn conversion API.
Torch version 2.0.1+cu117 has not been tested with coremltools. You may run into unexpected errors. Torch 2.0.0 is the most recent version that has been tested.
[...]

See also the list of other issues with the same warnings.

Are these warnings intended, or should they be suppressed?

It looks like commit cda8752b tried to handle this issue, but coremltools is first imported in kraken/lib/layers.py, which does not have similar code.

stweil commented 1 year ago

@mittagessen, I tried kraken with a modified kraken/lib/layers.py, similar to your commit for kraken/lib/vgsl.py. That did not suppress the two warnings. I was successful using this patch:

diff --git a/kraken/lib/layers.py b/kraken/lib/layers.py
index 82961eba..d153bf35 100644
--- a/kraken/lib/layers.py
+++ b/kraken/lib/layers.py
@@ -2,13 +2,18 @@
 Layers for VGSL models
 """
 import torch
+import logging
 import numpy as np

 from typing import List, Tuple, Optional, Iterable
 from torch.nn import Module, Sequential
 from torch.nn import functional as F
 from torch.nn.utils.rnn import pad_packed_sequence, pack_padded_sequence
+# filter out coreml warnings coming from their conversion routines (which we don't use).
+logger = logging.getLogger('coremltools')
+logger.setLevel(logging.ERROR)
 from coremltools.proto import NeuralNetwork_pb2
+logger.setLevel(logging.WARNING)

 # all tensors are ordered NCHW, the "feature" dimension is C, so the output of
 # an LSTM will be put into C same as the filters of a CNN.
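
The patch above raises the level of the 'coremltools' logger around the import and then sets it to WARNING. A minimal sketch of the same idea as a context manager follows, assuming (as the patch suggests) that the messages are emitted through the standard logging module under that logger name. The helper name quiet_logger is hypothetical and not part of kraken or coremltools. Unlike the patch, it restores whatever level the logger had before instead of hard-coding WARNING.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name, level=logging.ERROR):
    """Temporarily raise the level of the named logger (hypothetical helper)."""
    logger = logging.getLogger(name)
    previous = logger.level  # remember the level that was set before
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # restore the original level even if the wrapped import fails
        logger.setLevel(previous)


# Possible usage, mirroring the patch (requires coremltools to be installed):
# with quiet_logger("coremltools"):
#     from coremltools.proto import NeuralNetwork_pb2
```

This still hides every message below ERROR from that logger while the block is active, so it only narrows the time window, not the set of suppressed messages.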

mittagessen commented 1 year ago

They should be suppressed, but I haven't found a good way to filter them out without also filtering out all other errors/warnings from the respective modules. I'm a bit loath to utilise the shotgun approach, although considering the warnings are scary and confuse users, it might just be the way to go.
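
One possible middle ground between no filtering and the shotgun approach is a logging.Filter that rejects only the two messages quoted in this issue and lets everything else through. The sketch below assumes the messages arrive as records on the 'coremltools' logger itself, which the patch above suggests but which has not been verified against every coremltools version. The names DropDependencyWarnings and install_filter are hypothetical, and the patterns are taken from the example output in the first comment.

```python
import logging
import re

# Patterns for the two version-check messages quoted in this issue.
_NOISY = re.compile(
    r"scikit-learn version .* is not supported"
    r"|Torch version .* has not been tested with coremltools"
)


class DropDependencyWarnings(logging.Filter):
    """Reject only the known version-check messages; pass everything else."""

    def filter(self, record):
        # getMessage() applies %-style arguments before matching
        return not _NOISY.search(record.getMessage())


def install_filter(logger_name="coremltools"):
    """Attach the filter to the named logger (hypothetical helper)."""
    flt = DropDependencyWarnings()
    logging.getLogger(logger_name).addFilter(flt)
    return flt
```

The filter would have to be installed before the first import of coremltools. A filter attached to a logger only sees records logged on that logger, not on its child loggers, so messages emitted under a name such as 'coremltools.something' would need the filter on a handler instead. The patterns also break silently if the wording of the messages changes upstream.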