Closed: captainnova closed this issue 6 months ago
Using ONNX instead of pickle for classifier I/O should help. See the onnx branch, which should load either pickles or ONNX files, but adds onnx as a dependency.
The onnx branch solves the problem and has been incorporated into main.
With Python 3.8 I currently see:
RFC_classfier.pickle was made with Python 2 and, evidently, an older version of sklearn. In late 2019 sklearn made a number of modules private, including sklearn.ensemble.forest (now sklearn.ensemble._forest), in https://github.com/scikit-learn/scikit-learn/issues/9250, despite knowing that it would break pickles (https://github.com/scikit-learn/scikit-learn/issues/12927).
When unpickling tries to import the old module path, brine gets a ModuleNotFoundError and misinterprets it as a Python 3 vs. 2 error. The UnicodeDecodeError is therefore misleading: it comes from assuming that any unpickling error is a Python 3 vs. 2 incompatibility. "Assuming" might be too harsh; it's trying to recover from a bad situation, and sometimes it works.
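For illustration, here is a minimal sketch of how the two failure modes could be told apart. It is not brine's actual code: `loads_with_diagnosis` is a hypothetical helper, and it assumes the only recoverable case is a Python 2 byte string that needs `encoding="latin1"`.

```python
import pickle


def loads_with_diagnosis(data):
    """Unpickle bytes, retrying as a Python 2 pickle only when that is the real problem.

    Hypothetical helper for illustration; not part of brine.
    """
    try:
        return pickle.loads(data)
    except ModuleNotFoundError as err:
        # The pickle names a module that no longer exists (e.g. one that was
        # renamed or made private). Changing the encoding cannot fix this, so
        # report it directly instead of falling through to the py2 retry.
        raise RuntimeError(
            f"pickle refers to a missing module ({err.name!r}): {err}; "
            "it was probably written by an older library version"
        ) from err
    except UnicodeDecodeError:
        # Typical symptom of a Python 2 str holding non-ASCII bytes.
        return pickle.loads(data, encoding="latin1")
```

With this split, a pickle that names a missing module fails with a message about the module, and only genuine encoding failures trigger the Python 2 retry.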
Ideally the classifier weights would be loaded as the data they are, in an inert format like HDF5, instead of a pickle, to avoid these problems.
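A rough sketch of what "loading the weights as data" could look like. To stay dependency-free it uses JSON as a stand-in for HDF5, and a single toy decision tree stored as parallel arrays; the field names (`feature`, `threshold`, `left`, `right`, `label`) and the functions are made up for this example and are not brine's or sklearn's format.

```python
import json
import os
import tempfile


def save_tree(path, tree):
    """Write the tree's arrays as plain JSON: no classes, no import paths."""
    with open(path, "w") as f:
        json.dump(tree, f)


def load_tree(path):
    """Read the arrays back; nothing is imported or executed while loading."""
    with open(path) as f:
        return json.load(f)


def predict(tree, x):
    """Walk from the root (node 0) to a leaf; a left child of -1 marks a leaf."""
    node = 0
    while tree["left"][node] != -1:
        if x[tree["feature"][node]] <= tree["threshold"][node]:
            node = tree["left"][node]
        else:
            node = tree["right"][node]
    return tree["label"][node]


# Toy tree: node 0 splits on feature 0 at 0.5; nodes 1 and 2 are leaves.
TREE = {
    "feature": [0, -1, -1],
    "threshold": [0.5, 0.0, 0.0],
    "left": [1, -1, -1],
    "right": [2, -1, -1],
    "label": [-1, 0, 1],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_tree.json")
    save_tree(path, TREE)
    loaded = load_tree(path)
```

Because the file holds only numbers, loading it cannot break when a library renames a module; the trade-off is that the prediction logic has to live in the loading code rather than in the serialized object.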