jonaswinkler / paperless-ng

A supercharged version of paperless: scan, index and archive all your physical documents
https://paperless-ng.readthedocs.io/en/latest/
GNU General Public License v3.0

"sklearn/base.py:315: UserWarning" in Docker-Log 0.9.14 #346

Closed andbez closed 3 years ago

andbez commented 3 years ago

The following warning occurs for every single consumed document:

/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)

I haven't noticed any problems with paperless because of it. Just letting you know.

jonaswinkler commented 3 years ago

This warning goes away on its own once paperless updates the classification model. That update runs every hour, but the model is only rebuilt if your documents have changed or if the classification_model.pickle file has been deleted.
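
To illustrate why rebuilding the model silences the warning: scikit-learn estimators record the library version in their pickled state and compare it against the installed version when loaded. The toy class below mimics that idea using only the standard library. It is a simplified sketch, not scikit-learn's code; `VersionedEstimator`, `CURRENT_VERSION`, and `_saved_version` are hypothetical names chosen for this example.

```python
import warnings

# Assumption: stands in for the version of the installed library.
CURRENT_VERSION = "0.24.0"


class VersionedEstimator:
    """Toy stand-in for an estimator that stamps its pickled state with a version."""

    def __init__(self, vocabulary=None):
        self.vocabulary = vocabulary or {}

    def __getstate__(self):
        # Save the attributes together with the version that produced them.
        return dict(self.__dict__, _saved_version=CURRENT_VERSION)

    def __setstate__(self, state):
        # Compare the stamped version with the running one and warn on mismatch.
        state = dict(state)
        saved = state.pop("_saved_version", "unknown")
        if saved != CURRENT_VERSION:
            warnings.warn(
                f"Trying to unpickle estimator {type(self).__name__} "
                f"from version {saved} when using version {CURRENT_VERSION}.",
                UserWarning,
            )
        self.__dict__.update(state)
```

State saved under an older version triggers the warning on every load. State saved by the running version carries the current stamp, so loading it is silent, which is why the message disappears once the model file has been rewritten.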

andbez commented 3 years ago

👍 Thank you.

rknightion commented 3 years ago

This happens every time I run "document_create_classifier". I thought that process updated the classification model?

❯ docker-compose run --rm webserver document_create_classifier
Creating paperless_webserver_run ... done
Paperless-ng docker container starting...
Waiting for PostgreSQL to start...
Apply database migrations...
Operations to perform:
  Apply all migrations: admin, auth, authtoken, contenttypes, django_q, documents, paperless_mail, sessions
Running migrations:
  No migrations to apply.
Executing management command document_create_classifier
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator LabelBinarizer from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator MLPClassifier from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
❯ docker-compose run --rm webserver document_create_classifier
Creating paperless_webserver_run ... done
Paperless-ng docker container starting...
Waiting for PostgreSQL to start...
Apply database migrations...
Operations to perform:
  Apply all migrations: admin, auth, authtoken, contenttypes, django_q, documents, paperless_mail, sessions
Running migrations:
  No migrations to apply.
Executing management command document_create_classifier
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator LabelBinarizer from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/usr/local/lib/python3.7/site-packages/sklearn/base.py:315: UserWarning: Trying to unpickle estimator MLPClassifier from version 0.23.2 when using version 0.24.0. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)

jonaswinkler commented 3 years ago

The model is only updated when the data associated with Auto-matching metadata changes (document content changed, or documents with Auto-matching metadata added or removed).

My tests show no abnormal behavior, so the warning can be safely ignored.
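
One way a "retrain only when the data changed" check can work is to fingerprint the training data and compare it with the fingerprint stored alongside the last model. The sketch below shows that general technique and is not taken from the paperless source; `training_data_hash`, `needs_retraining`, and the `(content, label)` tuple layout are assumptions made for illustration.

```python
import hashlib


def training_data_hash(documents):
    """Return a SHA-256 digest over (content, label) pairs in a stable order."""
    digest = hashlib.sha256()
    for content, label in sorted(documents):
        digest.update(content.encode("utf-8"))
        digest.update(b"\x00")  # separator between content and label
        digest.update(label.encode("utf-8"))
        digest.update(b"\x01")  # separator between documents
    return digest.hexdigest()


def needs_retraining(documents, stored_hash):
    """True when no hash is stored yet or the training data has changed."""
    return stored_hash is None or training_data_hash(documents) != stored_hash
```

With a check like this, running the training command twice on unchanged documents skips the rebuild both times, so an old model file stays on disk and keeps producing the version warning until the data changes or the file is removed.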

jonaswinkler commented 3 years ago

Added some notes to the troubleshooting section.