umap-project / umap

uMap lets you create maps with OpenStreetMap layers in a minute and embed them in your site.
https://umap-project.org

Pickle issue load_ParametricUMAP() #1919

Closed eafpres closed 1 week ago

eafpres commented 1 week ago

Describe the bug

Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/var/app/current/application.py", line 478, in load_stuff
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     model = load_ParametricUMAP(model_set + '/' + full_name,
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/home/user/mambaforge/envs/tensorml/lib/python3.11/site-packages/umap/parametric>
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     model = pickle.load((open(model_output, "rb")))
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:   File "/home/user/mambaforge/envs/tensorml/lib/python3.11/site-packages/numba/core/seri>
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:     ctor, states = loads(serialized)
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]:                    ^^^^^^^^^^^^^^^^^
Jun 19 17:36:52 409683c0-aaf6-48ad-9b2b-d7874460547c gunicorn[89838]: TypeError: code() argument 13 must be str, not int

To Reproduce

Steps to reproduce the behavior (Ubuntu 20.04, Python 3.9, umap-learn==0.5.3):

1) create an embedding:

  import umap

  # model_vectors, data_path and date_prefix are defined elsewhere in the application
  distance = 'sokalsneath'
  op_mix_ratio = 0.3
  embed_dim = 10
  reducer = umap.ParametricUMAP(random_state = 42,
                                transform_seed = 42,
                                n_neighbors = 15,
                                n_epochs = 500,
                                metric = distance,
                                min_dist = 0.0,
                                set_op_mix_ratio = op_mix_ratio,
                                n_components = embed_dim)
  # Fit the embedding, then save the fitted model to disk
  mapper = reducer.fit(model_vectors)
  mapper.save(data_path + '/' + date_prefix + '/' +
              date_prefix + '_umap_mapper.umap')

2) Attempt to load the model on a different Linux machine using load_ParametricUMAP().

3) The load fails with the traceback shown under "Describe the bug", ending in: TypeError: code() argument 13 must be str, not int
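The steps above list Python 3.9 for the saving environment, while the traceback paths show python3.11 on the loading machine, so comparing the two environments is a reasonable first check. The following is a minimal sketch for doing that; the helper name `environment_report` and the default package list are illustrative assumptions, not part of umap.

```python
import platform
from importlib import metadata


def environment_report(packages=("umap-learn", "numba", "tensorflow")):
    """Return the Python version and the installed version of each package.

    The package list is an assumption for illustration; adjust it to the
    libraries that matter for the saved model.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    # Run this on both the saving and the loading machine and compare.
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running the script on both machines and diffing the output shows whether the interpreter or any of the listed libraries differ between the environment that saved the model and the one that loads it.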

Expected behavior

The model should load; on another machine this worked. I believe it is a subtle pickle issue. I had issues with other pickle files, which were solved by using pickle.dump(object, open(filename, 'wb'), protocol=2). I have not figured out how to get umap to use that protocol.
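The workaround mentioned for the other pickle files can be sketched as follows. This is only an illustration using the standard `pickle` module; the helper names (`dump_with_protocol`, `load_pickle`, `pickle_protocol`) are hypothetical and nothing here is umap's API. Note that the protocol only controls the byte format of the file. It may not help when a pickle contains Python code objects written under a different interpreter version, which is one possible reading of the `code()` error above.

```python
import pickle


def dump_with_protocol(obj, path, protocol=2):
    """Pickle obj to path using an explicit protocol."""
    # The file must be opened in binary write mode ("wb").
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=protocol)


def load_pickle(path):
    """Load a pickled object; the protocol is detected automatically."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def pickle_protocol(path):
    """Return the protocol recorded in a pickle file, or None for protocol 0/1."""
    with open(path, "rb") as fh:
        header = fh.read(2)
    # Pickles written with protocol 2 or later start with the PROTO
    # opcode (0x80) followed by one byte holding the protocol number.
    if len(header) == 2 and header[0] == 0x80:
        return header[1]
    return None
```

Calling `pickle_protocol` on an existing file is a quick way to confirm which protocol it was written with before deciding whether re-saving with a different one is worth trying.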

davidbgk commented 1 week ago

@eafpres I guess you're looking for https://github.com/lmcinnes/umap 😅

eafpres commented 1 week ago

Facepalm