unum-cloud / uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
https://unum-cloud.github.io/uform/
Apache License 2.0

Can't load model uform-vl-multilingual-v2 #90

Status: Closed (polm-stability closed 2 months ago)

polm-stability commented 3 months ago

When trying to load the unum-cloud/uform-vl-multilingual-v2 model using the example code from its README, I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/blah/usearch-images/env/lib/python3.12/site-packages/uform/__init__.py", line 187, in get_model
    return get_model_onnx(model_name, device=device, token=token, modalities=modalities)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blah/usearch-images/env/lib/python3.12/site-packages/uform/__init__.py", line 155, in get_model_onnx
    encoder = TextEncoder(modality_paths.get(Modality.TEXT_ENCODER), device=device)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/blah/usearch-images/env/lib/python3.12/site-packages/uform/onnx_encoders.py", line 109, in __init__
    self.text_encoder_session = ort.InferenceSession(
                                ^^^^^^^^^^^^^^^^^^^^^
  File "/blah/usearch-images/env/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 405, in __init__
    raise TypeError(f"Unable to load from type '{type(path_or_bytes)}'")
TypeError: Unable to load from type '<class 'NoneType'>'

Example code, for reference:

import uform

model, processor = uform.get_model('unum-cloud/uform-vl-multilingual-v2')

Has something changed so that this model is no longer supported? I originally ran into this issue with the Streamlit demo, though I notice the demo seems a bit dated.

ashvardanian commented 3 months ago

Hey @polm-stability! Yes, can you please upgrade to the v3 models for embeddings? The weights would be the same, but the files under the new handle are organized in a way compatible with the new codebase. Please let me know if it works 🤗
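The traceback above is consistent with this explanation: `modality_paths.get(Modality.TEXT_ENCODER)` evidently returned `None`, and that value went straight into `ort.InferenceSession`. Below is a minimal sketch of that failure mode and of a guard that would surface a clearer message. It is not uform's actual code: `find_modality_paths`, `require_path`, the enum values, and both file listings are hypothetical and exist only for illustration.

```python
from enum import Enum


class Modality(Enum):
    # The member name TEXT_ENCODER appears in the traceback; the values are illustrative.
    TEXT_ENCODER = "text_encoder"
    IMAGE_ENCODER = "image_encoder"


def find_modality_paths(repo_files):
    """Hypothetical lookup: map each modality to the first file whose name contains it."""
    paths = {}
    for modality in Modality:
        for name in repo_files:
            if modality.value in name:
                paths[modality] = name
                break
    return paths


def require_path(paths, modality):
    """Fail early with a readable message instead of passing None downstream."""
    path = paths.get(modality)  # dict.get returns None when the key is missing
    if path is None:
        raise FileNotFoundError(
            f"No file found for {modality.name}; "
            "the repository layout may predate this loader."
        )
    return path


# Assumed file listings, for illustration only (not the real repository contents).
old_layout = ["model.onnx", "config.json"]
new_layout = ["text_encoder.onnx", "image_encoder.onnx", "config.json"]
```

With the assumed `old_layout`, the lookup finds nothing for `TEXT_ENCODER`, so an unguarded `.get()` yields `None`, which matches the `NoneType` in the reported error. With `new_layout`, the path resolves normally.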

polm-stability commented 2 months ago

Sorry for the delayed follow-up, and thanks for the clarification.

Is the v3 model unum-cloud/uform3-image-text-multilingual-base? It looks that way, but given how different the name is, I wasn't sure whether it was a new version or a separate model.

In either case, I was able to load that model successfully using the README instructions.

If v2 is no longer supported, maybe its README should be updated?

ashvardanian commented 2 months ago

> Is the v3 model unum-cloud/uform3-image-text-multilingual-base?

Yes 🤗
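
For anyone landing here with the same error, the resolution in this thread could be sketched as follows. The two handles come from the discussion above, and the `modalities` keyword is visible in the traceback. The `Modality` import, its member names, and the helpers `resolve_handle` and `load_embedding_model` are assumptions about the v3 release, so check them against the current README before relying on this.

```python
# Handles taken from this thread.
LEGACY_HANDLE = "unum-cloud/uform-vl-multilingual-v2"
V3_HANDLE = "unum-cloud/uform3-image-text-multilingual-base"

# Hypothetical mapping from the old handle to its v3 replacement.
LEGACY_TO_V3 = {LEGACY_HANDLE: V3_HANDLE}


def resolve_handle(name):
    """Return the v3 handle for a known legacy name, or the name unchanged."""
    return LEGACY_TO_V3.get(name, name)


def load_embedding_model(name):
    """Load a model through uform, swapping in the v3 handle when needed.

    The import is done lazily so this module can be imported without uform.
    Assumption: uform exposes `Modality` with TEXT_ENCODER and IMAGE_ENCODER.
    """
    from uform import Modality, get_model

    return get_model(
        resolve_handle(name),
        modalities=[Modality.TEXT_ENCODER, Modality.IMAGE_ENCODER],
    )
```

Calling `load_embedding_model(LEGACY_HANDLE)` would then request the v3 files instead of the v2 ones. The return value is passed through untouched, since its exact shape depends on the installed uform version.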