embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0

[mieb] TIGER-Lab/VLM2Vec-LoRA fails #1377

Open · Muennighoff opened this issue 4 days ago

Muennighoff commented 4 days ago
INFO:mteb.cli:Running with parameters: Namespace(model='TIGER-Lab/VLM2Vec-LoRA', task_types=None, categories=None, tasks=['BLINKIT2IRetrieval'], languages=None, benchmarks=None, device=None, output_folder='/data/niklas/mieb/results-mieb-final', verbosity=2, co2_tracker=True, eval_splits=None, model_revision=None, batch_size=4, overwrite=False, save_predictions=False, func=<function run at 0x7feee99e23b0>)
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:04<00:04,  4.68s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00,  3.42s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00,  3.61s/it]
/env/lib/conda/gritkto4/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py:520: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
INFO:mteb.evaluation.MTEB:

## Evaluating 1 tasks:
─────────────────────────────── Selected tasks  ────────────────────────────────
Any2AnyRetrieval
    - BLINKIT2IRetrieval, it2i

INFO:mteb.evaluation.MTEB:

********************** Evaluating BLINKIT2IRetrieval **********************
INFO:mteb.evaluation.MTEB:Loading dataset for BLINKIT2IRetrieval
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Loading Corpus...
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Loaded 804 TEST Documents.
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Doc Example: {'id': 'val_Art_Style_1_2', 'modality': 'image', 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1382x1718 at 0x7FF21EA29540>}
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Loading Queries...

Map:   0%|          | 0/402 [00:00<?, ? examples/s]
Map: 100%|██████████| 402/402 [00:00<00:00, 43397.17 examples/s]
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Loaded 402 TEST Queries.
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Query Example: {'id': 'val_Art_Style_1_1', 'modality': 'image,text', 'text': 'Which image shares the same style as the reference image?', 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1382x2057 at 0x7FF21E967A60>}
INFO:mteb.abstasks.Image.AbsTaskAny2AnyRetrieval:Subset: default
INFO:mteb.evaluation.evaluators.Image.Any2AnyRetrievalEvaluator:Encoding Queries.

  0%|          | 0/101 [00:00<?, ?it/s]
ERROR:mteb.evaluation.MTEB:Error while evaluating BLINKIT2IRetrieval: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float
Traceback (most recent call last):
  File "/env/lib/conda/gritkto4/bin/mteb", line 8, in <module>
    sys.exit(main())
  File "/data/niklas/mieb/mteb/mteb/cli.py", line 387, in main
    args.func(args)
  File "/data/niklas/mieb/mteb/mteb/cli.py", line 145, in run
    eval.run(
  File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 464, in run
    raise e
  File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 412, in run
    results, tick, tock = self._run_eval(
  File "/data/niklas/mieb/mteb/mteb/evaluation/MTEB.py", line 300, in _run_eval
    results = task.evaluate(
  File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 269, in evaluate
    scores[hf_subset] = self._evaluate_subset(
  File "/data/niklas/mieb/mteb/mteb/abstasks/Image/AbsTaskAny2AnyRetrieval.py", line 278, in _evaluate_subset
    results = retriever(corpus, queries)
  File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 290, in __call__
    return self.retriever.search(
  File "/data/niklas/mieb/mteb/mteb/evaluation/evaluators/Image/Any2AnyRetrievalEvaluator.py", line 144, in search
    query_embeddings = self.model.get_fused_embeddings(
  File "/data/niklas/mieb/mteb/mteb/models/vlm2vec_models.py", line 243, in get_fused_embeddings
    text_embeddings = self.get_text_embeddings(texts, batch_size)
  File "/data/niklas/mieb/mteb/mteb/models/vlm2vec_models.py", line 223, in get_text_embeddings
    text_outputs = self.encode_input(inputs)
  File "/data/niklas/mieb/mteb/mteb/models/vlm2vec_models.py", line 91, in encode_input
    hidden_states = self.mdl(**input, return_dict=True, output_hidden_states=True)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/huggingface/modules/transformers_modules/microsoft/Phi-3.5-vision-instruct/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca/modeling_phi3_v.py", line 1603, in forward
    outputs = self.model(
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/huggingface/modules/transformers_modules/microsoft/Phi-3.5-vision-instruct/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca/modeling_phi3_v.py", line 1479, in forward
    layer_outputs = decoder_layer(
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/huggingface/modules/transformers_modules/microsoft/Phi-3.5-vision-instruct/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca/modeling_phi3_v.py", line 1177, in forward
    attn_outputs, self_attn_weights, present_key_value = self.self_attn(
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/huggingface/modules/transformers_modules/microsoft/Phi-3.5-vision-instruct/4a0d683eba9f1d0cbfb6151705d1ee73c25a80ca/modeling_phi3_v.py", line 765, in forward
    qkv = self.qkv_proj(hidden_states)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/env/lib/conda/gritkto4/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float
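
The crash happens in `F.linear` inside `qkv_proj`: the hidden states arrive in bfloat16 while the projection weight is float32, and PyTorch refuses mixed-dtype matmuls. The same mismatch can be reproduced in isolation; a minimal sketch (illustrative only, not the actual mteb code path — the exact error wording varies by device and torch version):

```python
import torch
import torch.nn.functional as F

# Activations in bfloat16 (as a bf16 base model would produce) against a
# weight left in float32 (e.g. an adapter weight materialized in fp32).
x = torch.randn(2, 4, dtype=torch.bfloat16)
w = torch.randn(8, 4, dtype=torch.float32)

try:
    F.linear(x, w)
except RuntimeError as e:
    # Raises a dtype-mismatch RuntimeError, e.g.
    # "expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float"
    print(e)
```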
isaac-chung commented 3 days ago

Investigating. I think I have a rough idea. [edit] I am able to reproduce this error. Working on a fix.
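
In case it helps, a minimal sketch of one plausible mitigation, assuming the root cause is that some weights (e.g. the LoRA adapter) load in float32 on top of a bfloat16 base model: cast the whole module to a single dtype after loading. Names below are illustrative stand-ins, not the actual `vlm2vec_models.py` API:

```python
import torch
from torch import nn

# Stand-in for a layer whose weights were materialized in float32
# (nn.Linear defaults to float32), driven by bfloat16 activations.
layer = nn.Linear(4, 8)
x = torch.randn(2, 4, dtype=torch.bfloat16)

# Unify dtypes before the forward pass; the same idea would apply to the
# loaded model as a whole, e.g. model.to(torch.bfloat16) after the
# checkpoint and adapter are loaded.
layer = layer.to(torch.bfloat16)
print(layer(x).dtype)  # torch.bfloat16
```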