dmarx / Multi-Modal-Comparators

Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP

mlf_vit-b/16+ loads but throws tensor shape error attempting image embedding #23

Closed: dmarx closed this issue 2 years ago

dmarx commented 2 years ago

Also to do: add tests for this model to the MLF test script; a sketch of such a test is below.
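
A minimal sketch, assuming the `REGISTRY.find` / `loader.load()` / `MockOpenaiClip` pattern from the README; the 240px input size is inferred from the model id, not verified:

```python
# Hedged sketch of a regression test for this model -- not a drop-in.
import torch

from mmc import loaders  # noqa: F401 -- importing registers the loaders
from mmc.registry import REGISTRY
from mmc.mock.openai import MockOpenaiClip


def test_mlf_vitb16plus_encodes_image():
    hits = REGISTRY.find(
        architecture="clip",
        publisher="mlfoundations",
        id="ViT-B-16-plus-240--laion400m_e32",
    )
    assert hits, "model not found in registry"
    perceptor = MockOpenaiClip(hits[0].load())
    # ViT-B-16-plus-240 expects 240x240 input: (240/16)^2 + 1 = 226
    # tokens, matching its positional embedding (224px would give 197).
    img = torch.zeros(1, 3, 240, 240)
    emb = perceptor.encode_image(img)
    assert emb.shape[0] == 1
```

The failing run: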

```
2022-05-07 19:33:23.102 | INFO     | __main__:parse_scenes:133 - Prompts loaded.
2022-05-07 19:33:23.110 | INFO     | __main__:do_run:540 - Settings saved to /home/ubuntu/pytti-core/images_out//clip_mlf_vitb16plus_e32/clip_mlf_vitb16plus_e32_settings.txt
2022-05-07 19:33:23.118 | INFO     | __main__:do_run:553 - Running prompt:
  0%|                                                                                                                                                   | 0/6000 [00:00<?, ?it/s]/home/ubuntu/venv/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
  0%|                                                                                                                                                   | 0/6000 [00:00<?, ?it/s]
Error executing job with overrides: ['conf=cloob_test', 'steps_per_scene=6000', 'border_mode=wrap', 'file_namespace=clip_mlf_vitb16plus_e32', '++mmc_models=[{architecture: clip, publisher: mlfoundations, id: ViT-B-16-plus-240--laion400m_e32 }]']
Traceback (most recent call last):
  File "/home/ubuntu/venv/lib/python3.9/site-packages/pytti/workhorse.py", line 607, in _main
    do_run()
  File "/home/ubuntu/venv/lib/python3.9/site-packages/pytti/workhorse.py", line 554, in do_run
    i += model.run_steps(
  File "/home/ubuntu/venv/lib/python3.9/site-packages/pytti/ImageGuide.py", line 188, in run_steps
    losses = self.train(
  File "/home/ubuntu/venv/lib/python3.9/site-packages/pytti/ImageGuide.py", line 295, in train
    image_embeds, offsets, sizes = self.embedder(self.image_rep, input=z)
  File "/home/ubuntu/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/venv/lib/python3.9/site-packages/pytti/Perceptor/Embedder.py", line 169, in forward
    image_embeds.append(perceptor.encode_image(clip_in).float().unsqueeze(0))
  File "/home/ubuntu/venv/lib/python3.9/site-packages/mmc/mock/openai.py", line 60, in encode_image
    return project(image)
  File "/home/ubuntu/venv/lib/python3.9/site-packages/open_clip/model.py", line 415, in encode_image
    return self.visual(image)
  File "/home/ubuntu/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ubuntu/venv/lib/python3.9/site-packages/open_clip/model.py", line 273, in forward
    x = x + self.positional_embedding.to(x.dtype)
RuntimeError: The size of tensor a (197) must match the size of tensor b (226) at non-singleton dimension 1
```
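
For the record, the mismatch is consistent with plain patch-count arithmetic: a ViT-B/16 produces one token per 16x16 patch plus a [CLS] token, so a 224x224 input yields 14*14 + 1 = 197 tokens, while the positional embedding of ViT-B-16-plus-240 was trained at 240x240 and expects 15*15 + 1 = 226. A quick sketch of the arithmetic (`token_count` is a hypothetical helper, not part of mmc or open_clip):

```python
# Patch-count arithmetic behind the RuntimeError above.
# token_count is a hypothetical helper, not part of mmc or open_clip.
def token_count(image_size: int, patch_size: int = 16) -> int:
    """Sequence length a ViT sees: one token per patch plus [CLS]."""
    return (image_size // patch_size) ** 2 + 1

assert token_count(224) == 197  # tensor a: what the embedder sent
assert token_count(240) == 226  # tensor b: what the model expects
```

Presumably the fix is for the embedder to respect this model's native input resolution rather than assuming 224.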
dmarx commented 2 years ago

whew