huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

AssertionError when converting openai clip's weight to hf #22739

Closed wingvortex closed 1 year ago

wingvortex commented 1 year ago

System Info

Who can help?

@amyeroberts

Information

Tasks

Reproduction

Hi, I'm trying to convert Hugging Face's CLIP weights back to OpenAI's format because I need to adapt a fine-tuned model, but there seems to be no such script available. Luckily I found one that converts CLIP from OpenAI to Hugging Face here: https://github.com/huggingface/transformers/blob/main/src/transformers/models/clip/convert_clip_original_pytorch_to_hf.py

So I started with this script. But when I run it:

python convert_clip_original_pytorch_to_hf.py --checkpoint_path 'path/to/ViT-B-32.pt' --pytorch_dump_folder_path './'

I got the following error:

Traceback (most recent call last):
  File "/home/test/convert_clip_original_pytorch_to_hf.py", line 148, in <module>
    convert_clip_checkpoint(args.checkpoint_path, args.pytorch_dump_folder_path, args.config_path)
  File "/home/anaconda3/envs/hf/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/test/convert_clip_original_pytorch_to_hf.py", line 136, in convert_clip_checkpoint
    assert torch.allclose(hf_logits_per_text, pt_logits_per_text, atol=1e-3)
AssertionError
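For context, the check that fails compares the HF and original logits element-wise. torch.allclose treats two tensors as equal when |a - b| <= atol + rtol * |b| holds for every element (rtol defaults to 1e-5). A minimal pure-Python sketch of that tolerance rule (the logit values below are made up for illustration):

```python
# Sketch of the tolerance rule used by torch.allclose:
# two values match when |a - b| <= atol + rtol * |b|.
def allclose(xs, ys, atol=1e-3, rtol=1e-5):
    return all(abs(a - b) <= atol + rtol * abs(b) for a, b in zip(xs, ys))

hf_logits = [25.1234, 19.8765]          # hypothetical HF logits
pt_logits = [25.1240, 19.8770]          # hypothetical original logits
print(allclose(hf_logits, pt_logits))   # True: within atol=1e-3
print(allclose(hf_logits, [25.2, 19.9]))  # False: off by ~0.08
```

So the AssertionError here means at least one logit produced by the converted HF model drifted more than roughly 1e-3 from the original OpenAI model's output.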

Expected behavior

An HF CLIP weight file should be generated from the original OpenAI one by running this script.

amyeroberts commented 1 year ago

Hi @wingvortex, thanks for raising this issue.

The traceback in the issue description doesn't contain the error message - could you share that please? Scratch that: I see it's in the torch.allclose assert.

For the checkpoint being converted, could you confirm which of the ViT-B-32 checkpoints is being used, e.g. ('ViT-B-32', 'laion400m_e31')?

wingvortex commented 1 year ago

@amyeroberts It's a checkpoint downloaded by clip.load('ViT-B/32')

amyeroberts commented 1 year ago

@wingvortex Thanks again for reporting and the additional info. I managed to track it down to an indexing error in the conversion script, which should be resolved when #22776 is merged.

Yufang-Liu commented 8 months ago

When I upgrade transformers to 4.39, the script still gets the assertion error when converting 'ViT-B-32'. Version 4.26.1 is OK.

amyeroberts commented 8 months ago

Hi @Yufang-Liu, could you share a reproducible snippet showing how you're calling the conversion script, as well as the checkpoint being converted and the full traceback of the error raised?

Yufang-Liu commented 8 months ago

With transformers==4.39.0, running python .\convert_clip_original_pytorch_to_hf.py --checkpoint_path ViT-B/32 --pytorch_dump_folder_path ./ gives the error:

  File "C:\Users\Yufang Liu\Downloads\convert_clip_original_pytorch_to_hf.py", line 148, in <module>
    convert_clip_checkpoint(args.checkpoint_path, args.pytorch_dump_folder_path, args.config_path)
  File "D:\anaconda3\envs\pt\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Yufang Liu\Downloads\convert_clip_original_pytorch_to_hf.py", line 135, in convert_clip_checkpoint
    assert torch.allclose(hf_logits_per_image, pt_logits_per_image, atol=1e-3)
AssertionError

With transformers==4.30.0, the error is gone. I also tried 4.32.0: same error.
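Before filing the new issue, it can help to see how large the mismatch actually is, since the bare assert hides the magnitude. A hedged debugging sketch (variable names mirror the conversion script, but plain lists stand in for the logit tensors here):

```python
# Report the worst element-wise deviation instead of asserting outright,
# to judge whether the mismatch is rounding noise or a real regression.
def max_abs_diff(xs, ys):
    return max(abs(a - b) for a, b in zip(xs, ys))

hf_logits_per_image = [25.12, 19.88]   # placeholder values for illustration
pt_logits_per_image = [25.12, 19.93]

diff = max_abs_diff(hf_logits_per_image, pt_logits_per_image)
print(f"max abs diff: {diff:.6f}")
if diff > 1e-3:
    print("logits diverge beyond atol=1e-3 -- likely a conversion regression")
```

A tiny deviation (just above 1e-3) would point at numerical drift between transformers versions, while a large one suggests weights are being mapped incorrectly, as in the original indexing bug.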

amyeroberts commented 8 months ago

Ah, OK. This is a different issue to the one originally raised, and it looks like a model regression. Could you open a new issue including these details? This will help us better track the problem and see when it has been resolved.

Yufang-Liu commented 8 months ago

ok