arthurdouillard / dytox

Dynamic Token Expansion with Continual Transformers, accepted at CVPR 2022
https://arxiv.org/abs/2111.11326
Apache License 2.0

Accuracy variation with the CLIP model #13

Closed vgthengane closed 2 years ago

vgthengane commented 2 years ago

I am trying to integrate the CLIP model with DyTox. With the default config, the evaluation accuracy for task=0, after epoch=0, with num_gpus=4 is 60.5%.

When I add the single line _clip_model, _clip_transform = clip.load("ViT-B/32", device=device), the accuracy jumps from 60.5% to 75.0%.

Similarly, with _clip_model, _clip_transform = clip.load("ViT-B/16", device=device), the accuracy is 73.5%.

I only instantiated the CLIP model above the scenario_train loop here and did not change anything else, as in the sketch below.
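
For reference, a minimal sketch of the change, assuming the placement described above (the scenario_train loop follows the naming in the DyTox training script; the loop body itself is left untouched):

```python
import torch
import clip  # OpenAI CLIP package (https://github.com/openai/CLIP)

device = "cuda" if torch.cuda.is_available() else "cpu"

# The single added line: CLIP is instantiated once, before the task loop.
# The returned model and transform are not used anywhere else in the script.
_clip_model, _clip_transform = clip.load("ViT-B/32", device=device)

# Everything below is the unchanged DyTox code, e.g.:
# for task_id, dataset_train in enumerate(scenario_train):
#     ...  # train and evaluate each task as before
```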

Do you have any idea what might be causing this issue?

Thanks, Vishal

arthurdouillard commented 2 years ago

Hum, I'm sorry but I don't understand what you're doing.

You said you added "_clip_model, _clip_transform = clip.load("ViT-B/32", device=device)", but what were you doing before then? Were you creating another ViT of a different size?