loserking111 closed this issue 5 days ago.
Hi @loserking111 , thanks for your interest in our work!
For the TinyCLIP models trained with manual inheritance, the changes are the number of layers and the hidden size. You can refer to the model configs: https://github.com/wkcn/TinyCLIP/tree/main/src/open_clip/model_configs
Okay, thank you. I would like to ask whether it would be better to replace CLIP with TinyCLIP in the field of ReID (person re-identification).
The model architecture of TinyCLIP is the same as that of CLIP. You can diff their model configs.
My original model was CLIP, and I only changed the model config. When I use the TinyCLIP-ViT-39M-16-Text-19M model, the parameters never seem to be loaded successfully.
You can check whether the shape of each weight matches.
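One way to do that check is to compare the checkpoint and model state dicts by parameter shape before loading. Below is a minimal sketch; the helper name `find_shape_mismatches` is hypothetical, not part of TinyCLIP:

```python
# Sketch: compare two state dicts by weight shape before loading a checkpoint.
# The helper name is an assumption for illustration, not a TinyCLIP API.
def find_shape_mismatches(src_shapes, dst_shapes):
    """Given {param_name: shape} dicts for the checkpoint (src) and the
    model (dst), return names that are missing, unexpected, or mismatched."""
    missing = sorted(set(dst_shapes) - set(src_shapes))
    unexpected = sorted(set(src_shapes) - set(dst_shapes))
    mismatched = sorted(
        name for name in set(src_shapes) & set(dst_shapes)
        if tuple(src_shapes[name]) != tuple(dst_shapes[name])
    )
    return missing, unexpected, mismatched


# Usage with plain shape dicts; with real models you would pass
# {k: tuple(v.shape) for k, v in model.state_dict().items()}:
ckpt = {"visual.proj": (512, 512), "ln_final.weight": (512,)}
model = {"visual.proj": (640, 512), "ln_final.weight": (512,)}
missing, unexpected, mismatched = find_shape_mismatches(ckpt, model)
print(mismatched)  # -> ['visual.proj']
```

Any name in `mismatched` points at a layer whose width or depth differs between the CLIP architecture you instantiated and the TinyCLIP checkpoint.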
@wkcn Great work! About how much smaller are TinyCLIP's model size and memory usage compared to the original CLIP model? Thanks!
Hi @willswordh , thanks for your interest in our work!
Here is the comparison with the original CLIP model. The column named #Params (M) shows the model size.
I did not record the specific memory usage.
Compared with the original CLIP model, TinyCLIP uses less memory since it has fewer layers and channels.
@wkcn Hey Jackie! Thanks for your response! I have actually tested the TinyCLIP model, and its image processing speed is similar to the original CLIP's. Is that expected? I thought fewer parameters and lower memory usage would speed up TinyCLIP's processing. Thanks!
Hi @willswordh , I have uploaded the script to measure the throughput.
https://github.com/wkcn/TinyCLIP/blob/main/measure_throughput.py
Example:

```shell
python3 measure_throughput.py --model-name ViT-B-32
python3 measure_throughput.py --model-name TinyCLIP-ViT-61M-32-Text-29M
```
The model names can be found in https://github.com/wkcn/TinyCLIP/tree/main/src/open_clip/model_configs
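As a quick sanity check alongside the repository's script, throughput can be measured with a simple warm-up-then-time loop. This is a generic sketch; `measure_throughput` and the dummy workload are assumptions for illustration (on GPU you would also need `torch.cuda.synchronize()` before reading the clock):

```python
import time

def measure_throughput(encode, batch_size=32, iters=10, warmup=2):
    """Return images/second for `encode`, any callable that processes
    one batch (e.g. a wrapper around model.encode_image on a fixed batch)."""
    for _ in range(warmup):              # warm-up runs are excluded from timing
        encode()
    start = time.perf_counter()
    for _ in range(iters):
        encode()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed  # images per second


# Usage with a dummy workload standing in for a real encoder:
images_per_sec = measure_throughput(lambda: sum(range(10_000)))
print(f"{images_per_sec:.0f} images/s")
```

Comparing the two models with the same batch size and device is what makes the numbers meaningful.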
@wkcn Thanks Jackie! I tested their throughput, and I noticed there is not a huge improvement for TinyCLIP compared to the original ViT-B/32. I want to run CLIP on an edge device efficiently with good speed; do you have any advice? Maybe quantize the TinyCLIP model? Thanks a lot for your sincere help!
@willswordh Did you use an inference framework like TensorRT or ONNX Runtime? I did not try to quantize the TinyCLIP models.
@wkcn Yes, I used ONNX Runtime. The processing speed is still quite low even though I have tried quantizing it to FP16.
@willswordh
Did you compare CLIP-ViT-B/32 and TinyCLIP-ViT-40M-32-Text-19M?
Could you please provide more information, such as the batch size, the device, and the inference time of the image encoder and the text encoder?
You can increase the batch size as much as possible to measure the speed. In my test code, the batch size is 32.
@wkcn I compared the throughput of ViT-B/32 with the smallest TinyCLIP model. I wonder what I can do to speed up the model's inference on a device like an Android phone. Maybe quantize it to int8? Thanks!
I did not benchmark TinyCLIP on an edge device. Quantization can accelerate inference, but I am not sure whether TinyCLIP will be significantly faster than ViT-B/32.
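For reference, here is a back-of-the-envelope sketch of what symmetric per-tensor int8 quantization does to a weight tensor, in plain Python and independent of any runtime. Real toolchains (e.g. ONNX Runtime's quantization tools) work per operator and may use calibration; the helper names below are illustrative:

```python
# Symmetric per-tensor int8 quantization sketch (illustrative only).
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid zero scale
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02]
q, scale = quantize_int8(w)
print(q)                      # int8 values: 4x smaller storage than float32
print(dequantize(q, scale))   # approximate reconstruction of the weights
```

The 4x size reduction is guaranteed; the speedup is not, since it depends on the runtime having fast int8 kernels for the target hardware.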
I want to initialize my CLIP model with the TinyCLIP model weights. How should I change the network architecture?