apple / ml-cvnets

CVNets: A library for training computer vision networks
https://apple.github.io/ml-cvnets
Other
1.76k stars, 224 forks

Using vision transformers for different image resolutions #95

Open Oussamab21 opened 11 months ago

Oussamab21 commented 11 months ago

Hi, I am working with vision transformers (not only the vanilla ViT, but other models as well) on the UMDAA2 dataset, which has an image resolution of 128×128. Would it be better to resize the images to the resolution the ViT expects, such as 224 or 256, or to keep 128 and adjust the other vision transformer parameters, such as dim, depth, and heads, to this resolution?
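To make the trade-off in the question concrete, here is a minimal sketch of how the input resolution changes the number of patch tokens a ViT-style model sees. It assumes square images and non-overlapping square patches; the patch sizes 16 and 8 are illustrative assumptions, not values stated in this thread, and `num_patches` is a hypothetical helper, not part of CVNets.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image.

    Hypothetical helper for illustration; not a CVNets function.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("image_size and patch_size must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Assumed patch size of 16: compare the native and the upsampled resolutions.
for size in (128, 224, 256):
    print(f"{size}x{size}, patch 16 -> {num_patches(size, 16)} tokens")

# Assumed patch size of 8 at the native 128x128 resolution.
print(f"128x128, patch 8 -> {num_patches(128, 8)} tokens")
```

Under these assumptions, keeping 128×128 with a patch size of 16 gives 64 tokens, while halving the patch size to 8 gives 256 tokens, the same sequence length as a 256×256 input with a patch size of 16. Whether resizing or adapting the model works better on UMDAA2 would still have to be checked empirically.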

Tranbaber commented 11 months ago

@Oussamab21 Hello! I'm trying to train a MobileViT model, but I'm running into the following problem and would appreciate some help:

File "C:\Users\72344\.conda\envs\MobileViTv2\Scripts\cvnets-train.exe\__main__.py", line 4, in <module>
ModuleNotFoundError: No module named 'main_train'

I also tried to install this module, but got "ERROR: Could not find a version that satisfies the requirement main_train (from versions: none) ERROR: No matching distribution found for main_train"

Can you tell me what I can do? Thank you very much!
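A quick way to narrow down an error like this is to check whether the active Python environment can locate the module at all. The sketch below uses only the standard library. It assumes that `main_train` is a top-level module shipped with the repository checkout rather than a package on PyPI, which the pip error above suggests but this thread does not confirm. `is_importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate a top-level module.

    Hypothetical helper for illustration; only intended for names without dots.
    """
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running, to confirm the expected environment is active.
print("Interpreter:", sys.executable)

# 'main_train' is the module named in the traceback above.
print("main_train importable:", is_importable("main_train"))
```

If this prints `False` when run from the same environment as `cvnets-train`, the module is not on that interpreter's import path, which would point to an installation or working-directory problem rather than a missing PyPI package.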