mingkai-zheng / WCL

Weakly Supervised Contrastive Learning
import mc #2

Closed Elaineok closed 2 years ago

Elaineok commented 2 years ago

There is no “mc” package available, but it is imported on line 5 of the “imagenet.py” file.

Can you help me solve this problem? Looking forward to your reply.

mingkai-zheng commented 2 years ago

Hello Jingting, the mc module is a caching library that speeds up image I/O. It is a special module that we use internally, so you can simply delete the related code. As I mentioned in the README file, "You need to change the Dataset setting (dataset/imagenet.py), and Pytorch DDP setting (util/dist_init.py) for your own server environments."

Elaineok commented 2 years ago

Thank you for your reply! I have deleted the related code.

Can you provide a simple demo dataset and upload it to Baidu Netdisk? For example, one that contains a few images and a train.txt file.

Looking forward to your reply.

mingkai-zheng commented 2 years ago

I think the most straightforward way is to replace the dataset with CIFAR-10; you can simply follow the official PyTorch tutorial. BTW, since CIFAR-10 images have a relatively low resolution (32x32), we generally need to replace the first 7x7 Conv of stride 2 with a 3x3 Conv of stride 1 and also remove the first max pool operation in our backbone network.
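A minimal sketch of the stem change described above, assuming a backbone that exposes torchvision-style attributes named `conv1` and `maxpool`. The helper `adapt_stem_for_cifar` and the stand-in module `TinyStem` are hypothetical names used only for illustration; the backbone in this repository may name its layers differently.

```python
import torch
import torch.nn as nn


def adapt_stem_for_cifar(backbone: nn.Module) -> nn.Module:
    """Swap a 7x7/stride-2 stem conv for a 3x3/stride-1 conv and drop the first max pool.

    Assumption: the backbone stores these layers as `conv1` and `maxpool`.
    """
    old = backbone.conv1
    backbone.conv1 = nn.Conv2d(
        old.in_channels, old.out_channels,
        kernel_size=3, stride=1, padding=1, bias=False,
    )
    backbone.maxpool = nn.Identity()  # keeps forward() unchanged, does no pooling
    return backbone


class TinyStem(nn.Module):
    """Hypothetical stand-in for the first two layers of an ImageNet-style ResNet."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        return self.maxpool(self.conv1(x))


x = torch.randn(2, 3, 32, 32)  # a batch of CIFAR-sized images
shape_before = tuple(TinyStem()(x).shape)                        # spatial size shrinks 4x
shape_after = tuple(adapt_stem_for_cifar(TinyStem())(x).shape)   # spatial size is preserved
print(shape_before, shape_after)
```

With the original stem, a 32x32 input is reduced to 8x8 before the first residual stage, which leaves very little spatial information; the adapted stem keeps the full 32x32 resolution.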

Elaineok commented 2 years ago

That's a great suggestion. I'll give it a try.

Elaineok commented 2 years ago

I tried, but found that the data loading part needs more changes. Can you provide a simple demo dataset?

Thank you very much.

Elaineok commented 2 years ago

I noticed that the dataset and code require access permission. Can you grant me access? https://drive.google.com/file/d/1j2I1Lh9Dy7cHb6YO0PZ8HXDNewXrHO-j/view?usp=sharing

mingkai-zheng commented 2 years ago

Sorry for the late response. I have updated the permission; please try again.

Elaineok commented 2 years ago

Since I cannot freely use a distributed setup, I have to change the code. At present, I am not very clear about what these lines mean.

If feat1 has shape (128, 4096), does 128 represent batch_size and 4096 the feature dimension?

Looking forward to your reply.

mingkai-zheng commented 2 years ago

The shape of feat1 is (batch_size_per_gpu, dim)

The shape of other1 is (batch_size_per_gpu * (world_size - 1), dim).

The concat_other_gather function returns the features from other GPUs.

torch.cat([feat1, feat2]) concatenates all the features on the current GPU.

torch.cat([feat1, feat2, other1, other2]) concatenates all the features across all the GPUs.

If you only have a single GPU, remove lines 55 and 56 and change line 59 to prob = torch.cat([feat1, feat2]) @ torch.cat([feat1, feat2]).T / t
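The single-GPU expression above can be checked for shapes with a small sketch. The batch size, feature dimension, and temperature below are arbitrary illustration values, and the L2 normalization is an assumption made here so that the diagonal is easy to verify; it is not confirmed by this thread.

```python
import torch
import torch.nn.functional as F

batch_size, dim, t = 4, 8, 0.1  # illustrative values only

# Two augmented views of the same batch; random stand-ins for real features.
# Normalization is an assumption of this sketch, not taken from the repository.
feat1 = F.normalize(torch.randn(batch_size, dim), dim=1)
feat2 = F.normalize(torch.randn(batch_size, dim), dim=1)

# All features on the current (only) GPU: shape (2 * batch_size, dim).
local = torch.cat([feat1, feat2])

# Pairwise similarities scaled by the temperature: shape (2 * batch_size, 2 * batch_size).
prob = local @ local.T / t

print(local.shape, prob.shape)
```

With world_size = 1, other1 and other2 would have batch_size_per_gpu * 0 = 0 rows, which is why the gather step can be dropped and both sides of the product use only the local features.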

a139122679 commented 2 years ago

I want to know how to delete the code that uses this MC module. I tried deleting it, but the code then failed to run. Alternatively, can you provide the MC package? Thank you.

mingkai-zheng commented 2 years ago

You have to rewrite the dataset class yourself; the code I provided is based on my own server environment. I'm not able to provide the MC package since it is an internal library just for our server environment.
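As a starting point for such a rewrite, here is a minimal sketch of a dataset class that reads images from local disk instead of going through mc. Everything in it is an assumption rather than the repository's code: the class name `ListFileDataset`, the `train.txt` format (one `relative/path label` pair per line), and the use of Pillow for decoding.

```python
import os


def default_loader(path):
    """Decode an image from disk with Pillow (imported lazily; assumed to be installed)."""
    from PIL import Image
    with open(path, "rb") as f:
        return Image.open(f).convert("RGB")


class ListFileDataset:
    """Map-style dataset over a list file with lines of the form `relative/path label`.

    Hypothetical replacement for an mc-backed dataset; the list-file format is assumed.
    """

    def __init__(self, root, list_file, transform=None, loader=default_loader):
        self.root = root
        self.transform = transform
        self.loader = loader
        self.samples = []
        with open(list_file) as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue  # skip blank lines
                rel_path, label = line.rsplit(maxsplit=1)
                self.samples.append((rel_path, int(label)))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        rel_path, label = self.samples[index]
        img = self.loader(os.path.join(self.root, rel_path))
        if self.transform is not None:
            img = self.transform(img)
        return img, label
```

Because it implements `__len__` and `__getitem__`, an object like this should be usable with `torch.utils.data.DataLoader` as a map-style dataset. The training code may expect several augmented views per image, in which case `transform` would need to return them accordingly.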

pc-cp commented 11 months ago

Very effective communication, lots of gains. 👍