comrados / cpah

Dataset question #1

Closed yangzhip closed 3 years ago

yangzhip commented 3 years ago

Hello, could you upload the other datasets and explain how the datasets were preprocessed?

comrados commented 3 years ago

Hello. I'm not the author of the original paper. I didn't see the original code either. Thus, I don't have CV datasets and don't know how they preprocess the data.

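My data is from the remote sensing domain (RSICD and UCM datasets):

  • BERT (bert-base-uncased, you can find the model on Hugging Face) for caption feature encoding. I use a sum over last 4 hidden states -> (768,) text feature vectors
  • ResNet18 (trained on ImageNet, available as a module of torchvision) for image feature encoding. Last classification layer is removed -> (512,) image feature vectors
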
My code for the data preparation will be available later in another repository.

yangzhip commented 3 years ago

Hello. I'm not the author of the original paper. I didn't see the original code either. Thus, I don't have CV datasets and don't know how they preprocess the data.

My data is from the remote sensing domain (RSICD and UCM datasets):

  • BERT (bert-base-uncased, you can find the model on Hugging Face) for caption feature encoding. I use a sum over last 4 hidden states -> (768,) text feature vectors
  • ResNet18 (trained on ImageNet, available as a module of torchvision) for image feature encoding. Last classification layer is removed -> (512,) image feature vectors

My code for the data preparation will be available later in another repository.

Thanks for the answer. Are the experimental results of your code consistent with the original paper?

comrados commented 3 years ago

I didn't test it on the original CV data. But for my datasets the performance was good.

yangzhip commented 3 years ago

OK, thanks a lot!