Questions about the code implementation

PKU-ICST-MIPL / FGCrossNet_ACMMM2019

Source code of our ACM MM 2019 paper "A New Benchmark and Approach for Fine-grained Cross-media Retrieval".

Questions about the code implementation #5

Closed amyzx closed 2 years ago

amyzx commented 2 years ago

Hi, sorry to bother you. I have the following 4 questions.

1. In your dataset.py, line 52 seems to be wrong: I think it needs to be inside the for loop. Could you please check this line? If the current code is correct, it means that CudaTextDataset only processes the last text.
2. I am curious about the parameter max_length in CudaTextDataset. Why is it 448? Can I change it to another number, such as 64 or 128?
3. Can I process the text by calling processtxt before calling the forward function? I hope the processed text can have the same dimensions as the images, such as [3, 448, 448].
4. Can I change crop_size from 448 to 48 and scale_size from 512 to 64? I hope to reduce the size of the input data. Will this change affect the model's training accuracy?
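
To make the first question concrete, here is a minimal sketch of the kind of indentation problem being described. It is not the repository's code: `encode`, `load_buggy`, and `load_fixed` are hypothetical names, and the character-code encoding with zero padding is only an assumption about how a text might be brought to `max_length`.

```python
def encode(text, max_length):
    """Hypothetical stand-in for text processing: map characters to
    integer codes, truncate to max_length, then pad with zeros."""
    codes = [ord(c) % 256 for c in text[:max_length]]
    return codes + [0] * (max_length - len(codes))


def load_buggy(lines, max_length):
    samples = []
    for line in lines:
        text = line.strip()
    # The processing step sits outside the loop, so it runs once,
    # on whatever `text` held after the final iteration.
    samples.append(encode(text, max_length))
    return samples


def load_fixed(lines, max_length):
    samples = []
    for line in lines:
        text = line.strip()
        # Inside the loop: every text is processed and kept.
        samples.append(encode(text, max_length))
    return samples


lines = ["a small bird\n", "a red car\n", "a tall tree\n"]
buggy = load_buggy(lines, max_length=16)
fixed = load_fixed(lines, max_length=16)
print(len(buggy), len(fixed))
```

With three input lines, the first variant keeps a single sample (the last text) while the second keeps all three. The same sketch also shows what a smaller `max_length` would do under this assumed encoding: longer texts are truncated and shorter ones are padded.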

Looking forward to your reply, thanks!

thebestYezhang commented 2 years ago

Hello, I am also curious about their work. Regarding question 2, they describe this in the paper: they chose it because they want to cover all items. So you can change it to a smaller number. Regarding question 3, I think they did what you describe. As for the last question, it will certainly decrease the score. Are you still working on this, or have you already finished it?
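
For a sense of scale on question 4, the arithmetic can be sketched as below. This assumes that `scale_size` is the side length the image is resized to and `crop_size` is the side length of the square crop taken from it, which is the usual meaning of these names but is not confirmed in this thread. The helper names are hypothetical.

```python
def image_values(crop_size, channels=3):
    """Number of values in one cropped image of shape
    [channels, crop_size, crop_size]."""
    return channels * crop_size * crop_size


def crop_fraction(crop_size, scale_size):
    """Fraction of each side of the resized image kept by the crop."""
    return crop_size / scale_size


full = image_values(448)    # shape [3, 448, 448]
small = image_values(48)    # shape [3, 48, 48]
ratio = full / small

print(full, small, round(ratio, 1))
print(crop_fraction(448, 512), crop_fraction(48, 64))
```

Under these assumptions, the proposed change shrinks each image from 602112 values to 6912, roughly an 87-fold reduction. It also changes the crop proportion: 448 out of 512 keeps 0.875 of each side, while 48 out of 64 keeps 0.75. Using 56 instead of 48 with a scale of 64 would preserve the original 0.875 proportion.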

amyzx commented 2 years ago

@thebestYezhang Hi, thanks for your reply. I tried to implement the method in question 3 but have not finished it. I am not working on this project right now.

Rose-bud commented 1 year ago

Hi guys, I recently wanted to do some work on fine-grained cross-modal retrieval based on this paper, but after getting deeper into it I have run into problems with, e.g., dataset acquisition and code reproduction. In your opinion, should I continue with this paper? So far, acquiring the dataset has been much more complex than the other tasks.
