Open Shmily17 opened 4 months ago
First of all, thank you for your patience! I still have a small question: how is the tokenization done? The output of a standard BERT tokenizer does not match the contents of the ans_tokenizer_dict keys you provided (ans_tokenizer_dict looks as if a specific tokenization method was used).
Thank you very much!
Hi, please use the tokenizer of Roberta-base, which is the same as LaPA's language part.
Best regards,
Thank you so much for your patience! Have a great day!
Hi
Thanks so much for your attention.
- Sorry, the code for that step has been lost; the remaining process only includes saving and transferring. The method for generating ans_tokenizer_dict: collect all answers from the training, validation, and test sets, tokenize them, and make sure each tokenized answer corresponds to its item.
- Download the VQA_RAD dataset by following the README file.
Best regards,
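For reference, the process described above might look roughly like the sketch below. This is only an illustration, not the repository's code: `toy_tokenize` is a hypothetical stand-in for the Roberta-base tokenizer, and all names are illustrative.

```python
vocab = {}

def toy_tokenize(text):
    # Hypothetical stand-in for the Roberta-base tokenizer:
    # assign each new word the next integer id.
    return [vocab.setdefault(w, len(vocab)) for w in text.lower().split()]

def build_ans_tokenizer_dict(train_answers, val_answers, test_answers):
    """Collect answers from all splits and tokenize each one exactly once."""
    ans_tokenizer_dict = {}
    for ans in list(train_answers) + list(val_answers) + list(test_answers):
        if ans not in ans_tokenizer_dict:
            ans_tokenizer_dict[ans] = toy_tokenize(ans)
    return ans_tokenizer_dict

d = build_ans_tokenizer_dict(["yes", "no"], ["yes"], ["left lung"])
# every distinct answer appears exactly once as a key
```

With the real tokenizer, the values would be the token ids it produces for each answer string.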
How do you get ans_tokenizer with dimension (1, 2978)? When I feed ans_list (length 3579) into the tokenizer, I only get a tensor of shape (3579, 27) (with max_length=27). Could you please tell me which parameters to pass to the tokenizer when encoding ans_list so that it produces ans_tokenizer with dimension (1, 2978)?
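One hedged guess at reconciling the shapes: a padded batch encode of 3579 answers naturally gives (3579, max_length), whereas a single (1, N) row could come from collecting the unique token ids across all answers. The sketch below only illustrates that idea; it is not confirmed to be how the repository builds the tensor.

```python
def unique_token_row(tokenized_answers):
    """Collect unique token ids across all answers, first-seen order."""
    seen, row = set(), []
    for ids in tokenized_answers:
        for tok in ids:
            if tok not in seen:
                seen.add(tok)
                row.append(tok)
    return [row]  # one row: shape (1, num_unique_tokens)

batch = [[0, 42, 2], [0, 7, 42, 2]]  # two toy padded encodings
t = unique_token_row(batch)
```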
Hello
I think you are doing a fantastic job! However, I am having trouble reproducing your experimental results: my reproduction on the VQA_RAD dataset performs very poorly. After checking, I found that the labels in the ans_tokenizer_dict.pkl you provided do not correspond to the labels in the VQA_RAD dataset itself. Could you please explain how ans_list.pkl is encoded into ans_tokenizer_dict.pkl, or provide the code for the conversion?
Thank you very much! Have a nice life!
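For anyone landing here, a minimal sketch of the requested conversion, assuming ans_list.pkl holds a list of answer strings and ans_tokenizer_dict.pkl should map each answer to its token ids. `toy_tokenize` is a hypothetical stand-in for the Roberta-base tokenizer mentioned in this thread; only the file names follow the discussion.

```python
import pickle

vocab = {}

def toy_tokenize(text):
    # Hypothetical stand-in for the Roberta-base tokenizer.
    return [vocab.setdefault(w, len(vocab)) for w in text.lower().split()]

def convert(ans_list_path, out_path):
    """Read a pickled list of answer strings, write an answer -> ids dict."""
    with open(ans_list_path, "rb") as f:
        ans_list = pickle.load(f)
    ans_tokenizer_dict = {ans: toy_tokenize(ans) for ans in ans_list}
    with open(out_path, "wb") as f:
        pickle.dump(ans_tokenizer_dict, f)
    return ans_tokenizer_dict
```

Swapping `toy_tokenize` for the real tokenizer call would give the actual token ids.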