Closed: ByZ0e closed this issue 2 years ago
Hi @Zoe-Ziyi, did you manage to figure out what is going on? I have also realised that all the questions in the txt file are binary questions, yet the csv contains a much greater variety.
Hi Tangolin,
My apologies, Zoe-Ziyi reached out offline and we discussed it there. We have released several versions of AGQA with minor updates, so Zoe-Ziyi had downloaded the csv and the txt files at different times and was therefore comparing different versions. I checked the current version, and I found that the csv and txt versions have the same questions with the same keys, and both include the binary questions.
However, I want to make sure that you are able to get the information you need with the dataset version you have. What size is your current csv? If you email me directly (contact details here: https://madeleinegrunde.github.io/), we can also discuss it there.
The data in the provided CSV files (1024871 questions in the balanced test set) is different from the data in the txt files (1151779 questions in the balanced test set), and the overlap is very small. How do I find the correspondence between them? Which should I use for evaluation?
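One way to check whether two downloads come from the same release is to compare their question keys directly. The sketch below is only an illustration: the column name `key`, the helper names `read_csv_keys` and `compare_keys`, and the toy data are assumptions, not the actual AGQA schema. Loading keys from the txt file is left out because its format is not described in this thread.

```python
import csv
import io


def read_csv_keys(handle, key_column="key"):
    """Collect the values of one column from an open CSV file object.

    The column name "key" is an assumption; use the real header name.
    """
    return {row[key_column] for row in csv.DictReader(handle)}


def compare_keys(csv_keys, txt_keys):
    """Count how many question keys the two sources share or lack."""
    csv_keys, txt_keys = set(csv_keys), set(txt_keys)
    return {
        "shared": len(csv_keys & txt_keys),
        "csv_only": len(csv_keys - txt_keys),
        "txt_only": len(txt_keys - csv_keys),
    }


# Toy data standing in for the real files (hypothetical keys and questions).
sample_csv = io.StringIO(
    "key,question\nq1,Did they sit?\nq2,What did they hold?\n"
)
sample_txt_keys = ["q2", "q3"]

report = compare_keys(read_csv_keys(sample_csv), sample_txt_keys)
print(report)
```

If the two files were downloaded from the same release, the `csv_only` and `txt_only` counts should both be zero. A large mismatch would suggest the files come from different versions, in which case re-downloading both at the same time is probably the simplest fix.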