Closed drunkpig closed 1 year ago
Hi,
Thanks for raising the question! May I confirm which version of the data you are using? We did notice the issue you described in v1 (see issue #11) and fixed it in v1.1, which was released on May 15, 2023. Could you please also provide the index of a shard where multiple images are assigned the same text index, so that we can take a closer look?
Hi @drunkpig, did you check out v1.1, which fixes this issue? If that works, I can close this issue.
@jmhessel I checked v1.1, and the problem is gone.
awesome!
data
The paper
my problem
In the data example above, 5 different images are bound to the same sentence, the one with index 0.
And there are 12927533 data items with the same issue.
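A minimal sketch of how such collisions could be detected in a single document. It assumes each document carries an `image_info` list whose entries have a `matched_text_index` field; that layout and the helper name `find_shared_text_indices` are assumptions for illustration, not something confirmed in this thread.

```python
from collections import defaultdict


def find_shared_text_indices(image_info):
    """Return {text_index: [image positions]} for every text index
    that more than one image is matched to.

    Assumes each entry is a dict with a "matched_text_index" key
    (hypothetical layout, see the note above).
    """
    by_text = defaultdict(list)
    for image_pos, info in enumerate(image_info):
        by_text[info["matched_text_index"]].append(image_pos)
    # Keep only the text indices claimed by two or more images.
    return {t: imgs for t, imgs in by_text.items() if len(imgs) > 1}


# Hypothetical record mirroring the case described: 5 images on sentence 0.
doc = {"image_info": [{"matched_text_index": 0} for _ in range(5)]}
print(find_shared_text_indices(doc["image_info"]))  # {0: [0, 1, 2, 3, 4]}
```

Counting the documents for which this returns a non-empty dict would give a figure comparable to the one quoted above, provided the same definition of "data item" is used.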
my solution
output format: (image_index, sentence_index)
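For illustration only, here is one way pairs in this format could be produced so that no two images share a sentence. This is not the solution posted in the issue and not the v1.1 fix. It is a simple greedy sketch that assumes an image-by-sentence similarity matrix `sim` is available, and both `sim` and `greedy_assign` are hypothetical names.

```python
def greedy_assign(sim):
    """Greedy one-to-one matching of images to sentences.

    sim[i][j] is the (assumed) similarity between image i and sentence j.
    Returns a sorted list of (image_index, sentence_index) pairs in which
    every image and every sentence appears at most once.
    """
    # Every (score, image, sentence) candidate, highest score first.
    candidates = sorted(
        ((score, i, j) for i, row in enumerate(sim) for j, score in enumerate(row)),
        key=lambda c: -c[0],
    )
    used_images, used_sentences, pairs = set(), set(), []
    for _score, i, j in candidates:
        if i not in used_images and j not in used_sentences:
            used_images.add(i)
            used_sentences.add(j)
            pairs.append((i, j))
    return sorted(pairs)


# Both images score highest on sentence 0; only one of them may keep it.
sim = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.3],
]
print(greedy_assign(sim))  # [(0, 0), (1, 1)]
```

Greedy matching does not maximize total similarity in general. An optimal one-to-one assignment would need a linear assignment solver instead.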