Open BetterZH opened 1 week ago
Hi, I checked apolo.json on my end and did not find the error.
My copy of artemis_dataset_release_v0.csv contains 454684 samples, so the out-of-index error did not occur.
I also checked the painting names, and they all matched.
Hmm... could some samples be missing from the newer version of artemis_dataset_release_v0.csv?
Hi, I requested a new copy of artemis_dataset_release_v0.csv from https://www.artemisdataset.org/#dataset and found that it has exactly the same number of samples as the one I had (454684 samples).
The code returns the same result as in my last comment.
Can you check the file on your end or request a new copy of artemis_dataset_release_v0.csv?
Thank you for your reply! I would like to confirm, did you apply for the zip dataset by filling out this form?
The issue has been resolved, and the data is indeed correct. I had previously opened the file in Excel, and sometimes opening a CSV file in Excel can cause automatic adjustments to encoding or formatting, which can affect the accuracy of the data. Thank you very much for your help!
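For anyone who hits the same mismatch, here is a minimal sketch of counting the rows without opening the file in Excel. It uses only Python's standard csv module. The function name count_data_rows and the toy column names are illustrative and not part of the repository.

```python
import csv
import io


def count_data_rows(text_stream):
    """Count CSV data rows (header excluded) from an open text stream."""
    reader = csv.reader(text_stream)
    next(reader, None)  # skip the header row, if any
    # csv.reader treats a quoted field containing line breaks as one row,
    # so this can differ from a plain line count of the file.
    return sum(1 for _ in reader)


# Tiny in-memory stand-in for the real file; the second data field spans two lines.
sample = io.StringIO('painting,utterance\na_b,"line one\nline two"\nc_d,plain\n')
print(count_data_rows(sample))  # 2
```

On the real file this would be called as count_data_rows(open("artemis_dataset_release_v0.csv", newline="", encoding="utf-8")), assuming the file is UTF-8 encoded. A freshly downloaded, never re-saved copy should report the 454684 samples mentioned above.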
I encountered an issue with the 'artemis_id' in both the 'apolo.json' file and the 'artemis_dataset_release_v0.csv' file.
According to the Dataset Preparation instructions, I downloaded the artemis_dataset_release_v0.csv file from https://www.artemisdataset.org/#dataset. However, during steps 8 and 9, I found the following issues:
The total number of data entries in the artemis_dataset_release_v0.csv file is 454,677, but the artemis_id in the apolo/artemis_index/train_index.json file contains IDs that exceed this range, such as 454,682 and 454,681. Could this be a mistake, or is there an explanation for why these IDs are out of range?
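The range problem can be reproduced with a small check like the one below. It is only a sketch: it assumes that artemis_id is meant to be a zero-based row index into the CSV and that the IDs have already been loaded into a flat list of integers. The actual layout of train_index.json may differ, and out_of_range_ids is a hypothetical helper name.

```python
def out_of_range_ids(artemis_ids, n_rows):
    """Return the sorted IDs that cannot be used as zero-based row indices."""
    return sorted(i for i in set(artemis_ids) if not 0 <= i < n_rows)


# Toy ID list that includes the two values quoted in this issue.
ids = [0, 454676, 454681, 454682]
print(out_of_range_ids(ids, 454677))  # [454681, 454682]
print(out_of_range_ids(ids, 454684))  # []
```

With 454,677 rows both IDs fall outside the valid range, while with 454,684 rows they are valid indices.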
I have added code (highlighted in the red box in the image below) to perform the matching check. I found that nearly 30% of the data (1,728 entries) in the apolo.json file has artemis_id values that do not match their corresponding positions in the artemis_dataset_release_v0.csv file. This discrepancy has me quite confused—could you kindly help confirm if this is expected, or if there may be an issue with the ID assignment?
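In text form, the matching check is roughly the sketch below. The key names are assumptions on my part: each apolo.json entry is treated as a dict with an 'artemis_id' and a painting name, and each CSV row as a dict with a 'painting' column. find_mismatches is a hypothetical helper, not code from the repository.

```python
def find_mismatches(entries, csv_rows, id_key="artemis_id", name_key="painting"):
    """Return the IDs whose CSV row is missing or names a different painting."""
    mismatched = []
    for entry in entries:
        idx = entry[id_key]
        # Check the range first so an out-of-range ID is reported, not raised.
        if not 0 <= idx < len(csv_rows) or csv_rows[idx]["painting"] != entry[name_key]:
            mismatched.append(idx)
    return mismatched


# Toy data: ID 1 points at the wrong painting and ID 5 is out of range.
csv_rows = [{"painting": "a"}, {"painting": "b"}, {"painting": "c"}]
entries = [
    {"artemis_id": 0, "painting": "a"},
    {"artemis_id": 1, "painting": "c"},
    {"artemis_id": 5, "painting": "b"},
]
print(find_mismatches(entries, csv_rows))  # [1, 5]
```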
I would greatly appreciate any guidance on how to resolve these discrepancies.
Thank you very much for your help!