Open RodenLuo opened 5 months ago
Similarly, for valid:
8661
14553
12350
22931
Also, I could not find any of the test split IDs in the dataset.
Hello RodenLuo, I have updated the 'metadata' file here. Could you please use it?
Hi Nabin, Sorry for the late reply. Was traveling to several conferences.
The problem still exists on my side. I'm using the previously downloaded EMD folder and the new metadata file. I notice this time that there are two kinds of issues. One is, e.g., "2278" is the first in the TEST tab, but it is not inside the EMD folder. The second is, e.g., "903" is in the VALID tab, but only "0903" is in the EMD folder.
I attached the output of ls EMD > EMD_list.txt and the IDs in each split on my end for your reference.
EMD_list.txt split_valid_new.txt split_train_new.txt split_test_new.txt
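For anyone who wants to reproduce the check, here is a minimal sketch of how the comparison could be done. It assumes that EMD_list.txt and each split file contain one ID per line; the helper names read_ids and missing_ids are hypothetical and not part of the dataset tooling.

```python
def read_ids(path):
    """Read one ID per line from a text file (assumed layout), skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def missing_ids(split_ids, folder_entries):
    """Return the split IDs that have no exact match among the folder entries.

    The comparison is a plain string match, so "903" and "0903" count as
    different IDs, which is exactly the second issue described above.
    """
    present = {entry.strip() for entry in folder_entries if entry.strip()}
    cleaned = (item.strip() for item in split_ids)
    return [item for item in cleaned if item and item not in present]


# The two cases from this comment: "2278" is absent from the folder,
# and "903" is listed in the split while the folder only has "0903".
print(missing_ids(["903", "2278"], ["0903"]))
```

With the real files, the call would look like missing_ids(read_ids("split_valid_new.txt"), read_ids("EMD_list.txt")), again assuming the one-ID-per-line layout.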
Hello @RodenLuo,
The IDs in the TEST tab of the metadata.xlsx file are the IDs of test data. These test data were filtered out from the Full Dataset. The Full Dataset is used for training and validating the models. The test data files are available in another repository: https://doi.org/10.7910/DVN/2GSSC9.
The 903 shown in the VALID tab should actually be 0903. Somehow, the leading zeros were accidentally removed in the Excel sheet. Now it's fixed, ensuring that there are always four digits in the EMD-ID name. The fixed Excel sheet is available here: https://doi.org/10.7910/DVN/JMN60H.
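If someone is still working from an older copy of the sheet, the leading zeros can be restored on the user side. The following is only a sketch under the assumption that the IDs are purely numeric; normalize_emd_id is a hypothetical helper, not something shipped with the dataset.

```python
def normalize_emd_id(raw, width=4):
    """Left-pad a numeric EMD ID with zeros to at least `width` digits.

    Accepts strings or integers, since a spreadsheet reader may return either.
    IDs that are already longer than `width` (for example 14553) are left as-is.
    """
    text = str(raw).strip()
    if not text.isdigit():
        raise ValueError(f"not a numeric EMD ID: {raw!r}")
    return text.zfill(width)


print(normalize_emd_id("903"))    # padded to four digits
print(normalize_emd_id("14553"))  # five-digit IDs are unchanged
```

Padding never truncates, so the five-digit IDs that appear in the splits keep their original form.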
Hi,
I downloaded the splits info from here and the full dataset from here.
Some EMD IDs (full list below) exist in the train split but not in the dataset. Did I make a mistake, or were those removed later on?
Thanks