I have downloaded the fullband dataset and noticed that inside datasets_fullband/clean_fullband/read_speech there is a second read_speech folder, which is 117 GB. At first glance, all the files inside datasets_fullband/clean_fullband/read_speech/read_speech are already present in the parent folder: that subfolder holds 117 GB of data that seems to be absolutely identical to what is already inside datasets_fullband/clean_fullband/read_speech. This seems to be confirmed by the sha1 value inside the file provided:
Is this an error? Did a lot of duplicated data just make it to the zipped archive by mistake? Did it take the place of other data that we were supposed to receive?
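For anyone who wants to reproduce the comparison locally, here is a minimal sketch of how the check could be done. It assumes that each file in the nested read_speech folder has the same relative path as its counterpart in the parent folder, which is what I observed but have not verified for every file. The helper names `sha1_of` and `compare_nested` are made up for this example and are not part of the dataset tooling.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_nested(outer, nested_name="read_speech"):
    """Compare files in outer/nested_name with files in outer.

    Assumption: a duplicate has the same relative path in both folders.
    Returns three sorted lists of relative paths:
    identical content, different content, missing from the parent folder.
    """
    outer = Path(outer)
    inner = outer / nested_name
    identical, different, missing = [], [], []
    for inner_file in sorted(p for p in inner.rglob("*") if p.is_file()):
        rel = inner_file.relative_to(inner)
        outer_file = outer / rel
        if not outer_file.is_file():
            missing.append(rel.as_posix())
        elif sha1_of(inner_file) == sha1_of(outer_file):
            identical.append(rel.as_posix())
        else:
            different.append(rel.as_posix())
    return identical, different, missing


if __name__ == "__main__":
    same, diff, lost = compare_nested(
        "datasets_fullband/clean_fullband/read_speech"
    )
    print(f"identical: {len(same)}, different: {len(diff)}, missing: {len(lost)}")
```

If the nested folder really is a pure duplicate, the second and third counts should both be zero. Hashing 117 GB twice is slow, so comparing file sizes first would be a reasonable shortcut.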