Open ap0nia opened 1 year ago
Dataset 148 currently only links to a zip archive with no data but one empty folder called Graphics. The downloaded archive is only 116 bytes in size.
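For anyone who wants to check other downloads for the same problem, here is a minimal Python sketch that tests whether a zip archive holds any actual data, i.e. at least one non-empty regular file rather than only directory entries such as Graphics. The function name has_data_files and the file path in the usage comment are made up for illustration; only the standard-library zipfile module is used.

```python
import io
import zipfile


def has_data_files(zip_bytes: bytes) -> bool:
    """Return True if the archive holds at least one non-empty regular file."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Directory entries (names ending in "/") and zero-byte files do not count.
        return any(
            not info.is_dir() and info.file_size > 0
            for info in zf.infolist()
        )


# Usage (hypothetical local path to an already downloaded archive):
# with open("statlog+shuttle.zip", "rb") as fh:
#     print(has_data_files(fh.read()))
```

An archive that contains nothing but an empty Graphics folder should come back as False under this check.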
Thank you for informing us. This should be rectified now. I can find 4 files in the downloaded zip file. Please let us know if this isn't the case for you, thanks!
Link to the dataset page: https://archive.ics.uci.edu/dataset/148/statlog+shuttle
Hey, thanks so much for the fast response! Yes, all 4 files are there.
I wonder if it is intentional that the training data is still compressed after unzipping the downloaded archive while the test data is not? One can get the original data by running uncompress shuttle.trn.Z on any Unix; not sure about Windows users.
Edit. Ah, just saw that the index file also lists the training data as a compressed file, disregard then :)
Same issue with Census Income (#20)—the zip only contains a "Graphics" folder
Hi Markelle, the abstract of the Census Income dataset says that it is the same as the Adult dataset. We can either copy the Adult files to the Census Income dataset, or remove Census Income altogether. How should we handle this?
Since this dataset is well-known under both names, let's have the data available under both for now (i.e., go ahead and copy the Adult files)—we can discuss combining the two later. thanks!
Dataset 341 is also missing: https://archive.ics.uci.edu/dataset/341/smartphone+based+recognition+of+human+activities+and+postural+transitions
@maxxu05 Fixed, thanks for letting us know.
There is missing data from Dataset 301 "Parkinson Speech Dataset with Multiple Types of Sound Recordings": https://archive.ics.uci.edu/dataset/301/parkinson+speech+dataset+with+multiple+types+of+sound+recordings
It used to include a .rar file that contained the audio files (~20 MB), but now it only includes a couple of text files. For example, this snapshot from 2015 shows the full dataset: https://web.archive.org/web/20150208025709/http://archive.ics.uci.edu/ml/machine-learning-databases/00301/
Dataset 28 - Japanese Credit Screening at https://archive.ics.uci.edu/dataset/28/japanese+credit+screening appears to be missing the dataset, the download contains only an empty Graphics folder.
Dataset 84 [Prodigy] currently only links to a zip archive with no data but one empty folder called Graphics.
Dataset 157 [Dodgers Loop Sensor] currently only links to a zip archive with no data, just one folder called Graphics containing two images (the images for the dataset).
just to mention that the file https://archive.ics.uci.edu/static/public/156/calit2+building+people+counts.zip contains 6 files. I think two of them belong to [Dodgers Loop Sensor] dataset, which are:
Dataset 75 [Musk (Version 2)] currently only links to a zip archive with no data but one empty folder called Graphics.
Just to mention that the file https://archive.ics.uci.edu/static/public/74/musk+version+1.zip contains 7 files. I think three of them belong to [Musk (Version 2)] dataset, which are:
Dataset 91 [Soybean (Small)] currently only links to a zip archive with no data but one empty folder called Graphics.
just to mention that the file https://archive.ics.uci.edu/static/public/90/soybean+large.zip contains 12 files. I think two of them belong to [Soybean (Small)] dataset, which are:
Dataset 96 [SPECTF Heart] currently only links to a zip archive with no data but one empty folder called Graphics.
just to mention that the file https://archive.ics.uci.edu/static/public/95/spect+heart.zip contains 8 files. I think two of them belong to [SPECTF Heart] dataset, which are:
Missing on your side:
https://archive.ics.uci.edu/static/public/143/statlog+australian+credit+approval.zip
https://archive.ics.uci.edu/static/public/145/statlog+heart.zip
https://archive.ics.uci.edu/static/public/146/statlog+landsat+satellite.zip
https://archive.ics.uci.edu/static/public/149/statlog+vehicle+silhouettes.zip
https://archive.ics.uci.edu/static/public/100/teaching+assistant+evaluation.zip
https://archive.ics.uci.edu/static/public/150/connectionist+bench+nettalk+corpus.zip
https://archive.ics.uci.edu/static/public/154/protein+data.zip
Another question, please: the website currently contains 657 datasets, but the dataset IDs go up to 892. Are there private datasets?
When datasets are donated, they have to be approved by admins. There are currently 657 approved datasets, and 892 datasets in total including pending & rejected datasets.
Hello, dataset 613, Smartphone Dataset for Anomaly Detection in Crowds, is also missing.
Thanks.
Also missing: Connectionist Bench (Sonar, Mines vs. Rocks)
We used to have the PIMA Indians dataset (many other websites, e.g., Kaggle, attribute it to us); not sure what happened to it.
@markellekelly The owners of the PIMA dataset replaced the files with a note.txt that says "Thank you for your interest in the Pima Indians Diabetes dataset. The dataset is no longer available due to permission restrictions."
I also cannot access my dataset and get "DatasetNotFoundError: Error reading data csv file for "Cirrhosis Patient Survival Prediction" dataset (id=878)."
A list of dataset files we believe are missing. Will be updated as they're reported / found. Feel free to comment to report additional ones.
Graphics folder in its zip file.
Located with the original breast-cancer-wisconsin dataset files prefixed with wdbc.