ageron / handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Apache License 2.0
28.04k stars, 12.81k forks

Datasets in Chapter 15 Exercises are not found #100

Open uzi0espil opened 4 years ago

uzi0espil commented 4 years ago

The datasets mentioned in questions 9 and 10 of chapter 15 cannot be found. For SketchRNN, there are four variants of the dataset, but tensorflow-datasets has only the quickdraw_bitmap dataset, which I think is more suitable for image classification than for sequence processing. Also, the link in question 10 (https://homl.info/bach) returns a 404 (page not found).

lord8266 commented 4 years ago

The Bach chorales dataset is here: https://github.com/ageron/handson-ml2/tree/master/datasets/jsb_chorales
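Once the chorales are downloaded and extracted, loading them might look like the minimal sketch below. It assumes (this is not confirmed in the thread) that each chorale is stored as a CSV file with one header row followed by one row of four integer notes per time step. The function names `load_chorale` and `load_chorales` and the folder path in the usage comment are hypothetical.

```python
import csv
from pathlib import Path


def load_chorale(csv_path):
    """Read one chorale CSV file into a list of time steps.

    Assumption: the first row is a header and every following row
    holds integer note values (one per voice).
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the assumed header row
        return [[int(value) for value in row] for row in reader if row]


def load_chorales(folder):
    """Load every *.csv file in a folder, sorted by file name."""
    return [load_chorale(path) for path in sorted(Path(folder).glob("*.csv"))]


# Hypothetical usage, assuming the archive was extracted to ./jsb_chorales:
# train_chorales = load_chorales("jsb_chorales/train")
```

Each chorale comes back as a list of rows, so chorales of different lengths can be kept side by side before any padding or windowing.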

uzi0espil commented 4 years ago

@lord8266 Thank you

andrewy12 commented 4 years ago

Is the quickdraw_bitmap dataset supposed to be 36 GB? I can't fit that on my computer.

ageron commented 4 years ago

Hi @uzi0espil,

Thanks for your question. I'm sorry about the SketchRNN dataset issue. About a year ago, there was a TFDS pull request that seemed ready to merge, and the discussions were looking good, so I assumed it would be included within a matter of weeks and decided to mention it in the book. Unfortunately, this PR hasn't been merged yet. I should have double-checked before the book came out (I usually wrote a TODO:CHECK note for myself every time I wrote about something that was supposed to be released later, but in this case I forgot, probably because it was in an exercise).

Anyway, the good news is that the dataset is available in the convenient TFRecord format. I uploaded the solutions to the exercises in chapter 15, in case you want to take a look.
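To peek inside a TFRecord file without loading TensorFlow, a rough sketch like the one below may help. It assumes the standard TFRecord framing: an 8-byte little-endian payload length, a 4-byte checksum of that length, the payload, then a 4-byte checksum of the payload. The function name `iter_tfrecord_payloads` is hypothetical, and the checksums are skipped rather than verified. For actual training, `tf.data.TFRecordDataset` remains the usual way to read these files.

```python
import struct


def iter_tfrecord_payloads(path):
    """Yield the raw payload bytes of each record in a TFRecord file.

    Sketch only: the two CRC fields are read and ignored, so corrupted
    files are not detected (apart from truncation).
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # 4-byte payload CRC, not checked here
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record")
            yield payload


# Hypothetical usage: count the records in a local file.
# n_records = sum(1 for _ in iter_tfrecord_payloads("some_file.tfrecord"))
```

Each payload is typically a serialized protocol buffer, so decoding the fields still requires the matching schema.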

I'll leave this issue open until the SketchRNN dataset is available in TFDS... hopefully one day! ;-)