ageron / handson-ml2

A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
Apache License 2.0
27.8k stars 12.74k forks

The URL to download the housing data set from Chapter 2 doesn't work #590

Open Yuvraj102 opened 2 years ago

Yuvraj102 commented 2 years ago

I am following the second edition of this book. Right now I am on page 46, in Chapter 2 (end_to_end_machine_learning_project), where the variable is defined as DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/". This URL returns an invalid response, and I do not get any tgz file as the book claims.

ageron commented 2 years ago

Hi @Yuvraj102 ,

Thanks for your feedback. That's just the URL's root directory (the full URL was too long to fit on a single line). The full URL is defined like this:

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
[...]
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

So the full URL is: https://raw.githubusercontent.com/ageron/handson-ml2/master/datasets/housing/housing.tgz

If you run the whole code, it should work fine. The point of this paragraph is to explain that you don't need to download things manually: you can write functions that do it for you. This becomes useful if you need to download the data multiple times, e.g., on several different machines, or on different dates (to get updated data).
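
For illustration, such a download function could look roughly like the sketch below. It reuses DOWNLOAD_ROOT and HOUSING_URL as defined above; the HOUSING_PATH directory, the function name fetch_housing_data and its parameters are assumptions made for this example, and the notebook's actual code may differ in detail.

```python
import os
import tarfile
import urllib.request

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"
# Assumed local target directory for this sketch.
HOUSING_PATH = os.path.join("datasets", "housing")


def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    """Download the housing archive and extract it into housing_path."""
    # Create the target directory if it does not exist yet.
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    # Fetch the full URL (root + relative path), not just the root.
    urllib.request.urlretrieve(housing_url, tgz_path)
    # Unpack the archive next to the downloaded file.
    with tarfile.open(tgz_path) as housing_tgz:
        housing_tgz.extractall(path=housing_path)
```

Calling fetch_housing_data() with no arguments would download housing.tgz into datasets/housing and extract it there. Calling it again downloads the archive again, which is how you would pick up updated data on another machine or at a later date.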

Hope this helps