amueller / mglearn

mglearn helper package for "Introduction to Machine Learning with Python"

No datasets in the repository #1

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi, there are no datasets in the repository, and all the datasets loaded in the package seem to be local to your machine.

amueller commented 7 years ago

I added the bike and ram datasets to the repo with the notebooks: https://github.com/amueller/introduction_to_ml_with_python Do you still have problems with them?

amueller commented 7 years ago

Ah, I guess I should add them here too, though that wasn't really the plan... maybe I shouldn't have made this repo in the first place... hum....

ghost commented 7 years ago

I have no idea which decision is best, but users cannot run the mglearn notebooks properly without their datasets.

Chapters 4 and 6 need datasets that are not yet public in your repo.

BTW, mglearn is excellent work. Thanks for sharing it.

Best regards, Behrooz

On Oct 19, 2016, at 4:43 PM, Andreas Mueller notifications@github.com wrote:

Ah, I guess I should add them here too, though that wasn't really the plan... maybe I shouldn't have made this repo in the first place... hum....


amueller commented 7 years ago

which datasets are not included yet?

ghost commented 7 years ago

all the data you used in chapter 4 and chapter 6 of your notebooks here https://github.com/amueller/introduction_to_ml_with_python

amueller commented 7 years ago

oh damn, I made a mistake when pushing the ram prices and citibike. give me a second.

amueller commented 7 years ago

ram prices and citibike are there now.

amueller commented 7 years ago

and adult now, too

amueller commented 7 years ago

anything else missing?

ghost commented 7 years ago

yes, some of the datasets you used are on your local machine, for example: reviews_train = load_files("data/aclImdb/train/") here: https://github.com/amueller/introduction_to_ml_with_python/blob/master/07-working-with-text-data.ipynb

amueller commented 7 years ago

clearly not "all datasets", because I just told you that I fixed the three mentioned above. Unfortunately my processing of the notebook ate the instructions on how to download this dataset. It's quite big, so I can't add it to the repository. I'll add a note to the readme on how to download it.
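For anyone hitting the missing-directory problem in the meantime, here is a minimal sketch of a guard around the `load_files` call quoted above. The helper name `load_imdb_reviews` and the wording of the error message are hypothetical, not part of mglearn or the notebooks. It only assumes the dataset has been unpacked by hand to `data/aclImdb/train/`, as in the notebook.

```python
from pathlib import Path


def load_imdb_reviews(data_dir="data/aclImdb/train/"):
    """Hypothetical helper: load the reviews, failing early if the data is missing."""
    path = Path(data_dir)
    if not path.is_dir():
        # The dataset is not shipped with the repository, so point to the readme.
        raise FileNotFoundError(
            f"{path} not found: download the dataset first (see the readme)."
        )
    # Imported lazily so the directory check works even without scikit-learn.
    from sklearn.datasets import load_files

    return load_files(str(path))
```

With the data in place, `reviews_train = load_imdb_reviews()` should behave like the original notebook line, returning an object with `data` and `target` attributes. Without the data, it fails with a message that names the missing path instead of a less obvious error further down.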

amueller commented 7 years ago

ok, added the link to the readme. Any more?

ghost commented 7 years ago

Not yet, but I will report anything else I find as I work through the chapters.

ghost commented 7 years ago

Thanks a lot for your kindness and attention

amueller commented 7 years ago

should be good now.