gimseng / 99-ML-Learning-Projects

A list of 99 machine learning projects for anyone interested to learn from coding and building projects
MIT License

Malaria soln and data #92

Open AjayKhalsa opened 3 years ago

AjayKhalsa commented 3 years ago

I have added the data and VGG19 solution and will start creating the exercise asap

gimseng commented 3 years ago

Link to #90.

Hi @AjayKhalsa, thanks for the PR. I'll review it when I have time later this week.

AjayKhalsa commented 3 years ago

Hi @gimseng, I messed up my malaria branch and accidentally deleted the files while working on the employee attrition exercise. How do I revert it?

gimseng commented 3 years ago

@AjayKhalsa Usually I try to revert it (although it's not easy in the GitHub interface). Which files specifically? If they are small, I can sometimes look at the history/blame on GitHub and retrieve the latest version. Failing that, the cheap way is to download the files and re-upload them, though I'm sure there's a git way to revert to a particular point in history, provided those are the only files that have changed.
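
For reference, a minimal sketch of the git route mentioned above, assuming the files were deleted in a commit on the current branch and that `git` is on the PATH. It asks git for the last commit that touched the path (for a deleted file, that is the deleting commit) and checks the path out from that commit's parent. The helper names `last_commit_touching`, `restore_command`, and `restore_deleted` are made up for illustration; only the underlying `git rev-list` and `git checkout` commands are real.

```python
import subprocess


def last_commit_touching(path, repo="."):
    """Return the hash of the most recent commit on HEAD that touched `path`.

    For a file that has been deleted, this is the commit that deleted it.
    Returns an empty string if no commit touches the path.
    """
    result = subprocess.run(
        ["git", "rev-list", "-n", "1", "HEAD", "--", path],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def restore_command(commit, path):
    """Build the git command that checks out `path` as it was just before `commit`."""
    # `<commit>^` names the parent of the commit, i.e. the last state
    # in which the file still existed.
    return ["git", "checkout", f"{commit}^", "--", path]


def restore_deleted(path, repo="."):
    """Restore a deleted file or folder into the working tree (hypothetical helper)."""
    commit = last_commit_touching(path, repo)
    if not commit:
        raise FileNotFoundError(f"no commit in history touches {path!r}")
    subprocess.run(restore_command(commit, path), cwd=repo, check=True)
```

Calling `restore_deleted("path/to/deleted/folder")` from inside the branch would bring the files back as staged changes, ready to be committed again. This only helps if the deletion was committed; uncommitted deletions can be undone with a plain `git checkout -- <path>`.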

AjayKhalsa commented 3 years ago

Yeah, I think it's fine now, just check it out. I named the malaria project 008 and the employee attrition project 009 (on different branches).

gimseng commented 3 years ago

@AjayKhalsa Just a heads-up, I'll merge the employee attrition PR #99 and rename its folder to 007. I hope that won't conflict with your stuff.

AjayKhalsa commented 3 years ago

@gimseng Do it. I have a copy of this project, so it won't be an issue.

gimseng commented 3 years ago

@AjayKhalsa Thanks! Please let me know when you are done with this PR, then I can review and merge.

AjayKhalsa commented 3 years ago

@gimseng I've updated my model and readme. Let me know if any changes are needed.

gimseng commented 3 years ago

Hi @AjayKhalsa, thanks! I'm having trouble finding the data in the data folder. It seems to contain only the readme file. Am I missing something in the code? (Maybe you have a helper function to download it from somewhere?)

Otherwise, at first glance, it looks great! Once the data is loaded, I just need to run through the notebook to make sure there are no errors. Thanks!

AjayKhalsa commented 3 years ago

@gimseng It's an image dataset, so I thought I would just add the Kaggle dataset link so that users can download it directly.

gimseng commented 3 years ago

@AjayKhalsa I see. So, if I were to run your Jupyter notebooks on Google Colab, they won't just run? I have to manually download all the images from Kaggle, then upload them to Google Colab, is that right?

I prefer a self-contained Google Colab notebook. Is it possible to have a helper function that runs before the main notebook and downloads the files to a local folder?

How big is the dataset? Is the worry that it is too big to upload to the repo's data folder?

AjayKhalsa commented 3 years ago

It's a 300 MB dataset. If you want, I can upload it.

No, you don't need to do all that. You can just copy the API command [kaggle datasets download -d iarunava/cell-images-for-detecting-malaria]. See here: https://colab.research.google.com/drive/1JfyoNn4dva5XO3aMtCoF9SWMmy60aeq4?usp=sharing

You can use the Kaggle API to download the dataset directly within the notebook, and it's much faster.
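
A minimal sketch of what such an in-notebook download cell might look like, assuming the `kaggle` command-line tool is installed and Kaggle API credentials are already configured (for example a `kaggle.json` file under `~/.kaggle/`). The function names, the `data` destination folder, and the rule of skipping the download when the folder already holds something other than a readme are assumptions for illustration, not the contents of the linked Colab notebook.

```python
import subprocess
from pathlib import Path

# Dataset slug taken from the API command quoted above.
DATASET = "iarunava/cell-images-for-detecting-malaria"


def kaggle_download_command(dataset, dest):
    """Build the Kaggle CLI call that downloads `dataset` into `dest` and unzips it."""
    return ["kaggle", "datasets", "download", "-d", dataset,
            "-p", str(dest), "--unzip"]


def download_dataset(dataset=DATASET, dest="data"):
    """Download the dataset unless `dest` already contains files besides a readme."""
    dest = Path(dest)
    # Assumption: a folder holding only a readme counts as "no data yet".
    if dest.is_dir() and any(
        not p.name.lower().startswith("readme") for p in dest.iterdir()
    ):
        return dest
    dest.mkdir(parents=True, exist_ok=True)
    subprocess.run(kaggle_download_command(dataset, dest), check=True)
    return dest
```

Running `download_dataset()` as the first cell would leave the images under `data/`, so the rest of the notebook can read them from a local path. Re-running the cell is cheap, since the download is skipped once the folder is populated.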

gimseng commented 3 years ago

@AjayKhalsa Thanks for the colab notebook file, that's exactly what I wanted.

  1. Perhaps the exercise folder should also include the part of the Colab notebook that helps learners download the data.

  2. I forget whether the current solution notebooks already contain this data-download step. If not, please incorporate it so that learners can just run the notebook and not worry about dealing with Kaggle.

AjayKhalsa commented 3 years ago

Yeah, so I can make a template in the readme so that everyone can use it, and I'll update the existing notebooks.

gimseng commented 3 years ago

@AjayKhalsa Excellent! Looking forward to the final update. Just ping me here when you are done. Thanks!