pratik-choudhari / AlgoCode

Welcome everyone!🌟 Here you can solve problems, build scrapers and much more 💻
https://github.com/pratik-choudhari/AlgoCode
MIT License
131 stars 166 forks

Calling out for Data science people! 💻📈 #162

Open pratik-choudhari opened 4 years ago

pratik-choudhari commented 4 years ago

Current situation

So far, this repository has been accepting contributions that solve coding problems, which is going pretty well.

So now what?

Now we are accepting contributions of beautiful and insightful EDAs on a dataset of your choice. Yes! A dataset of your choice. The rules for EDA contribution:

You may leave any queries below

Mrsterius commented 4 years ago

Hi Pratik, I would like to add an EDA for the Boston Housing dataset. Can I work on this?

pratik-choudhari commented 4 years ago

@Mrsterius Yes

Mrsterius commented 4 years ago

Hey Pratik, I have created an EDA for the Ames Advanced house prices (regression) dataset, since column names were not mentioned for the Boston one. What do you want me to add in the README? Do I need to upload the dataset as well, or can I just give its link in the README?

Mrsterius commented 4 years ago

Submitted. Please let me know if there are any changes to be made.

PrattJena commented 4 years ago

Hey Pratik, I would like to add an EDA of dogs vs cats. Can I work on it?

pratik-choudhari commented 4 years ago

@PrattJena Which cats and dogs dataset? The image dataset, right?

PrattJena commented 4 years ago

@pratik-choudhari Yes the image classification dataset

pratik-choudhari commented 4 years ago

@PrattJena Could you elaborate on what will be in the EDA?

PrattJena commented 4 years ago

Actually, now that I think about it, it's much better suited for the projects folder, as the neural network determines whether the image is of a cat or a dog.

pratik-choudhari commented 4 years ago

Yep so that doesn't count as EDA.

rajpratyush commented 4 years ago

EDA for the Fashion MNIST dataset and the CIFAR-10 dataset

pratik-choudhari commented 4 years ago

@rajpratyush What will be the contents of your EDA?

rajpratyush commented 4 years ago

I have done several assignments on these two datasets while learning through a MOOC on the Coursera platform. I would like to share those.

pratik-choudhari commented 4 years ago

@rajpratyush EDA on an image dataset, won't it be vague? On a dataset like CIFAR-10, the most one could get from EDA is the number of images in each category. Correct me if there is anything to add to this.
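
For illustration, the per-category count mentioned above could be sketched roughly as follows. This is only a minimal sketch: the `labels` list is made-up example data (not the real CIFAR-10 labels), and `class_counts` is a hypothetical helper name.

```python
from collections import Counter

# Made-up labels for illustration only; in CIFAR-10 the labels are integers 0-9.
labels = [0, 1, 1, 2, 2, 2, 9]


def class_counts(labels):
    """Return a dict mapping each class label to how many images carry it."""
    return dict(Counter(labels))


counts = class_counts(labels)

# Print one line per class, ordered by label.
for label, n in sorted(counts.items()):
    print(f"class {label}: {n} images")
```

With a real dataset, the same function would be applied to the label array that ships with it.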

rajpratyush commented 4 years ago

I agree. Then how about an EDA on a dataset of toxic words? I remember there was a Kaggle competition on this, and I had a successful contribution in it.

pratik-choudhari commented 4 years ago

@rajpratyush Yes that will suffice.

rajpratyush commented 4 years ago

@pratik-choudhari btw you have assigned me this issue yet https://github.com/pratik-choudhari/AlgoCode/issues/94#issue-714136213

pratik-choudhari commented 4 years ago

@drashtipatel2503 Sure! Put the link and name of the dataset on issue #161

sanskritip commented 4 years ago

Hey Pratik! Can I add an EDA for the StackOverflow Developer 2019 dataset? I will be adding the CSV files, the data cleaning Jupyter notebook and, additionally, the visualisation notebook.

pratik-choudhari commented 4 years ago

@sanskritip Sure, but don't include the CSV files, as they will increase the repo size. Instead, put a link to the dataset.

sanskritip commented 4 years ago

Awesome, sounds great!