DistrictDataLabs / yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.
http://www.scikit-yb.org/
Apache License 2.0

Write a blog post on a machine learning project highlighting Yellowbrick. (GSoC-2019) #691

Closed wagner2010 closed 5 years ago

wagner2010 commented 5 years ago

Write a blogpost highlighting Yellowbrick, using a dataset of your choice. Please begin by reviewing our QuickStart guide (http://www.scikit-yb.org/en/latest/quickstart.html), completing the walkthrough (http://www.scikit-yb.org/en/latest/quickstart.html#walkthrough), working through our Model Selection tutorial (http://www.scikit-yb.org/en/latest/tutorial.html), and reviewing our Contributor section (http://www.scikit-yb.org/en/latest/contributing.html). Some good sites for data include Data.Gov (data.gov), UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/index.php), and Kaggle (https://www.kaggle.com). Run through a machine learning project using Jupyter Notebooks (data ingestion, storage, wrangling, statistical computation, model selection, machine learning with Yellowbrick and visualization with Yellowbrick). Use this work to formulate your blogpost and work with the Yellowbrick team to get it reviewed and published in the forum of your choice.

ndanielsen commented 5 years ago

I've been seeing a lot of good posts on this site: https://towardsdatascience.com/

It might be a great place to submit?

wagner2010 commented 5 years ago

Absolutely, @ndanielsen. Additionally, an accompanying YouTube video blog or even a podcast would be awesome as well!

dnabanita7 commented 5 years ago

Can I be assigned this issue?

bbengfort commented 5 years ago

@Naba7 that would be great - looking forward to seeing a draft!

wagner2010 commented 5 years ago

@Naba7 thanks for taking an interest. As @bbengfort said, we would love to see a draft when you have it ready. Out of curiosity, are you participating or applying to participate in Google's Summer of Code (GSoC) program? This is certainly not restricted to those who are participating in GSoC; I am just wondering if you're looking to participate or if you just have an interest in writing a blogpost for Yellowbrick. Cheers!

dnabanita7 commented 5 years ago

I am participating in GSoC 2019.

wagner2010 commented 5 years ago

Very cool. Thanks for reaching out. We don't have specific information on GSoC at this time but we definitely will after the process unfolds. Let's keep in touch and we're happy to check out a draft or hear about your ideas for a posting on a general basis.

richardjgowers commented 5 years ago

@wagner2010 Cool project! I've done GSoC for a few years now, and I think the project has to be code-based, not documentation. A code-based blogpost is obviously borderline, but I'd double-check this.

dnabanita7 commented 5 years ago

I have made a draft and am sharing it below. Please check it out, help me work through any errors, and let me know if I left anything out. Here is the link: https://github.com/Naba7/NYPD_Hunchlab

wagner2010 commented 5 years ago

Hi Naba, thank you for your work. The team and I don't really have the bandwidth to take up blogpost drafts at the moment. I started this issue so that it could be included on a list of issues for our GSoC proposal (for this summer), which is still going through the mentor organization application process. Blogposts are a low priority for us right now, as we're trying to push forward on a number of higher-priority issues that will propel us towards our next version bump (release). As you know, GSoC hasn't started yet, and as I encouraged you a few weeks ago, I encourage you to go through the GSoC student application process. As a mentor organization, we want a chance to go through that application process ourselves, as it is still unfolding. As for this draft, since we aren't able to review it for edits right now, you have two options: 1.) You can publish it yourself and let us know; we are most happy to Tweet/promote the blogpost. 2.) If you want it published on the District Data Labs blog site, you would need to get in touch with Tony Ojeda and work it out with him. And with that, I'm closing this issue for now.

dnabanita7 commented 5 years ago

Okay! Thanks a lot

On Wed 13 Feb, 2019, 9:09 AM wagner2010 <notifications@github.com> wrote:

Closed #691 https://github.com/DistrictDataLabs/yellowbrick/issues/691.


Yogayu commented 5 years ago

Dear Mentors @rebeccabilbro @bbengfort @lwgray @ndanielsen @pdamodaran @wagner2010,

I plan to work on "Allow ModelVisualizer to wrap pipeline objects" and on this idea for GSoC, so I'd like to describe how I am going to do this.

Idea: write blog posts about Data Science (including EDA and ML) by highlighting the usage of Yellowbrick.

Writing a blog is a great way to help users make better use of Yellowbrick, extend the project's influence, and increase community activity. In addition, I plan to translate the documentation into Chinese.

Process

I will write a blog post highlighting Yellowbrick for a machine learning project. The basic process is as follows:

  1. Define the Goal and Audience for a post
  2. Design the machine learning task
  3. Choose the dataset
  4. Explore and design the data science pipeline: problem statement, hypothesis, ingestion, storage, wrangling, statistical exploration, model selection, machine learning, and visualization.
  5. Question and reflection
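The pipeline stages in step 4 can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy CSV data, the two candidate models, and every function name below are hypothetical and are not part of Yellowbrick or of any existing draft.

```python
import csv
import io
import statistics

# Hypothetical toy data standing in for a real dataset (e.g. from UCI or Kaggle).
RAW_CSV = """feature,label
1.0,0
1.5,0
,1
3.0,1
3.5,1
2.0,0
4.0,1
0.5,0
"""


def ingest(text):
    """Ingestion: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def wrangle(rows):
    """Wrangling: drop rows with missing values and cast to numeric types."""
    clean = []
    for row in rows:
        if row["feature"] == "" or row["label"] == "":
            continue
        clean.append((float(row["feature"]), int(row["label"])))
    return clean


def explore(data):
    """Statistical exploration: mean feature value per class label."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    return {y: statistics.mean(xs) for y, xs in by_label.items()}


def majority_model(train):
    """Baseline candidate: always predict the most common training label."""
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top


def threshold_model(train):
    """Candidate: predict 1 when the feature exceeds the midpoint of the class means."""
    means = explore(train)
    cut = (means[0] + means[1]) / 2
    return lambda x: int(x > cut)


def accuracy(model, test):
    """Fraction of held-out examples the model labels correctly."""
    return sum(model(x) == y for x, y in test) / len(test)


def select_model(train, test):
    """Model selection: fit each candidate and compare held-out accuracy."""
    candidates = {"majority": majority_model, "threshold": threshold_model}
    scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


data = wrangle(ingest(RAW_CSV))
train, test = data[:5], data[5:]
best, scores = select_model(train, test)
print(best, scores)
```

In an actual post, the candidates would be real estimators and the printed scores would be replaced by visual diagnostics, which is where Yellowbrick comes in.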

Theme

The themes I am going to cover are based on the document:

Platform

I'd like to discuss which platform we publish on. I already have a blog (http://data2art.com, with a WeChat Official Account), which is one choice. So we can choose to publish on the District Data Labs blog site, the Medium platform, or my blog. Of course, my blog is the most convenient for me, and customizing it is also fast. By the way, it's worth considering that Medium can't be accessed in China without a VPN, like Google sites.

Translate the documents into Chinese

Besides, I plan to translate the documents into Chinese, which will greatly help increase the influence of the project in the Chinese community. I have already created a PR, "Fix some mistakes in quickstart.rst file and add Chinese translation to the tutorial.rst", which has been merged.

We all know that there is little time left before GSoC proposals must be submitted, and I understand that you are not available to respond in detail. So I will put these ideas in my proposal and submit it. I hope that I will have a chance to discuss and work with you later.


Best wishes, Xinyu You