pydatadelhi / talks

Talks at PyData Delhi Meetups

[Proposal]: Exploratory Data Analysis #68

Open pranavsuri opened 6 years ago

pranavsuri commented 6 years ago
shagunsodhani commented 6 years ago

Hey @pranavsuri, thanks for the proposal. What datasets do you plan on using for the talk? From the current proposal, the talk seems to be more about ggplot and less about EDA (which is fine imo, we would just have to reword it a bit). Maybe you could add a short section on how ggplot compares with alternatives.

I haven't used ggplot much, but I think it is closely tied to pandas, so it might be a good idea to keep a buffer of, say, 5 minutes when you plan the outline of the talk in case someone in the audience is not familiar with it. I am not saying that you must talk about pandas, but it would help to keep this possibility in mind while planning the time-wise breakup of the talk.

Looking forward to the slides.

pranavsuri commented 6 years ago

Hi, @shagunsodhani! I plan to use two popular datasets – Wine Quality Dataset & the Enron Email Dataset (a curated version of the original dataset). The former would be used for most of the talk while the latter would be used to show 'outlier removal.'
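For readers unfamiliar with outlier removal, here is a minimal sketch in Python. The thread does not say which method the talk uses, so the 1.5 × IQR rule below is an assumption, and the function name `remove_outliers_iqr` and the sample numbers are hypothetical (not taken from either dataset).

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule is used here for illustration only;
    the talk may rely on a different outlier criterion.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up sample with one obviously extreme value
sample = [10, 12, 11, 13, 12, 11, 10, 14, 13, 500]
cleaned = remove_outliers_iqr(sample)
print(cleaned)
```

Whether a flagged point should actually be dropped is a judgment call; the rule only marks candidates worth a closer look.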

You might be right about the rewording. I plan to talk more about EDA and less about the tool one intends to use (be it ggplot2 in R or matplotlib/seaborn in Python). I don't want to make it syntax-focused, since when creating visualizations most people search online or refer to the documentation to get the right plot anyway.

ggplot2 is a library in R, and I agree that keeping a buffer might be necessary. But as I said, I don't want to center the talk on the syntax for creating plots; it's more about how to use plots to understand the data.
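As a tool-agnostic illustration of reading a distribution from a plot, here is a rough sketch that bins values and prints a text histogram using only the Python standard library. The helper `text_histogram` and the scores are hypothetical, made up for this example rather than drawn from the talk's datasets.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return sorted (bin_start, count) pairs for integer values."""
    # Map each value to the start of its bin, then count members per bin
    counts = Counter((v // bin_width) * bin_width for v in values)
    return sorted(counts.items())


# Made-up integer scores, loosely shaped like a quality rating
scores = [3, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 8]

for start, count in text_histogram(scores, bin_width=2):
    # One '#' per observation in the bin
    print(f"{start}-{start + 2 - 1}: {'#' * count}")
```

Even this crude view shows where the values cluster and how thin the tails are, which is the kind of question a histogram in ggplot2, matplotlib, or seaborn would answer.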

I'll prepare the slides in a day (two at most). Meanwhile, you can read the notebooks or a blog post of mine, which covers part of the talk.

shagunsodhani commented 6 years ago

Ahh, my apologies, I thought you planned on using ggplot (the Python wrapper) and not ggplot2 (the R library). I am not sure how many people would know R syntax, so even with minimal syntax it might be very new for people. I will let @manojpandey pitch in on this so that you can tweak your slides/content accordingly.

Dawny33 commented 6 years ago

Nice. We always need an EDA talk. :D

It would be great if you could use the language (R in your case) just as a medium for explaining concepts, instead of making the talk totally language/framework-dependent.

pranavsuri commented 6 years ago

Hi @Dawny33! That's how I've planned the whole talk. 😀

I'll start making the slides on Google Slides and share the link here. I think the discussion and feedback could proceed better then. While I prepare the slides, we can all continue the existing conversation.

manojpandey commented 6 years ago

Hi @pranavsuri - any update on slides? We're having the meetup this Saturday!

pranavsuri commented 6 years ago

Hi @manojpandey! Sorry for the late reply. I'll complete the slides by Thursday afternoon for sure. You can view the slides at this link. Let me know if I should make any changes.

pranavsuri commented 6 years ago

Update: I have completed the slides. You can view them at this link.

manojpandey commented 6 years ago

Hey @pranavsuri - LGTM, will you also show the code with the examples?

btw, the meetup is tomorrow at 11pm, and this will be the first talk. Please ping me or @mananpal1997 on Telegram to confirm!

pranavsuri commented 6 years ago

Hi! I wasn't sure about including code in the slides, because each plot's function call takes around 5-6 arguments and that might be tough for the audience to grasp from a slide. That said, I can add a small section on ggplot2 to show how most of the plots were created. What do you say?

manojpandey commented 6 years ago

Ah, not in the slides, but you can show it separately. Yeah, a small primer on ggplot2 would be good too, although it's not required, as most of the folks would have JUST Python experience ;) But feel free to!

pranavsuri commented 6 years ago

Okay! 😊 Minor edits to make; can manage that. 🤘🏼