Closed john-science closed 5 years ago
I just tried working through the examples in this lecture and had several problems.
I think all the code looks good, but the example DataFrames in the lecture aren't all created before they are used, so a student can't work through the examples.
In particular, the DataFrame with "FIRST_NAME" and "LAST_NAME" columns is used in several examples but is never created.
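As a minimal sketch of the kind of setup cell that seems to be missing, something like the following would let students follow along. Only the two column names come from the lecture; the rows and the `FULL_NAME` column are invented here for illustration.

```python
import pandas as pd

# Hypothetical sample rows; only the column names are taken from the lecture.
df = pd.DataFrame({
    "FIRST_NAME": ["Ada", "Grace", "Alan"],
    "LAST_NAME": ["Lovelace", "Hopper", "Turing"],
})

# With the DataFrame defined up front, later examples can build on it,
# for instance by combining the two columns (FULL_NAME is an assumed name).
df["FULL_NAME"] = df["FIRST_NAME"] + " " + df["LAST_NAME"]
print(df)
```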
Is it weird that this lecture is Pandas AND Jupyter?
They are frequently used together.
There was a request to show re-sampling and down-sampling for Pandas DataFrames. Is that a reasonable request? Or is that just using the tools that are already in the lecture?
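For reference, a rough sketch of what a re-sampling example could look like, assuming the request is about time-indexed data and `DataFrame.resample`. The data and the `value` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical hourly measurements covering two full days (48 rows).
index = pd.date_range("2019-01-01", periods=48, freq="h")
df = pd.DataFrame({"value": [float(i) for i in range(48)]}, index=index)

# Down-sampling: collapse the hourly rows into one mean value per day.
daily = df.resample("D").mean()
print(daily)

# Up-sampling: expand the daily rows to 12-hour steps,
# carrying the last known value forward to fill the new rows.
half_daily = daily.resample("12h").ffill()
print(half_daily)
```

Since this is a single method on top of a `DatetimeIndex`, it may fit in the lecture as a short extra section rather than a new topic.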
Hi! Sorry I've been absent in this thread! Hari left last September for RD and we still haven't back-filled his position, which means we've had our hands tied up. Do you need something from me to address this? I'll help in any way I can.
Thanks, Maybelline
On Wed, Feb 20, 2019 at 7:09 PM John Stilley notifications@github.com wrote:
Closed #22 https://github.com/theJollySin/python_for_scientists/issues/22 via 10d454c https://github.com/theJollySin/python_for_scientists/commit/10d454c7f5c220b9914995f5a2cb1d6175958bd8 .
— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/theJollySin/python_for_scientists/issues/22#event-2153796695, or mute the thread https://github.com/notifications/unsubscribe-auth/AJ3TQC04caBfE67BeOWJdcEGItCb_g4Gks5vPg3igaJpZM4Et8HU .
Ah! Hello!
Sorry to hear you've been slammed. It did seem like Hari was ready to be done with that awful database.
So far all I've done is a little rearranging and I added a small section on "df.apply()".
Good timing though!
I actually just gave this lecture today. My students griped a little bit that there is a big DataFrame that jumps out of nowhere mid lecture. (The "hair color" / "eye color" DataFrame.) Someone (probably me) hacked the lecture in the past two years and we are missing the lines of code where we create that example DataFrame.
I need to go through the Git history and find that, or generate it myself, or something. I want students to be able to play around with all the examples themselves, you know? That's my main concern.
Though... I keep wanting to add more material to this. Because Pandas is so big. But... is the lecture too big already? What do you think? I spent a solid hour today and didn't get through it all. I can't decide. Maybe that's okay.
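For anyone reading along, a small sketch of the kind of thing a "df.apply()" section might show. This is not the code from the lecture; the column names, the data, and the `bmi` helper are all hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "height_cm": [150.0, 172.5, 181.0],
    "weight_kg": [50.0, 68.0, 80.0],
})


def bmi(row):
    """Compute body-mass index from one row of the DataFrame."""
    height_m = row["height_cm"] / 100
    return row["weight_kg"] / height_m ** 2


# axis=1 passes each row to the function; the default (axis=0) passes each column.
df["bmi"] = df.apply(bmi, axis=1)

# A Series has its own apply(), which works element by element.
df["height_in"] = df["height_cm"].apply(lambda cm: cm / 2.54)
print(df)
```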
Ah, gotcha. Let me review it and see if I have a solution for it.
If you're talking about the "Slicing a DataFrame" section, then you're right--that IS odd that it comes out of nowhere. It looks like data from the "client_list.csv" file. I think you're just missing:
df = pd.read_csv("client_list.csv")
Hope that helps. Come bug me if you have more questions or issues with this lecture. I'm hoping to add more to it fairly soon, but with interviews for Hari's replacement happening next week, it'll likely be more like two weeks out before I can make the improvements to this lecture.
--Maybelline
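A hedged sketch of how that setup cell could be made robust for students who don't have the file handy: load "client_list.csv" if it exists, otherwise fall back to a tiny stand-in. The real columns of the CSV aren't shown in this thread, so the column names and rows in the fallback are guesses for illustration only.

```python
from pathlib import Path

import pandas as pd

csv_path = Path("client_list.csv")

if csv_path.exists():
    # The one-line fix suggested above.
    df = pd.read_csv(csv_path)
else:
    # Stand-in data; these columns and values are assumptions,
    # not the actual contents of client_list.csv.
    df = pd.DataFrame({
        "FIRST_NAME": ["Ada", "Grace", "Alan"],
        "LAST_NAME": ["Lovelace", "Hopper", "Turing"],
        "HAIR_COLOR": ["brown", "gray", "black"],
        "EYE_COLOR": ["brown", "blue", "green"],
    })

# Either way, the slicing examples now have something to work on.
first_two = df.iloc[:2]
print(first_two)
```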
Thanks again, @maybuhleen ! I think that helps the lecture clarity a lot.
Pandas is a data analysis tool designed to corral datasets and make statistics easier. We need a solid overview of the library, but also a small, carefully selected set of topics that we can cover well in an hour.