john-science / python_for_scientists

Python Open Courseware for Scientists and Engineers
GNU General Public License v3.0
Pandas Lecture #22

Closed by john-science 5 years ago

john-science commented 9 years ago

Pandas is a data analysis library designed to corral datasets and make statistics easier. We need a solid overview of the library, but also a small, select set of topics that we can cover well in an hour.

john-science commented 5 years ago

I just tried working through the examples in this lecture and had several problems.

I think all the code looks good, but the example DataFrames in the lecture aren't all created before they are used. So, while the code is good, a student can't work through the examples themselves.

In particular, the DataFrame with "FIRST_NAME" and "LAST_NAME" columns is used in several examples but never created.
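
One way to unblock students would be to build a small stand-in DataFrame right in the notebook. This is only a sketch: the "FIRST_NAME" and "LAST_NAME" column names come from the lecture, but the rows and the derived "FULL_NAME" column are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data: only the FIRST_NAME and LAST_NAME column
# names come from the lecture; the rows are made up for illustration.
df = pd.DataFrame({
    "FIRST_NAME": ["Ada", "Grace", "Alan"],
    "LAST_NAME": ["Lovelace", "Hopper", "Turing"],
})

# A derived column (also hypothetical) built by joining two string columns.
df["FULL_NAME"] = df["FIRST_NAME"] + " " + df["LAST_NAME"]

print(df)
```

With the DataFrame defined up front, any later cell that refers to `df` can be run as-is.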

john-science commented 5 years ago

Is it weird that this lecture is Pandas AND Jupyter?

They are frequently used together.

john-science commented 5 years ago

There was a request to show re-sampling and down-sampling for Pandas DataFrames. Is that a reasonable request? Or is that just using the tools that are already in the lecture?
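
For reference, a minimal sketch of what that might look like, using `DataFrame.resample`, which needs a datetime-like index. The data, the column name `value`, and the 7-day window are all assumptions chosen for illustration, not material from the lecture.

```python
import pandas as pd

# Invented daily measurements: one value per day for two weeks.
index = pd.date_range("2019-01-01", periods=14, freq="D")
daily = pd.DataFrame({"value": range(14)}, index=index)

# Down-sampling: collapse each 7-day window into its mean.
weekly = daily.resample("7D").mean()

# Up-sampling: return to a daily index, forward-filling the new rows
# with the most recent known value.
daily_again = weekly.resample("D").ffill()

print(weekly)
print(daily_again)
```

Down-sampling always needs an aggregation (here `mean`), while up-sampling needs a fill rule (here `ffill`), so a section on this would have to introduce both ideas.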

maybuhleen commented 5 years ago

Hi! Sorry I've been absent from this thread! Hari left last September for RD and we still haven't back-filled his position, which means we've had our hands full. Do you need something from me to address this? I'll help in any way I can.

Thanks, Maybelline

On Wed, Feb 20, 2019 at 7:09 PM John Stilley notifications@github.com wrote:

Closed #22 (https://github.com/theJollySin/python_for_scientists/issues/22) via 10d454c (https://github.com/theJollySin/python_for_scientists/commit/10d454c7f5c220b9914995f5a2cb1d6175958bd8).

john-science commented 5 years ago

Ah! Hello!

Sorry to hear you've been slammed. It did seem like Hari was ready to be done with that awful database.

So far all I've done is a little rearranging and I added a small section on "df.apply()".
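
A rough sketch of the kind of thing a `df.apply()` section can show, applying a function column-by-column and then row-by-row. The column names and numbers below are invented for illustration and are not taken from the lecture.

```python
import pandas as pd

# Hypothetical measurements, made up for this example.
df = pd.DataFrame({
    "height_cm": [150.0, 165.0, 180.0],
    "weight_kg": [50.0, 65.0, 80.0],
})

# Default axis=0: the function receives each column as a Series.
col_ranges = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row as a Series.
bmi = df.apply(
    lambda row: row["weight_kg"] / (row["height_cm"] / 100) ** 2,
    axis=1,
)

print(col_ranges)
print(bmi)
```

The row-wise form is easy to read but slow on large frames, so it is probably worth mentioning that plain column arithmetic is usually preferable when it is available.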

Good timing though!

I actually just gave this lecture today. My students griped a little bit that there is a big DataFrame that jumps out of nowhere mid-lecture (the "hair color" / "eye color" DataFrame). Someone (probably me) hacked the lecture in the past two years, and we are missing the lines of code where we create that example DataFrame.

I need to go through the Git history and find that, or generate it myself, or something. I want students to be able to play around with all the examples themselves, you know? That's my main concern.

Though... I keep wanting to add more material to this. Because Pandas is so big. But... is the lecture too big already? What do you think? I spent a solid hour today and didn't get through it all. I can't decide. Maybe that's okay.

maybuhleen commented 5 years ago

Ah, gotcha. Let me review it and see if I have a solution for it.

maybuhleen commented 5 years ago

If you're talking about the "Slicing a DataFrame" section, then you're right--that IS odd that it comes out of nowhere. It looks like data from the "client_list.csv" file. I think you're just missing:

df = pd.read_csv("client_list.csv")
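
For anyone without the file on hand, here is a self-contained sketch of the same idea. The contents of "client_list.csv" are not shown in this thread, so the column names and rows below are assumptions, and an in-memory string stands in for the file.

```python
import io

import pandas as pd

# Stand-in for client_list.csv. The real file's contents are not shown
# here, so these columns and rows are hypothetical.
csv_text = """FIRST_NAME,LAST_NAME,HAIR_COLOR,EYE_COLOR
Ada,Lovelace,brown,brown
Grace,Hopper,brown,blue
Alan,Turing,black,blue
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Slicing by condition, by column labels, and by row position.
blue_eyed = df[df["EYE_COLOR"] == "blue"]
names = df.loc[:, ["FIRST_NAME", "LAST_NAME"]]
first_two = df.iloc[:2]

print(blue_eyed)
```

Swapping `io.StringIO(csv_text)` for `"client_list.csv"` gives the one-line version above.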

Hope that helps. Come bug me if you have more questions or issues with this lecture. I'm hoping to add more to it fairly soon, but with interviews for Hari's replacement happening next week, it'll likely be more like two weeks out before I can make the improvements to this lecture.

--Maybelline

john-science commented 5 years ago

Thanks again, @maybuhleen! I think that helps the lecture's clarity a lot.