AllenDowney / ThinkStats2

Text and supporting code for Think Stats, 2nd Edition
http://allendowney.github.io/ThinkStats2/
GNU General Public License v3.0
4.02k stars, 11.28k forks

Utility of Exercise 1.2 #119

Closed julienstark closed 2 years ago

julienstark commented 6 years ago

The book is great and the concepts are well explained. However, I struggle to see the educational value of Exercise 1.2, which basically asks us to write a function that reads the respondent file, 2002FemResp.dat.gz.

This is more or less a copy-paste of nsfg.py, which we already had the opportunity to review in the introduction notes. One could argue that we shouldn't review nsfg.py before writing functions for this exercise, but the "ReadFemResp" function is actually quite hard to design without any hints. That is not because of the body of the function itself, but because of its dependencies on thinkstats2, which invokes some heavy parsing functions.

So I'm still a bit confused about the purpose of this exercise. Am I supposed to write the program while reviewing nsfg.py (in which case I literally just need to copy and paste a huge portion of the code)? Or should I design the functions from scratch? That would take a lot of time reviewing the available classes and methods in thinkstats2.py and trying to understand how the dict file is parsed, which seems a bit overwhelming for a Chapter 1 exercise targeted at people with no prior experience with Pandas.

Should additional information be provided, or am I, most likely, missing something?
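
For readers following along, here is a minimal sketch of the kind of parsing described above: a Stata-style dictionary gives each variable's start column, type, name, and width, and those offsets are then used to slice each line of a gzipped fixed-width data file. This is not the code in nsfg.py or thinkstats2.py. The sample dictionary, the column positions, the data, and the helper names parse_dct and read_fixed_width are all assumptions made for illustration, and the sketch uses only the standard library rather than Pandas.

```python
import gzip
import io
import re

# Hypothetical two-column dictionary in the Stata .dct style.
# The positions and widths are made up for this example.
SAMPLE_DCT = '''infile dictionary {
    _column(1)  str12  caseid    %12s  "RESPONDENT ID NUMBER"
    _column(13) byte   age_r     %2f   "AGE AT INTERVIEW"
}
'''

# Captures: start column (1-based), type, variable name, field width.
COLUMN_RE = re.compile(r'_column\((\d+)\)\s+(\S+)\s+(\S+)\s+%(\d+)(\w)')


def parse_dct(text):
    """Return (name, start, end, is_numeric) tuples; start is 0-based, end exclusive."""
    specs = []
    for line in text.splitlines():
        match = COLUMN_RE.search(line)
        if match is None:
            continue  # skip the "infile dictionary {" and "}" lines
        start = int(match.group(1)) - 1
        vtype, name, width = match.group(2), match.group(3), int(match.group(4))
        specs.append((name, start, start + width, not vtype.startswith('str')))
    return specs


def read_fixed_width(fileobj, specs):
    """Slice each line by the column specs and return a list of dicts."""
    rows = []
    for line in fileobj:
        row = {}
        for name, start, end, is_numeric in specs:
            field = line[start:end].strip()
            if is_numeric:
                row[name] = float(field) if field else None
            else:
                row[name] = field
        rows.append(row)
    return rows


# Build a tiny gzipped fixed-width "file" in memory (made-up respondents).
raw = f"{'1':>12}{27:>2}\n{'2':>12}{42:>2}\n"
compressed = gzip.compress(raw.encode('ascii'))

with gzip.open(io.BytesIO(compressed), 'rt') as f:
    rows = read_fixed_width(f, parse_dct(SAMPLE_DCT))

print(rows)
```

Taking the field width from the format specifier (for example, 12 from %12s) is a choice made for this sketch; a reader could instead infer each field's end from the next column's start.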

AllenDowney commented 6 years ago

That is not a great exercise. I will consider cutting it.

Thanks!

resonates7 commented 3 years ago

I got a lot out of this exercise. Although it is somewhat of a repeat, it forced me to think things through and helped me put a lot of pieces together.

I'm new to data science, so it may not be as helpful for more experienced folks, but I found it very useful.