ireapps / pycar

NICAR Python mini boot camp
https://ireapps.github.io/pycar/pycar_intro.html
MIT License

PyCAR 15 feedback thread: Official session feedback or anecdotes... #17

Closed chrislkeller closed 8 years ago

chrislkeller commented 9 years ago

Dunno if there's a better way to catalog this - or if we even should - so figured an issue would work?

Data reporter with experience in R and SQL but not Python

My main thing is that even these basics are a lot to pack into a single day. Anything that helps me 1) commit syntax to memory or 2) gives me incentive to use python once I go back to my job is a good thing.

tommeagher commented 9 years ago

This is incredibly helpful, folks. So if you have others, please put them here, anonymously, of course.

Ideally I’d use my own machine. I know this makes no sense from POV of instructors, but still.

I don't know about using your own machine. I get it, but I worry we'd spend a considerable amount of time at the beginning getting people set up that would kill any attempt at momentum. Any ideas how to solve that?

A handout with some of the basics would be nice. The code is on Github, yeah, but having that piece of paper makes a difference.

A handout is probably doable. What should be included in a one-pager?

I want to figure some of it out for myself, even if they’re just small things. A few moments where you’re like "now take a minute and import this library by yourself" will help me commit to memory the syntax.

This is a great point. Any opportunity we can give people to repeat a concept on their own helps with the learning.

We’re at such a basic level that it’s hard to figure out how I’m going to do amazing data work using python. The presenters could share projects they’ve worked on that python made possible. Not getting into the code, but just showing us why we should go back and keep using it.

This could be incorporated into the intro, if we can do it quickly.

Tell me why python is so awesome. What is easier in python than in R? Or excel even. We did go over some of this, but I couldn’t repeat most of it back to you.

I feel like we covered this a bit, but I suppose we could amplify it.

hbillings commented 9 years ago

On the "why is Python awesome" point, maybe we could add to the intro, where we talk about Python, a bit about why people choose to use a programming language over Excel or Access. This might lead well into talking about a project one of us has done where the data is cleaned/updated with Python, also hitting the "amazing data work" point. I think you really need only one project to drive the point home. (I'm thinking something like Crime in Chicagoland, where the site hits an API nightly for the data, processes it in Python on the server, and then updates the visualizations.)

aboutaaron commented 9 years ago

Here's some post-combat updates from the session @malev @rnagle and I led:

Once again, thank you for this repo. We would've been truly lost without it!

aboutaaron commented 9 years ago

Oh, and it looks like there were several pycar folders on the desktop, so we ran into some issues where folks were in the wrong directories. For example, one computer had around four pycar directories and it took us a minute to figure out exactly which one was which.

I think if we fork the project, we'll definitely need to scope the projects to their days and instructors unless it's entirely the same class being taught.

chrislkeller commented 9 years ago

Thanks @aboutaaron... I'll add Ryan's code to this repo.

The Sublime Text 3 angle came up in a conversation I had with another participant. Like the bone-headed fool I am, at one point I had folks exit the python interpreter without giving them a chance to save their notes.

There is a need to show some interpreter, but to echo Aaron, I think any scripting can be done in ST3.

mikejcorey commented 9 years ago

Hi all: Thank you for doing this. This is beyond the scope of just PyCAR, but I did want to bring up a concern I had this year about Python generally at the conference. In short, there's a lot of it and it's kind of all over the place, though most of the material is excellent.

And I am totally not casting aspersions on any of these classes, their instructors or the idea of teaching Python. I'm primarily a Python programmer these days, and think it's an excellent first language and general language for journalism.

And this isn't a manifesto and I'm certainly not a Python expert. These are just some of my thoughts, and I'm wondering if anyone else is feeling the same.

There's a few separate things to discuss, IMHO:

Should there be fewer Python classes to offer space for other languages/topics?

Here's a list of all of the Python-focused sessions at this year's conference:

Compare this to JavaScript, for example. As far as I can tell there was only one class devoted specifically to learning JavaScript, and ~3 more that are heavily d3 or other JavaScript. Nothing particularly advanced on JS, and nothing on node.js (I don't know node, btw, I just know it is widely used in and out of journalism).

And I'm not making specific recommendations about what JS or other languages we should be teaching, either. I'm just hoping we can agree that we could make a little more space for some other topics if we got ourselves more coordinated.

Can we agree on a few levels of basic instruction (Python part 1, 2, 3, or whatever number) that have a related and linear curriculum?

There were Pythons part 1-3, but some other sessions also used terms like "intermediate" and "advanced," which made it somewhat confusing where they fit in the progression; as far as I know, they weren't actually related to each other. I would suggest rebranding those as specialized (and highly worthwhile) topics: "Refactoring your code" or "object-oriented Python" and "Python for data analysis."

Can we standardize and optimize the tools and libraries that are used in teaching Python at NICAR?

I sat in on at least one Python session where at least the first half-hour was spent trying to get everyone's IPython Notebook up and running. Partly this was because the instructor was trying to accommodate people using their own laptops, but people taking the class had to learn how IPython Notebook acts before they could start learning Python. IPython Notebook is a great tool, but that doesn't strike me as the most efficient or necessary use of time.

There are other, lighter-weight options, like using the shell in the command line directly, IPython without the notebook (which would give learners more feedback than Python by itself on the command line), or using Sublime Text's built-in Python interpreter.

I don't want to be needlessly restrictive in how people teach, but it would be great if students could sit down in any of those classes and be able to use a familiar environment in any of them once they've taken any of the others.

What kind of structure would it take to set a common curriculum without losing the advantages of having many voices involved in the process?

Could we organize a (small) committee of Python folks to facilitate this? The group could take suggestions and feedback such as what's happening on this thread, post a recommendation, get some more feedback, and take a yea-nay vote? I hate committees as much as the next journalist, but Python has made a big impact on the NICAR community, and I don't want to ruin it by bigfooting ourselves into the rest of the conference.

End of spiel, you all are awesome, thank you.

chrislkeller commented 9 years ago

@mikejcorey Thank you for adding this in such a thoughtful way.

I had similar thoughts and discussions with others about this topic but could never boil things down to the essence of what I was thinking in the manner you did.

I think my own stumbling point in thinking through this is mentioned in your last heading: without losing the advantages of having many voices involved in the process. The gorgeous thing about NICAR is someone can obtain knowledge and share it with others in a really direct way. No one wants to lose that. No one wants a curriculum committee to thumbs up or thumbs down sessions.

But I too felt those who want to learn would benefit from core concepts being taught, explained and demonstrated over and over - repetition, practice and muscle memory if you will. Landmarks, touchstones, repeated use of terms, doing and seeing are all valuable when it comes to learning.

And this speaks to the scope of what we did with PyCAR right? Devil's advocate: If there are two sessions in which participants will be scraping websites, did we need to do that during PyCAR?

There are a couple of issues tucked in here. Wonder if they should be separated out eventually so they don't get lost in the thread?

mikejcorey commented 9 years ago

Oh, sorry, I was unclear about the thumbs-up/down part. I meant that the committee would recommend the scope of each session, with extensive feedback, then the whole community would vote yes or no, we approve this standard.

chrislkeller commented 9 years ago

My blinders @mikejcorey... Read right over that and focused on the big type... Adding issue to focus discussion specifically to that.

rdmurphy commented 9 years ago

I think these are all great points @mikejcorey.

Too much Python?

I was a little overwhelmed by the sheer number of Python-focused courses NICAR offered this year. I assume some of this was the result of offering something comprehensive (like PyCAR) as a reserve-in-advance course, but not wanting to leave out the opportunity for others who don't do that to get a taste. It's a tough spot to be in, because ideally you don't offer nothing for all the people who can't/don't want to spend an entire day on one subject.

I think many of the specialized courses that deep dive on data analysis (and other things) could be done in any language, but the people who want to teach it use Python, so that's what gets taught. Almost a chicken/egg situation. I'm all for streamlining the offerings, though.

Other languages?

I had one on grunt, which really isn't interacting with node.js much except for needing to understand how npm and node_modules work. I wish there were more. I heard some folks also voicing concern that Ruby was underrepresented too.

There does seem to be a large swath of JavaScript education missing there. Either you're learning basics or being taught how to use a specific library. (All outside the scope of this, of course.)

Standardizing tools and libraries, techniques

Agreed. Even in PyCAR we clash a bit – the first lesson uses urllib.urlretrieve, then the third lesson uses requests. I was kinda surprised no one in our course asked why it suddenly changed.
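For anyone who hasn't seen the two side by side, here is a minimal sketch of the two download styles mentioned above. It assumes Python 3, where `urlretrieve` lives in `urllib.request` (`urllib.urlretrieve` is the Python 2 spelling). The function names are made up for illustration and are not taken from the PyCAR lessons.

```python
from urllib.request import urlretrieve


def download_with_urllib(url, dest):
    """Standard-library approach: fetch url and write it straight to dest."""
    path, _headers = urlretrieve(url, dest)
    return path


def download_with_requests(url, dest):
    """Third-party approach: needs `pip install requests`."""
    import requests  # imported here so the urllib version works without it

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    with open(dest, "wb") as outfile:
        outfile.write(response.content)
    return dest
```

Both functions end with the same file on disk. The difference is mostly in how much control you get over the response along the way, which is probably worth one sentence in class when the switch happens.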

Some of the Python courses used anaconda, others didn't. I feel like if we could standardize on a suite of tools (BeautifulSoup vs. PyQuery) that all the courses would build from, we could ensure consistency and that "drop in anywhere" ability.

esagara commented 9 years ago

I am going to weigh in here with a couple of thoughts.

tommeagher commented 9 years ago

Yes, thank you @mikejcorey, for putting these thoughts together.

I've been thinking along similar lines the last few days about the teaching of syntax rather than process at NICAR. Our aim with this class in particular has been to expand the knowledge of programming beyond news app development by teaching reporters how to program for reporting. For the reasons you cite, we chose Python to do it. But as I surveyed the schedule, I too found the vast array of offerings a bit daunting. And they all seem, my own included, to focus on the language itself rather than what we're doing with it.

Arguably, much of what we do in this class could be accomplished with Ruby or R or the command line or Google Spreadsheets or a dozen other tools. I wonder if we can create a framework for thinking about what we should be doing, along the lines of ETL, and use that framework for teaching data reporting.

For example, no matter what tool you use, you're going to want to clean and standardize strings and aggregate and sum records by category. We could teach you how to do that in the syntax of 10 or more different tools. The important part is not the syntax, but knowing how to break down a problem into these processes.
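To make the "process, not syntax" point concrete, here is a rough sketch of those two steps (standardize strings, then sum by category) in plain Python. The records and function names are invented for illustration; the same breakdown would apply in R, SQL or a spreadsheet.

```python
from collections import defaultdict

# Made-up records for illustration; note the inconsistent spelling and spacing.
records = [
    {"category": " Police ", "amount": "1,200.50"},
    {"category": "police", "amount": "300"},
    {"category": "FIRE", "amount": "75.25"},
    {"category": "Fire ", "amount": "24.75"},
]


def clean_category(raw):
    """Standardize a label: trim whitespace and normalize capitalization."""
    return raw.strip().title()


def parse_amount(raw):
    """Turn a string like '1,200.50' into a number."""
    return float(raw.replace(",", ""))


def total_by_category(rows):
    """Aggregate: sum the amounts for each cleaned category."""
    totals = defaultdict(float)
    for row in rows:
        totals[clean_category(row["category"])] += parse_amount(row["amount"])
    return dict(totals)


totals = total_by_category(records)
# totals is {"Police": 1500.5, "Fire": 100.0}
```

Each function maps to one named step (clean, parse, aggregate), which is the part that carries over from tool to tool.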

I agree that the session names were somewhat contradictory, but the 1, 2, 3 labels aren't really helpful either. Although there is obvious overlap, there are distinctions to be made between what someone reporting a story needs to learn and what someone developing a site or an app needs to learn. Can we stream all of the programming classes into tracks for one of these, or is that counter-productive?

Sorry if this is a little muddled. I'm tapping it out on a train platform and still recovering from a sleepless Saturday night.

I'd like to continue these discussions and to get some input from @zstumgoren and @richardsalex, to start.

zstumgoren commented 9 years ago

Great comments so far folks, and very glad we're having this conversation.

I also was surprised at the motley of new Python courses and some of the overlaps (in name if not in content). We most definitely shouldn't drown out new voices by creating a Curriculum Committee, but given the growing popularity of Python in particular and programming-type classes in general, it seems like a good idea to do some strategic planning earlier in the year in collaboration with IRE/NICAR.

We did some of this in an informal fashion this year (e.g. @ghing and I chatted about how the Advanced Python class might pick up where our Python Intermediate class leaves off). But a more formal process that starts earlier in the year would help avoid confusion.

We also shouldn't limit the conversation to Pythonistas. It'd be great to bring in folks teaching other languages, especially on the beginner level, in order to learn from and coordinate our approaches.

That kind of cross-pollination might also help address @mikejcorey's concerns that Python has started crowding out other content. I'll admit I had a similar concern, but it was more generally about technical sessions at the expense of more traditional sessions on data analysis, storytelling and investigative technique (and the convergence thereof).

I'm guessing the good folks at IRE/NICAR would be thrilled to have the geeks coordinate ahead of time, since it'd no doubt help with pulling together the conference schedule. Plus, they've developed their own Coding for Journalists bootcamp, so they might have feedback on what they'd like to see in certain classes.

I'd be happy to reach out to Mark Horvit and Jaimi Dowdell to get the conversation started, unless someone else has already jumped on it. Lmk.

hbillings commented 9 years ago

This touches on something I would love to see NICAR do (and something I'd be willing to help put together, as I've been kicking the idea around for a couple of years now): Intro to Programming Logic. A lot of the basic programming concepts transfer easily across languages. I believe it is totally possible to teach those in pseudocode instead of an actual language, with all of the hassle of system setup and syntax and scary-looking error messages. (My own personal aha moment was during a C++ class in college when I first learned what a loop was, and then realized I'd seen it in PHP and Javascript before.)

Benefits of teaching the logic include not only getting to spend that time on more language-specific training, but also /not needing a hands-on room/. You can talk about theory in front of a lot more people if you don't need each of them in front of a computer (and chances are they'll be paying more attention, too, without the distraction of typing into a terminal).

chrislkeller commented 9 years ago

I love this @hbillings... Absolutely love the idea of a pseudocode session.

esagara commented 9 years ago

@hbillings I actually pitched something very similar this year. Perhaps I should have done it as a lightning talk. The basis was if you know how to use Excel, you know how to program. The cell A1 is a variable. Column A is a list or array. =IF(condition, value if true, value if false) is a conditional. The comparisons can go on and on. I think we spend a lot of time teaching basic concepts that people should be able to easily recognize from prior experience. These are also the basics that will make learning the next (or first) language much easier.
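A quick sketch of how those Excel comparisons might look in Python. The values are made up, and the last two lines (fill-down and `SUM`) extend the analogy a little beyond the three examples above.

```python
# A single cell, like A1, is a variable.
a1 = 42

# A column, like A1:A4, is a list.
column_a = [42, 17, 8, 99]

# =IF(A1>20, "high", "low") is a conditional.
label = "high" if a1 > 20 else "low"

# Filling that formula down the column is a loop.
labels = ["high" if value > 20 else "low" for value in column_a]

# =SUM(A1:A4) is a function call on the list.
total = sum(column_a)
```

Here `label` ends up as `"high"`, `labels` as `["high", "low", "low", "high"]`, and `total` as `166`.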

hbillings commented 9 years ago

@esagara That's right! I remembered I talked to someone who had had the same idea, but I couldn't remember who it was. (I can't remember a lot of things, actually...)

tommeagher commented 9 years ago

This thread is fantastic, everyone. Thank you all for contributing to the discussion, even in spite of the lingering sleep deprivation, jet lag and conference hangovers.

I'd be interested to hear what folks like @knowtheory, @mattwaite, @palewire, @ryanpitts, and @joegermuska think about this idea of bringing some structure to teaching programming for journalism at NICAR and the possibility of standardizing courses within each language to make it easier for folks to navigate the array of offerings. Of course, as several others said above, we'd want to balance these discussions to ensure we continue to bring in new voices to teach and a spot for exciting new libraries.

@zstumgoren, I doubt Mark and Jaimi are monitoring this discussion, so if you want to reach out to them preliminarily and float the idea, I think that's great.

tommeagher commented 9 years ago

@esagara & @hbillings, I love the idea of pseudocode/logic class, which is similar to what I was suggesting above. If you're looking for another contributor, count me in.

zstumgoren commented 9 years ago

@tommeagher @hbillings @esagara @chrislkeller @rdmurphy & Co. -- just pinged Mark and Jaimi at IRE so let's see what they have to say. Meantime, definitely interested to hear feedback from other folks that Tom CC'd.

palewire commented 9 years ago

I think this is a great idea.

My two cents:

hbillings commented 9 years ago

On @palewire's last point: I think three people is a good number for a hands-on session. I was in one of the intro to D3 sessions, mostly to kibitz, and it turned out to be really useful to the two presenters to have a third person to float around. Both of them would have been entirely tied up troubleshooting without someone else there (they didn't have a coach in the room).

richardsalex commented 9 years ago

First, just let me thank all of you not only for expending/investing so much time and energy in NICAR15’s hands-on classes, but also for leaping right into a thoughtful postmortem on your experiences without delay. This is a great conversation, and I’m glad it’s happening.

If I’m reading the thread correctly, it seems to condense around the following four questions:

I really want to huddle with @eklucas, Jaimi, Megan and Mark before I start offering my opinion on behalf of IRE and NICAR. For some questions, like the overwhelming prevalence of Python at NICAR15, I think IRE was responding to a certain extent to deeply unhappy Baltimore conference-goers who had been shut out of preregistered classes. Putting together the big grid is a balancing act of responding to schedule critique, the demand we see on the ground and our ability to engage volunteers to come and teach these classes.

To @hbillings’ and @palewire’s comment on the number of people in the room: I think our thought was that two or three would suffice; one person would be in front of the room teaching, generally, while the others would float around to deal with issues when attendees seemed to be getting stuck. This wasn’t always the case, and it’s probably worth our reiterating to hands-on teachers that taking turns in these roles (when there’s more than one teacher) can help a session run more smoothly.

tommeagher commented 9 years ago

@richardsalex, I think your summary of the issues we were wrestling with is a good one. Of course, we should say that all of this discussion is in the context of our unflagging support of IRE and our fellow investigative journalists. I thought you all did a bang-up job in organizing and executing the conference this year. We want to assist you in your work and to help find ways to better teach each other and ourselves.

We'll stand by to hear from you after you've had a chance to discuss this with Mark, Jaimi and everyone else. This thread has obviously moved beyond the scope of this specific repo, so we can move future discussions off of Github. Just let us know what's the best way for us to help.

PS - Thanks to @palewire for joining the conversation as well. Your input was much appreciated, Ben.

JoeGermuska commented 9 years ago

Coming in late, the one general bit of feedback I had from a few Knight Lab folks was disappointment with hands-on sessions that were predominantly "type what I type. now type this. now type this."

I know it can be hard to frame things more conceptually, but I thought I'd share it anyway. Maybe three hour classes is one way to aim higher. There was also a suggestion of mandatory pair programming, which could be scary, but which also could be awesome, especially if the class was more "here are some things to know. now solve this problem. GO!" and less lecture.

Joe Germuska Joe@Germuska.com * http://blog.germuska.com

"I felt so good I told the leader how to follow." -- Sly Stone

ghing commented 9 years ago

Joe,

I really tried to do this with my session. Do you know if any of your students went? I'd be interested in their feedback if they did.

I tried to structure my session like:

1. Concept
2. Example of concept from source code of a journalism-related open source project
3. Exercise

You can see the slide deck at http://ghing.github.io/nicar2015_advanced_python_slides/ to get a sense of what this looked like.

I also had people tweet me their exercise solutions so I could throw them up on the projector and talk about them with the rest of the class. This was a little challenging given the pace of the class and it being the end of the day, but definitely something worth experimenting with in the future.

Best, Geoff

Geoffrey Hing geoffhing@gmail.com 773.969.6436 http://geoff.terrorware.com/ Twitter: @geoffhing

hbillings commented 9 years ago

I think this depends on the target audience of the class, too. For some of the advanced sessions it makes sense, but in an intro class there has to be some level of "type this," at least at first. If it's a longer session, like ours was, I think we could definitely afford to throw more problem-solving at the students in the second half. And, this might be mitigated by spending more time up front talking about concepts like what whitespace means to Python, what a datatype is, etc., before we jump into coding.
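As one possible shape for that up-front concept talk, here is a small sketch that shows both ideas at once: what a datatype changes, and what indentation means. The variable and function names are invented for illustration.

```python
# Every value has a type, and the type decides what operations mean.
count = 3          # int
price = "3"        # str, even though it looks like a number

doubled_number = count * 2    # arithmetic: 6
doubled_text = price * 2      # repetition: "33"


def describe(values):
    """Indentation marks which lines belong to the loop and which do not."""
    lines = []
    for value in values:
        # Indented under the for: runs once per value.
        lines.append(f"{value!r} is a {type(value).__name__}")
    # Back at the function's indentation level: runs once, after the loop.
    lines.append("done")
    return lines


report = describe([count, price, 2.5])
```

Moving the `lines.append("done")` line one level to the right would make it run on every pass through the loop. That one change is a handy way to show why whitespace matters before any real coding starts.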

zstumgoren commented 9 years ago

+1 @JoeGermuska We applied a teach -> practice -> code review cycle for the Python Intermediate class and (I think) it went really well during the first half of the day.

It was a tougher slog in the second half of the day, however, when we moved on to more difficult/foreign material, namely object oriented programming and designing programs with classes. We tried to take the temperature of the room, and most seemed to want us to do more "type this"-style teaching or simply review the final code implementation and explain how we got there and the rationale.

Part of me suspects that the day was simply too long, especially to introduce complicated new material at the end of day. @esagara suggested splitting a full-day class over the course of two days, and I'm starting to think that may generally be the wiser course of action. We're throwing a lot at these folks, whatever their level, and I think they may need time to digest and recuperate before diving headlong into the next subject. I'd be interested to hear from folks who taught two half-day sessions to see how things went.

esagara commented 9 years ago

One other thing I would like to point out. There were several times where @hbillings and I were looking for a whiteboard or large sheet of paper so we could quickly sketch out concepts. Adding whiteboards or easels with paper would be a great help I think at least in the intro class. It could also be used to build more interaction between students and teachers.

JoeGermuska commented 9 years ago

So I came back and read the entire Git thread (I came in only when @tommeagher tagged me)... Some more thoughts:

Don't apologize about a lot of python classes. The simple fact is that if you want to string together a chain of classes of increasing sophistication, you need to focus.

@hbillings mentioned something about explaining data cleaning practices and why to use code instead of Excel: there were at least two presentations about that that were not python specific ("Do It Once" by Derek Willis & David Eads, and another from Chris Groskopf & Paul Overberg). They just didn't have "python" in the name ;-)

knowtheory commented 9 years ago

Hey. This discussion is awesome.

I'm glad that there's an interest/focus on capabilities rather than fixation on languages. There are a couple other examples of curricula built like this, most notably Zed Shaw's "Learn {Ruby,Python,C,SQL} the Hard Way" series.

There will always be a challenge for any bootcamp style instruction between keeping everyone on the path while giving people enough free rei(g)n to explore. IRE's other bootcamps do this through a bit of free time during the day, some structured exercises and (importantly) lab time after the day's instruction with coaches who hang around to help out.

To @serdar's point... psych & pedagogical literature all recommends shorter distributed practice over a longer span of days, rather than big blocks of time on fewer days.

As a general matter i think we need to talk more about the devops aspects of how we instruct and bootstrap students especially with regard to the environments they're going back to. It's well and good if you can get folks hooked up and excited about programming on a computer we control and have provisioned, but if they're sent back to their newsroom and are totally out to sea when setting up their own environment... that's a major bummer.

I think it'd be worth talking about what the most straightforward and bulletproof path is to installing a self-contained and partitioned programming environment. I'm particularly interested in environments that can be snapshotted and reset after some of the messes we encountered during NICAR. So if anyone has thought about using Docker or Vagrant as teaching environments i'd love to hear about it! (IRE has some peculiar interests in this, as we encountered some snafus between classes, like instructors getting students to set the global git config for the machines they were on, which fubar'd subsequent classes which also required use of github)
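This is not the Docker or Vagrant setup asked about above. As a lighter-weight stand-in, here is a minimal sketch using Python's built-in `venv` module to create a throwaway, per-class environment that can be wiped and rebuilt. The function name and the `pycar-env` folder name are hypothetical. Note that it isolates Python packages only, so it would not have prevented the global git config problem.

```python
import tempfile
import venv
from pathlib import Path


def make_class_environment(parent_dir, name="pycar-env"):
    """Create an isolated Python environment under parent_dir.

    with_pip=False keeps creation fast and offline; clear=True wipes any
    earlier environment with the same name, which acts as a crude reset.
    """
    env_dir = Path(parent_dir) / name
    venv.create(env_dir, with_pip=False, clear=True)
    return env_dir


# Example: build a throwaway environment in a temporary folder.
scratch = tempfile.mkdtemp()
env_path = make_class_environment(scratch)
```

Calling `make_class_environment` again with the same folder rebuilds the environment from scratch, which is roughly the "reset between classes" behavior, though far short of a full machine snapshot.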


An aside pertaining to the "there's so much python" thing:

My personal opinion is that it's really important to focus on capabilities rather than languages (and i agree w/ @joegermuska's comment that it's important to be focused & consistent). I'd love to see a skills based curriculum that could then be implemented in whatever language folks had an interest in teaching.

As someone running a project that's built in Ruby/JS, we already have a pipeline problem from residing in a community that defaults to Python. I've given up on taking Mizzou students for DocumentCloud, as basically all the students i've talked to are focused on learning to program in Python (or sometimes JS). I'm not interested in proselytizing (and wouldn't have time to even if i were), and if they feel like Python's the thing they need for their career, i'm more than happy to level up their skills by debugging their python scripts and give them pointers, but it essentially means that DocumentCloud is relegated to being an advanced news developer or non-news developer only project (and this makes me very deeply sad).

tommeagher commented 8 years ago

Closed by #26.

Alex hinted at this a few days ago, but it's worth mentioning that all of this feedback was really helpful as we've been thinking about the hands-on sessions at this year's conference. We're trying to use many of the ideas from this discussion to make things even better this year.

Thanks again for all of your feedback and comments. Please feel welcome to continue to contribute to the open issues (or open new ones) as we resume planning this year's iteration of PyCAR.