datacarpentry / openrefine-socialsci

OpenRefine for Social Science Data
https://datacarpentry.org/openrefine-socialsci/

regular expression lesson #19

Closed tracykteal closed 6 years ago

tracykteal commented 6 years ago

Regular expressions are very valuable, and OpenRefine can be a good place to introduce them. However, regular expressions carry a very high cognitive load, as they're a distinctly new concept. We would need to spend enough time on them for the introduction to be useful, and that would add to the estimated time of this module. Given where OpenRefine falls in the workshop, and that it's learners' first overall exposure to working with data, my initial thought is that regular expressions would be more confusing than powerful at this stage. So it would be best not to include them, at least in the original release of these lessons.

Are there thoughts on the regular expression lesson and whether or not to include it in this release?

kevin-vilbig commented 6 years ago

I didn't really get regex until I learned about state machines. That took more time and effort than we can do during the OpenRefine lessons for sure and adds levels of abstraction that are too much to dump on a beginner in a day. That goes double because often OpenRefine can take some time to install and get running properly and as much as we would like to have the perfect classroom of ready participants... we often don't and so can be pressed for time on this segment compared to the other DC lessons.

More generally about regex, they are one of the more tricky concepts for people who are not accustomed to thinking in and about pedantic computer-style processing on strings of characters, which is one of the fundamental programmer-centric cognitive tools that we teach. Things that seasoned programmers take for granted, like escape characters, can be difficult to understand and even more difficult for beginners to keep straight in practice.
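
To illustrate the escape-character point above, here is a minimal Python sketch (not taken from the lesson; the sample string is made up for illustration). An unescaped `.` in a pattern matches any character, so a pattern that looks like a literal number matches more than a beginner would expect.

```python
import re

# Made-up sample text, for illustration only.
sample = "3.5 3x5 395"

# Unescaped: "." is a wildcard that matches any single character.
loose = re.findall(r"3.5", sample)

# Escaped: "\." matches only a literal period.
strict = re.findall(r"3\.5", sample)

print(loose)   # ['3.5', '3x5', '395']
print(strict)  # ['3.5']
```

The two patterns differ by a single backslash, which is the kind of detail that is easy to lose track of in practice.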

fmichonneau commented 6 years ago

Overall I agree. I think that just brushing the surface of regular expressions might be more confusing than helpful.

Maybe we can keep the filtering exercise, but instead of doing it with regular expressions, we could parse the date into its components and filter on the month?
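
As a rough sketch of the difference between the two approaches, written in Python rather than OpenRefine's own expression language: the sample dates, the ISO `YYYY-MM-DD` format, and the function names below are all assumptions for illustration and are not taken from the lesson data.

```python
import re
from datetime import datetime

# Hypothetical sample values; the lesson's real column and date format may differ.
interview_dates = ["2016-11-17", "2016-12-01", "2017-04-02", "2017-11-05"]


def filter_by_regex(dates, month):
    """Keep dates whose text matches a YYYY-MM-DD pattern for the given month."""
    pattern = re.compile(r"^\d{4}-" + f"{month:02d}" + r"-\d{2}$")
    return [d for d in dates if pattern.match(d)]


def filter_by_parsed_month(dates, month):
    """Parse each date into its components, then compare the month number."""
    return [d for d in dates if datetime.strptime(d, "%Y-%m-%d").month == month]


november_regex = filter_by_regex(interview_dates, 11)
november_parsed = filter_by_parsed_month(interview_dates, 11)

print(november_regex)   # ['2016-11-17', '2017-11-05']
print(november_parsed)  # ['2016-11-17', '2017-11-05']
```

Both versions select the same rows, but the parsed version asks learners only to understand "the month of this date equals 11", with no pattern syntax involved.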

gtlaflair commented 6 years ago

I like the idea of filtering on month. When I went through and completed this lesson a couple of weeks ago, it didn't feel like it dug into regex deep enough to really see how useful they can be.

ErinBecker commented 6 years ago

Addressed with https://github.com/datacarpentry/openrefine-socialsci/pull/23

tracykteal commented 6 years ago

Consensus sounded like we should remove this for now, given the challenges of introducing the topic and its relevance in this particular example. It's been removed in #23, so I'll close this issue and we can open a new issue if we want to create a new regex lesson.