awong234 / laundry-day

Data cleaning projects for the Cleveland R User Group
https://awong234.github.io/laundry-day/

Person holding a laundry basket full of computer parts

Saturday morning is a good time to do laundry.

It's common knowledge that much of data analysis consists of cleaning data; that is to say, acquiring data and molding it into a conventional format ready to be used by analytical tools.

Sometimes it's easy: scrape an HTML table using rvest::read_html() |> rvest::html_table() and you might get something ready to use. Sometimes it's harder: perhaps you're parsing text out of images via OCR and need to format it properly.
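As a rough illustration of the easy case, here is a minimal sketch that parses a small inline HTML table instead of a live page. The table contents and column names are made up for the example, and rvest::minimal_html() stands in for rvest::read_html() on a real URL. Even here, one column needs a little cleaning before it is usable.

```r
library(rvest)

# Hypothetical inline page; in practice this would be read_html("<some URL>")
page <- minimal_html('
  <table>
    <tr><th>Item</th><th>Count</th></tr>
    <tr><td>socks</td><td>1,200</td></tr>
    <tr><td>shirts</td><td>350</td></tr>
  </table>
')

# html_table() on a whole document returns a list with one tibble per <table>
raw <- (page |> html_table())[[1]]

# The thousands separator keeps Count as character, so strip commas and convert
clean <- raw
clean$Count <- as.integer(gsub(",", "", clean$Count))
```

The same two steps (pull the table, then fix column types) are often all that a well-formed HTML table needs.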

In these sessions, we'll take messy data and wrangle them into cleaner formats. We focus on the cleaning portion because a) the result is more objective and b) everyone needs to do it and can learn from each other's styles.

Some of the events are documented here