ecortens / cirpa-r-workshop-2018


Find some potential datasets for use in the workshop WIP #4

Open sechilds opened 6 years ago

sechilds commented 6 years ago

From our conversation, we are less concerned with finding IR-related datasets than with finding ones that will work well for our purposes, mostly in terms of the types of variables they contain.

This issue is going to be my place to talk about some datasets that we can use. As per Evan's suggestion, I have been looking at the UC Irvine Machine Learning Repository, which has a large number of datasets organized by domain and kind of problem.

One potential dataset is the Bike Sharing Dataset of bike rentals in Washington, DC. It has calendar and weather information, and could be used to look at differences in the number of rentals by the characteristics of the day.
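
A rough sketch of that kind of comparison, in case it helps judge whether the variables suit us. It is written in Python only to keep it self-contained; the workshop version would be in R. The column names `workingday`, `weathersit`, and `cnt` are an assumption about the dataset's daily file, the helper `mean_rentals_by` is hypothetical, and the rows are made-up toy values, not real data.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Toy rows invented for illustration only; NOT taken from the real dataset.
# Column names are assumed to match the dataset's daily file.
TOY_CSV = """workingday,weathersit,cnt
0,1,100
0,2,80
1,1,200
1,1,220
1,2,150
"""


def mean_rentals_by(rows, key):
    """Average the daily rental count (`cnt`) within each level of `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(int(row["cnt"]))
    return {level: mean(counts) for level, counts in groups.items()}


rows = list(csv.DictReader(io.StringIO(TOY_CSV)))

# Compare average rentals across two characteristics of the day.
by_workingday = mean_rentals_by(rows, "workingday")
by_weather = mean_rentals_by(rows, "weathersit")

print(by_workingday)
print(by_weather)
```

With the real file, the same grouping could be run on any of the calendar or weather columns, which is the sort of exercise that would suit a workshop.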

sechilds commented 6 years ago

The Absenteeism at work dataset might be interesting as well.

The database was created with records of absenteeism at work from July 2007 to July 2010 at a courier company in Brazil.