Open rpruim opened 7 years ago
I've added the report to the GitHub repo (see https://github.com/ProjectMOSAIC/mosaicData/blob/master/data-raw/PVPCcounts2005.pdf).
Note that I also merged beta and master to do this.
I've added the new variable and recoded the old one:
glimpse(RailTrail)
## Observations: 90
## Variables: 11
## $ hightemp <int> 83, 73, 74, 95, 44, 69, 66, 66, 80, 79, 78, 65, 41, 59, 50, 54, 97, 75, 63, ...
## $ lowtemp <int> 50, 49, 52, 61, 52, 54, 39, 38, 55, 45, 55, 48, 49, 35, 35, 32, 71, 43, 35, ...
## $ avgtemp <dbl> 66.5, 61.0, 63.0, 78.0, 48.0, 61.5, 52.5, 52.0, 67.5, 62.0, 66.5, 56.5, 45.0...
## $ spring <int> 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0...
## $ summer <int> 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1...
## $ fall <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0...
## $ cloudcover <dbl> 7.6, 6.3, 7.5, 2.6, 10.0, 6.6, 2.4, 0.0, 3.8, 4.1, 8.5, 7.2, 10.0, 7.7, 5.8,...
## $ precip <dbl> 0.00, 0.29, 0.32, 0.00, 0.14, 0.02, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.03...
## $ volume <int> 501, 419, 397, 385, 200, 375, 417, 629, 533, 547, 432, 418, 193, 331, 280, 3...
## $ weekday <lgl> TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, T...
## $ dayType <chr> "weekday", "weekday", "weekday", "weekend", "weekday", "weekday", "weekday",...
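For anyone following along, the recoding could look roughly like the sketch below. It assumes the old weekday variable was stored as 0/1, which this thread does not confirm, and the toy data frame `old` is purely hypothetical.

```r
# Hypothetical stand-in for the old coding (assumed 0/1; not confirmed in this thread).
old <- data.frame(weekday = c(1L, 1L, 0L))

# Recode the old variable as a logical, matching the <lgl> type shown above.
old$weekday <- old$weekday == 1L

# Add the new character variable derived from the logical one.
old$dayType <- ifelse(old$weekday, "weekday", "weekend")

str(old)
```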
I don't think the data in the PDF Nick posted matches the data in RailTrail. Perhaps Nick can take a look and see.
RailTrail.R in data-raw creates a data set that matches the PDF, but I don't immediately see how the two data sets line up. There is a lot of similar data, but it doesn't match perfectly, and if these really are based on the same data, I don't see how the RailTrail data is ordered.
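One way to check how the two data sets line up, independent of row order, is to match rows on a set of shared columns. This is only a sketch: the first three `rail` rows reuse values from the glimpse output above, while `pdf_based` and its values are hypothetical, and it is an assumption that the PDF-based data frame has columns with these names.

```r
# First three rows of RailTrail, taken from the glimpse() output above.
rail <- data.frame(
  hightemp = c(83L, 73L, 74L),
  lowtemp  = c(50L, 49L, 52L),
  volume   = c(501L, 419L, 397L)
)

# Hypothetical stand-in for the PDF-based data frame (values invented for illustration).
pdf_based <- data.frame(
  hightemp = c(73L, 83L, 70L),
  lowtemp  = c(49L, 50L, 52L),
  volume   = c(419L, 501L, 390L)
)

# Columns assumed to be present in both data sets.
keys <- c("hightemp", "lowtemp", "volume")

# Rows that agree on every key column, regardless of ordering.
matched <- merge(rail, pdf_based, by = keys)

# Rows of rail with no exact counterpart in the PDF-based data.
rail_key  <- do.call(paste, c(rail[keys], sep = "|"))
pdf_key   <- do.call(paste, c(pdf_based[keys], sep = "|"))
unmatched <- rail[!(rail_key %in% pdf_key), ]

nrow(matched)
unmatched
```

Rows left in `unmatched` would be the ones to inspect by hand against the PDF.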
Indeed: this is atrocious. I've gone through to try to reconcile the two forms and there are clearly a number of errors. I'm very embarrassed. This dataset stemmed from a student project in a class and I didn't check that they had correctly ingested it. Groan.
I'll need to take a closer look at this and get things reconciled.
If you believe the PDF, I have already converted that into a data frame and we could simply do a replace. Some of the additional variables (spring, for example) could be computed from the dates.
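As a rough illustration of computing the season indicators from dates, here is a minimal sketch. The column name `date`, the example dates, and the month cutoffs (meteorological seasons) are all assumptions; the definition actually used for RailTrail may differ.

```r
# Hypothetical example dates; the real ones would come from the PDF-based data frame.
counts <- data.frame(
  date = as.Date(c("2005-04-15", "2005-07-04", "2005-10-20"))
)

# Month number (1-12) extracted from each date.
month_num <- as.integer(format(counts$date, "%m"))

# Assumed cutoffs: spring = Mar-May, summer = Jun-Aug, fall = Sep-Nov.
counts$spring <- as.integer(month_num %in% 3:5)
counts$summer <- as.integer(month_num %in% 6:8)
counts$fall   <- as.integer(month_num %in% 9:11)

counts
```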
My suggestion would be to leave this issue open for now and work on an updated dataset that could be released in the late spring. I'll take the lead on this (as it will involve some cleanup of our existing examples). Again, my apologies for letting this error creep in.
On Jan 10, 2017, at 7:53 PM, Randall Pruim notifications@github.com wrote:
If you believe the PDF, I have already converted that into a data frame and we could simply do a replace. Some of the additional variables (spring, for example) could be computed from the dates.
Nicholas Horton Professor of Statistics Department of Mathematics and Statistics, Amherst College PO Box 5000, AC #2239 Amherst, MA 01002-5000
@nicholasjhorton, any updates on this data situation?
To do:
Also fix bad URL in documentation.