ropensci / bikedata

:bike: Extract data from public hire bicycle systems
https://docs.ropensci.org/bikedata

Add sf cpp boston parsing #64

Closed tbuckl closed 6 years ago

tbuckl commented 6 years ago

hey @mpadge this adds parsing for sf, but it may also introduce a bug. totally understand if you want to dig into the bug below before merging.

Solution

You can do the following now:

library(bikedata)
data_dir <- tempdir()
bikedb <- file.path (data_dir, 'bikedb')
dl_bikedata (city = 'la', data_dir = data_dir)
dl_bikedata (city = 'sf', data_dir = data_dir)
store_bikedata (data_dir = data_dir, bikedb = bikedb, city='la')
store_bikedata (data_dir = data_dir, bikedb = bikedb, city='sf')

and you get stations and trips for la.

you can also then load other cities e.g. boston:

dl_bikedata (city = 'bo', data_dir = data_dir)
store_bikedata (data_dir = data_dir, bikedb = bikedb, city='bo')

Bug

unfortunately, i did notice that if you do this:

data_dir <- tempdir()
bikedb <- file.path (data_dir, 'bikedb')
dl_bikedata (city = 'bo', data_dir = data_dir)
dl_bikedata (city = 'sf', data_dir = data_dir)
store_bikedata (data_dir = data_dir, bikedb = bikedb, city='sf')

Then store_bikedata starts trying to load 11 CSV files, even though there are only 4 for 'sf' so far.

Then the R session crashes. It seems that store_bikedata picks up all CSV files in data_dir, not just the ones for the requested city.
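One possible workaround, sketched here under assumptions, is to give each city its own download directory so that no directory ever mixes CSV files from two providers. The helper `city_dir` and the `RUN_BIKEDATA` environment variable are hypothetical names introduced only for this illustration. The `dl_bikedata` and `store_bikedata` calls use the same arguments as the examples above, and they are guarded because they need network access and the installed package.

```r
# Sketch of a workaround: one download directory per city.
cities <- c("bo", "sf")
base_dir <- tempdir()
bikedb <- file.path(base_dir, "bikedb")

# Hypothetical helper (not part of bikedata): create and return a
# per-city subdirectory of `base`.
city_dir <- function(city, base = base_dir) {
  d <- file.path(base, paste0("bikedata_", city))
  dir.create(d, showWarnings = FALSE, recursive = TRUE)
  d
}

# Named character vector: names are the city codes, values are the paths.
dirs <- vapply(cities, city_dir, character(1))

# RUN_BIKEDATA is an assumed opt-in switch for this sketch only; the calls
# below download real data, so they are skipped unless it is set.
if (nzchar(Sys.getenv("RUN_BIKEDATA")) &&
    requireNamespace("bikedata", quietly = TRUE)) {
  for (city in cities) {
    bikedata::dl_bikedata(city = city, data_dir = dirs[[city]])
    bikedata::store_bikedata(data_dir = dirs[[city]], bikedb = bikedb,
                             city = city)
  }
}
```

This sidesteps the crash rather than fixing it, since each store_bikedata call only ever sees the files of one city.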

I guess there are a few options here:

1) fix the way that store_bikedata selects CSVs?
2) tell people specifically not to download a bunch of CSVs for 2 providers and then process just 1?

probably (1) is ideal. i'll look into it a bit.
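A rough sketch of what option (1) could look like, in base R only: list the CSV files in the data directory and keep just those whose names match the requested city. This is not how store_bikedata is implemented. The helper `files_for_city`, the `city_patterns` lookup, and the file names in the demo are all hypothetical, and the real provider file names would need checking against actual downloads.

```r
# Hypothetical file-name fragments per city; the real names used by each
# provider may differ.
city_patterns <- c(bo = "hubway", sf = "fordgobike")

# Hypothetical helper (not part of bikedata): return only the CSV files in
# `data_dir` whose base name contains the fragment for `city`.
files_for_city <- function(data_dir, city) {
  csvs <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
  csvs[grepl(city_patterns[[city]], basename(csvs), fixed = TRUE)]
}

# Demonstration with empty placeholder files in a scratch directory.
demo_dir <- file.path(tempdir(), "filter_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("201801_hubway_tripdata.csv",
                "201802_hubway_tripdata.csv",
                "201801_fordgobike_tripdata.csv")
invisible(file.create(file.path(demo_dir, demo_files)))

# Only the file matching the 'sf' fragment is selected.
sf_files <- files_for_city(demo_dir, "sf")
basename(sf_files)
```

With a filter like this in front of the import step, a directory holding downloads for both 'bo' and 'sf' would hand only the 'sf' files to the parser.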

tbuckl commented 6 years ago

closing until tests pass

tbuckl commented 6 years ago

hey @mpadge i ran tests on this locally and they all passed.

i'm reopening this because this works now:

library (bikedata)
store_bikedata (city = 'sf', bikedb = 'bikedb')
tm <- bike_tripmat (bikedb = 'bikedb', city='sf')
dim (tm); format (sum (tm), big.mark = ',')
bike_summary_stats (bikedb = 'bikedb')
daily_trips_sf <- bike_daily_trips (bikedb = 'bikedb', city = 'sf')

library(ggplot2)
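# qplot() is deprecated in recent ggplot2 releases, so build the plot with ggplot()
ggplot(daily_trips_sf, aes(x = date, y = numtrips)) +
      geom_point(na.rm = TRUE) +
      geom_line(na.rm = TRUE) +
      labs(title = "Daily trips for SF Bikeshare (2017-2018)",
           x = "Date", y = "Number of Trips")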

and that use seems useful/consistent enough with the docs. but what do you think?

tbuckl commented 6 years ago

it could be that dl_bikedata just shouldn't be exported?

mpadge commented 6 years ago

That looks great - i'll have time tomorrow to go through it and merge. Thanks a load in advance for the great work!

mpadge commented 6 years ago

@tibbl35 Can you please do another PR in which you add your name to the authors list? Just copy and modify this line, and use role = "aut" :+1: