ropensci / stats19

R package for working with open road traffic casualty data from Great Britain
https://docs.ropensci.org/stats19
GNU General Public License v3.0

Downloading vehicles from 2020 #237

Open BlaiseKelly opened 3 months ago

BlaiseKelly commented 3 months ago

This is a reopening of issue 216 (https://github.com/ropensci/stats19/issues/216), which had a workaround but does not appear to have been fixed.

vehicles = get_stats19(year = 2020, type = "vehicle", ask = FALSE, format = TRUE)

The problem seems to be related to the locate_file functions, which are difficult to debug because there are so many nested functions.

However, the dl_stats19 function either downloads the data or identifies the file in the local directories. Either way, the directory of the data is known by the end of this function. So it might simplify things if it just returned the full path of the one file being requested (only one year or type can be requested at any one time). This would also make many of the read functions unnecessary.
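To illustrate the idea of returning a single full path, here is a minimal sketch in base R. It is not the code in the pull request and not the stats19 API: `find_one_file()` is a hypothetical helper, and the file name pattern is an assumption based on the 2020 vehicle file name quoted in this thread.

```r
# Hypothetical helper, not part of stats19: resolve one file path for a year and type.
find_one_file = function(data_dir, year, type) {
  # Assumed naming pattern, taken from the 2020 vehicle file named in this thread
  file_name = sprintf("dft-road-casualty-statistics-%s-%s.csv", type, year)
  path = file.path(data_dir, file_name)
  # Fail early with a clear message instead of passing a bad path to a reader
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  path
}

# Usage, with a temporary directory standing in for the download directory
data_dir = tempfile("stats19-demo-")
dir.create(data_dir)
file.create(file.path(data_dir, "dft-road-casualty-statistics-vehicle-2020.csv"))
file.create(file.path(
  data_dir,
  "dft-road-casualty-statistics-vehicle-e-scooter-2020-Latest-Published-Year.csv"
))
vehicle_path = find_one_file(data_dir, year = 2020, type = "vehicle")
basename(vehicle_path)
```

Because the name is built exactly rather than matched by pattern, the e-scooter file in the same directory is never considered, so no interactive prompt is needed.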

Also, it seems like 2020 is the only year that has two files for a specific type. It looks like the e-scooter dataset is unique to 2020. It is nice to have this data, but I think it makes sense that if ask = FALSE then the download defaults to dft-road-casualty-statistics-vehicle-2020.csv, which is the same file as is available for other years after 2017.

Have submitted a pull request for these changes.

Robinlovelace commented 3 months ago

Thanks for flagging the issue and for the PR. I have also noticed some weird behaviour, but have had limited time to look into it. So I'm very grateful for the report and fix, and will take a look and test ASAP.

layik commented 2 months ago

Having spent some time on this, I think I know the issue here @BlaiseKelly. The core of the issue is that for that year, and any other year with "escooter" data, two files are shown, like:

Multiple matches. Which do you want to download?

1: dft-road-casualty-statistics-vehicle-e-scooter-2020-Latest-Published-Year.csv
2: dft-road-casualty-statistics-vehicle-2020.csv

stats19 currently reads the first file. I am not sure if the package is even equipped to deal with e-scooter data. Sorry, I cannot send a fix yet.

Robinlovelace commented 2 months ago

Let's just remove the e-scooter files from the list for now is my thinking.
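As a rough sketch of what removing the e-scooter files from the list could look like, assuming the candidates are held as a character vector of file names (the helper name `drop_escooter()` is hypothetical and not part of stats19):

```r
# File names as shown in the "Multiple matches" prompt earlier in this thread
candidates = c(
  "dft-road-casualty-statistics-vehicle-e-scooter-2020-Latest-Published-Year.csv",
  "dft-road-casualty-statistics-vehicle-2020.csv"
)

# Hypothetical helper, not part of stats19: drop any name containing "e-scooter"
drop_escooter = function(file_names) {
  # fixed = TRUE treats the pattern as a literal string, not a regular expression
  file_names[!grepl("e-scooter", file_names, fixed = TRUE)]
}

remaining = drop_escooter(candidates)
remaining
```

With only one candidate left for 2020, there would be no "Multiple matches" prompt and no risk of reading the first file by accident.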

BlaiseKelly commented 2 months ago

I pushed this fix a few weeks ago https://github.com/BlaiseKelly/stats19/commit/bcf15615aae51c65bdfdaa284ec12ef6bab6e1a7 which works for me.

I am using this version with no problems, but it fails the automatic CRAN checks, I think maybe because there are too many changes in one go.

Robinlovelace commented 2 months ago

Thanks Blaise, will look to merge ASAP. In fact, I will take a look at https://github.com/ropensci/stats19/pull/238 right now.

BlaiseKelly commented 2 months ago

I deleted the previous pull request and submitted another with only the code changes (and no qmd files!). Hopefully this is clearer.