IQSS / dataverse-client-r

R Client for Dataverse Repositories
https://iqss.github.io/dataverse-client-r
61 stars 24 forks source link

Add ability to download zip of multiple files #47

Closed adam3smith closed 4 years ago

adam3smith commented 4 years ago

Please ensure the following before submitting a PR:

This PR does three things:

  1. Fixes a regression from https://github.com/IQSS/dataverse-client-r/commit/5ec375b21d5f9c5bb884f611d5aeb89c702da37b that broke the ability to specify a vector of fileids to download
  2. Documents and test said functionality
  3. ~Adds an additional parameter unzip to get_file() to give users the option to download individual files (the current behavior) or a single .zip file (my preferred behavior).~ (see below)

I'll add more tests if you're OK with the general approach here, but wanted to check first. I'd also need access to the dataverse on demo to properly test (all tests using the demo DV currently fail as I assume you know)

wibeasley commented 4 years ago

This is cool. For now, do you mind submitting a PR that fixes the regression, but leaves out the zip function until we get some consensus around #48?

adam3smith commented 4 years ago

I've reverted the zip feature as requested and will chime in on #48

In writing tests, I noticed that the current function, which was written before zip downloads included folder hierarchies, doesn't work for datasets with folders. Given that @pdurbin also suggests in #46 that there are downsides to requesting the zip and unpacking it, I've re-written the logic to download the files sequentially instead, not using the zip functionality of the API at all. That passes all tests and seems to perform faster even with small datasets.

If this looks good, I can clean up a bit more, but I think the tests are pretty good as they are.

pdurbin commented 4 years ago

I've re-written the logic to download the files sequentially instead, not using the zip functionality of the API at all.

That's what I ultimately recommended at https://github.com/jupyter/repo2docker/pull/739#discussion_r325340692 😄

adam3smith commented 4 years ago

(I realize there's a problem with the test, will fix this after we've agreed on the rest of this)

adam3smith commented 4 years ago

@wibeasley ready for review here. Would be nice to merge soon so that we can start working on the larger issues in #48

wibeasley commented 4 years ago

Thanks, @adam3smith!

pdurbin commented 4 years ago

@adam3smith I was just having beers tonight with @mfenner at PIDapalooza and we both think it's awesome that you're contributing.

And I said we probably wouldn't have you as a contributor if this package didn't have a maintainer. So thank you @wibeasley!