ropensci / piggyback

:package: for using large(r) data files on GitHub
https://docs.ropensci.org/piggyback
GNU General Public License v3.0

String replacement of file path separator in `pb_upload()` #34

Closed by kevinykuo 5 years ago

kevinykuo commented 5 years ago

Currently, if we upload `data/foobar.rds`, we get `data.2ffoobar.rds`, which may throw people off (especially anyone unfamiliar with URL encoding, where `/` becomes `%2f`, which then turns into `.2f`). Should we consider just using the file name, or a different separator such as `_`?

kevinykuo commented 5 years ago

Actually, it seems the current scheme is what enables `pb_download("data/foobar.rds")` to work. I guess the end user isn't expected to look at the releases page to see what files are there, since we'll provide download instructions in the workflow. Closing...

cboettig commented 5 years ago

Right, `_` is probably common in users' own filenames, so we couldn't round-trip. Of course, we're banking on the assumption that the pattern `.2f` doesn't appear in filenames, so it's not ideal.
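
To make the round-trip concern concrete, here is a minimal sketch in Python (not piggyback's actual R code). The helper names `encode_path` and `decode_name` are hypothetical, and the sketch assumes the scheme is a plain string substitution of `/` with a token and back.

```python
# Hypothetical helpers for illustration only; piggyback itself is written in R
# and its real implementation may differ.
SEP_TOKEN = ".2f"  # token that stands in for "/" in the uploaded asset name


def encode_path(path: str, token: str = SEP_TOKEN) -> str:
    """Flatten a relative path into a single asset name."""
    return path.replace("/", token)


def decode_name(name: str, token: str = SEP_TOKEN) -> str:
    """Recover the relative path from an asset name."""
    return name.replace(token, "/")


# The round trip works when the token does not occur in the original path.
print(encode_path("data/foobar.rds"))               # data.2ffoobar.rds
print(decode_name(encode_path("data/foobar.rds")))  # data/foobar.rds

# With "_" as the token, an ordinary underscore is decoded as a separator.
print(decode_name(encode_path("data/my_file.rds", "_"), "_"))  # data/my/file.rds

# The ".2f" token fails the same way if a filename happens to contain it.
print(decode_name(encode_path("data/v1.2final.csv")))  # data/v1/inal.csv
```

The last case is the "not ideal" part: any token chosen this way can collide with a real filename; `.2f` is just less likely to than `_`.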

Arguably there's a better solution than the current hack. For example, one thing I think we could do is automatically tar/zip each file into its own archive, though you'd still need to ensure the archive has a unique filename. One could use a hash of the path it contained (or of the file contents, depending on whether you want the link to be stable with respect to location or with respect to object contents), though obviously that makes the name even more opaque.
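
A rough sketch of the two hashing options, again in Python rather than R. The function names, the `.tar.gz` extension, the choice of SHA-256, and the truncation to 16 hex characters are all assumptions made for illustration; none of them come from piggyback.

```python
import hashlib


def name_from_path(path: str, ext: str = ".tar.gz") -> str:
    """Archive name that is stable with respect to the file's location."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    # Truncation is an assumption for readability; it raises collision risk.
    return digest[:16] + ext


def name_from_contents(data: bytes, ext: str = ".tar.gz") -> str:
    """Archive name that is stable with respect to the file's contents."""
    digest = hashlib.sha256(data).hexdigest()
    return digest[:16] + ext


# Same path -> same name, even if the contents change.
print(name_from_path("data/foobar.rds"))

# Same bytes -> same name, even if the file is moved or renamed.
print(name_from_contents(b"example bytes standing in for a file"))
```

Either way, the asset name no longer depends on a separator token, at the cost of telling the reader nothing about what the file is.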

(Of course users could zip up their whole archive themselves, but then you lose the ability to download just specific files.)