DOI-USGS / scipiper

https://usgs-r.github.io/scipiper/index.html
Creative Commons Zero v1.0 Universal

Create another package-level combiner, "combine_to_tibble" or similar, that returns an object #140

Closed jordansread closed 4 years ago

jordansread commented 4 years ago

combine_to_ind(ind_file, ...) returns a file. But there are similar cases where you want to do the same thing but return an R object (e.g., a tibble with filepath and hash columns).

Perhaps something like combine_to_tibble(...) (or combine_to_table, to avoid the tibble/data.frame naming question)?

I thought maybe this is what combine_to_ind could do if you specify ind_file = NULL, but that wouldn't work with the current version of the combiner functions, which don't have much flexibility in their arguments.

aappling-usgs commented 4 years ago

Cool, Jeff suggested something similar this week that you may be referring to (https://github.com/jsadler2/ds-pipelines-3/pull/11#issuecomment-657057885).

I thought this would be extremely quick, but some silly decisions about how to treat 1 (and 0?) files in sc_indicate came home to roost when I tried to write combine_to_tibble the simple way.

Example for some temp files: for multiple files, we get list names = filenames and list elements = hashes:

> sc_indicate('', data_file=tfiles[1:3])
$`/var/folders/_0/fbg0ffkj6z3fb_7jvm8wg86r002c5h/T//RtmpDIhLsH/139e03beea864`
[1] "873fabd21220be3cef08dd074dbd8f19"

$`/var/folders/_0/fbg0ffkj6z3fb_7jvm8wg86r002c5h/T//RtmpDIhLsH/239e031715bf9`
[1] "d41d8cd98f00b204e9800998ecf8427e"

$`/var/folders/_0/fbg0ffkj6z3fb_7jvm8wg86r002c5h/T//RtmpDIhLsH/339e094d51cb`
[1] "aa09a0b52ac9414d807e3a13275cdd3d"

but for just one file we get a list where the name is hash instead of the filename:

> sc_indicate('', data_file=tfiles[1])
$hash
[1] "873fabd21220be3cef08dd074dbd8f19"

Should we change the behavior of sc_indicate or write around it with combine_to_tibble (and also combine_to_ind?)?

aappling-usgs commented 4 years ago

I'm leaning toward working around it, because changing sc_indicate could cause a lot of unnecessary rebuilds in existing projects. It feels untidy, though.
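
For illustration, the workaround could look roughly like the sketch below. It is not scipiper code: indicate_to_table is a hypothetical helper, the file names are made up, and it assumes only the two list shapes shown above (names = filenames for several files, a single element named hash for one file). It takes the list that sc_indicate('', data_file = files) would return rather than calling sc_indicate itself, and it returns a base data.frame to sidestep the tibble/data.frame naming question.

```r
# Hypothetical helper (not part of scipiper): turn the list returned by
# sc_indicate('', data_file = files) into a table of file paths and hashes.
indicate_to_table <- function(files, ind) {
  # One-file case: the single element is named "hash", so rename it to the file
  if (length(files) == 1 && identical(names(ind), "hash")) {
    names(ind) <- files
  }
  # Look hashes up by filename; as.character() turns NULL (zero files) into character(0)
  hashes <- as.character(unlist(ind[files], use.names = FALSE))
  # Fail loudly if any file has no matching hash
  stopifnot(length(hashes) == length(files))
  data.frame(filepath = files, hash = hashes, stringsAsFactors = FALSE)
}

# Lists shaped like the outputs above, with made-up file names
multi <- list("a.txt" = "873fabd21220be3cef08dd074dbd8f19",
              "b.txt" = "d41d8cd98f00b204e9800998ecf8427e")
single <- list(hash = "873fabd21220be3cef08dd074dbd8f19")

tbl_multi <- indicate_to_table(c("a.txt", "b.txt"), multi)
tbl_single <- indicate_to_table("a.txt", single)
tbl_empty <- indicate_to_table(character(0), list())
```

All three calls return the same two columns (filepath, hash), so downstream code would not need to special-case the number of files, and sc_indicate itself stays unchanged.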