AustralianAntarcticDivision / raadfiles

Data library management tools for files
https://australianantarcticdivision.github.io/raadfiles/

simplify data library set up #10

Closed mdsumner closed 5 years ago

mdsumner commented 6 years ago

Currently we have to run all of these, in order:

raadfiles::set_raad_data_roots(my_data_dir)      
raadfiles:::run_this_function_to_build_raad_cache()
raadfiles:::set_raad_filenames()

The first should trigger invalidation of the others, though perhaps with an opt-out or a yes/no prompt for step 2.

raymondben commented 6 years ago

Could set_raad_data_roots() take a build_cache_if_empty parameter that defaults to TRUE, so that pointing to a new data directory automatically triggers its file list build? And also a refresh_cache parameter that defaults to FALSE, so you have to opt in to that one if you're using a data dir that you've used previously. In either case, yes, calling set_raad_data_roots should trigger set_raad_filenames.
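
A minimal sketch of how those two parameters might interact, in base R only. This is not the raadfiles implementation: set_roots_sketch, the ".file_cache.rds" cache location, the "sketch.data.roots" option and the default build_cache function are all hypothetical names chosen for illustration.

```r
## Hypothetical sketch only, not the raadfiles API.
set_roots_sketch <- function(root,
                             build_cache_if_empty = TRUE,
                             refresh_cache = FALSE,
                             build_cache = function(root) list.files(root, recursive = TRUE)) {
  ## assumed cache location: a hidden file inside the data root
  cache_file <- file.path(root, ".file_cache.rds")
  cache_exists <- file.exists(cache_file)

  ## build when the root is new (default), or rebuild only on explicit opt-in
  if ((!cache_exists && build_cache_if_empty) || refresh_cache) {
    saveRDS(build_cache(root), cache_file)
  }

  ## record the root (hypothetical option name)
  options(sketch.data.roots = root)

  ## always reload the file list, mirroring "should trigger set_raad_filenames"
  files <- if (file.exists(cache_file)) readRDS(cache_file) else character(0)
  invisible(files)
}
```

With these defaults, pointing at a fresh directory builds the file list once, a second call reuses the stored list even if files were added, and refresh_cache = TRUE forces a rebuild.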

-- ignore this one, see https://github.com/AustralianAntarcticDivision/raadtools/pull/80/commits/cd2e12151273597ec4751a76c22d25a0ad8a5db5

raymondben commented 6 years ago

Random comments looking for a home:

mdsumner commented 6 years ago

I actually got in a fluff because I was running the job and updating the package at the same time. Not a good idea, but it did make me wonder whether raadfiles should never trigger these tasks on load, and rely only on other packages or an admin to actually run them. That's probably safer, because otherwise the code scheduled to be run depends on the load occurring successfully, and you can get stuck and then need another worker. An env flag, "no-onload-admin", could be set prior to an automated task, and that would be set for any scheduled task. I'm undecided.
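
A rough sketch of the env-flag idea, assuming the flag is exposed as an environment variable. The variable name RAADFILES_NO_ONLOAD_ADMIN and both function names are hypothetical; nothing here is taken from raadfiles itself.

```r
## Hypothetical sketch only: variable and function names are assumptions.
admin_tasks_allowed <- function() {
  flag <- Sys.getenv("RAADFILES_NO_ONLOAD_ADMIN", unset = "")
  ## any of these values means "skip admin work on load"
  !(tolower(flag) %in% c("1", "true", "yes"))
}

## stand-in for what a package load hook could do with the flag
on_load_sketch <- function(admin_task = function() message("running admin task")) {
  if (admin_tasks_allowed()) {
    admin_task()
    TRUE
  } else {
    FALSE
  }
}
```

A scheduled job would set the variable before loading the package, for example with Sys.setenv(RAADFILES_NO_ONLOAD_ADMIN = "true"), so that loading never kicks off cache building on its own.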

mdsumner commented 6 years ago

It seems ok atm, I'll just stay away from cron until I'm sure. :)

mdsumner commented 6 years ago

Pretty fine. We have to set triggers for mount, because they are lost when the mothership reboots.

mdsumner commented 6 years ago

Ok, now we have this for a first-time setup, after running bowerbird in "C:/temp/data":

options(raadfiles.file.cache.disable = TRUE)  ## only really needed when raadfiles will find existing stuff
library(raadfiles)
set_raad_data_roots("C:/temp/data")
run_build_raad_cache()
set_raad_filenames()

mdsumner commented 6 years ago

For subsequent usage, a session looks like:

options(raadfiles.data.roots = "C:/temp/data")
library(raadfiles)

but note that this will also find other roots via the internal heuristics if they are available, so watch out for that.

mdsumner commented 5 years ago

I do believe we achieved this