Closed by ethanwhite 5 years ago
If I remove the SQLite cache file and rerun I get the same end point with a little extra output first:
target maizuru_data
target jornada_data
target sgs_data
target portal_data
Target portal_data messages:
Loading in data version 1.97.0
target bbs_data
target sdl_data
Warning: target sdl_data warnings:
Didn't find any downloaded data in ~/veg-plots-sdl.
Did you run get_retriever_data() first?
fail sdl_data
Error: Target `sdl_data` failed. Call `diagnose(sdl_data)` for details. Error message:
no applicable method for 'select_' applied to an object of class "NULL"
Execution halted
Warning message:
system call failed: Cannot allocate memory
We use an environment variable in MATSS to control where downloaded datasets go, and that same variable is used when reading them in (it defaults to ~ if it isn't set).
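A minimal sketch of that lookup-with-default behavior, using only base R. The variable name `EXAMPLE_DATA_PATH` and the helper `get_data_path()` are hypothetical, chosen for illustration; the actual name MATSS uses is not given in this thread.

```r
# Hypothetical variable name, for illustration only; not MATSS's real one.
data_path_var <- "EXAMPLE_DATA_PATH"

# Return the configured data directory, falling back to "~" when unset.
get_data_path <- function(var = data_path_var) {
  # Sys.getenv() returns the `unset` value if the variable is not defined
  Sys.getenv(var, unset = "~")
}

# With the variable unset, the home directory is used.
Sys.unsetenv(data_path_var)
default_path <- get_data_path()

# With the variable set, both download and read steps would see the same path.
Sys.setenv(EXAMPLE_DATA_PATH = "analysis/data")
custom_path <- get_data_path()
```

Because the download step and the read step consult the same variable, data installed by hand into a different directory will not be found unless the variable points there too.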
I've noted this issue in weecology/MATSS#106. After getting that functionality set up, we should only need to add a config file here to point the pipeline at analysis/data.
Thanks. So if the install_retriever_data calls were actually getting executed and folder_path was being set, then everything would have ended up in the right place and worked?
For now it sounds like I need to move the data directories into ~, so I'll go ahead and try that.
OK, so now everything is working locally, but still not working on HiPerGator. Here's what I get on HiPerGator with the data in ~:
(r-reticulate) [ethanwhite@dev1 MATSS-LDATS]$ Rscript analysis/pipeline.R
...
Error: Failed to make a grid of grouping variables for map().
Grouping variables in map() must have suitable lengths for coercion to a data frame.
Possibly uneven groupings detected in map(fun = list(ts), data = list(maizuru_data, jornada_data, sgs_data,
portal_data, bbs_data, sdl_data, mtquad_data), lda = list(
analysis_lda_maizuru_data, analysis_lda_jornada_data, analysis_lda_sgs_data,
analysis_lda_portal_data, analysis_lda_bbs_data, analysis_lda_sdl_data)):
ts
c("maizuru_data", "jornada_data", "sgs_data", "portal_data", "bbs_data", "sdl_data", "mtquad_data")
c("analysis_lda_maizuru_data", "analysis_lda_jornada_data", "analysis_lda_sgs_data", "analysis_lda_portal_data", "analysis_lda_bbs_data", "analysis_lda_sdl_data")
Execution halted
The error happens here:
https://github.com/weecology/MATSS-LDATS/blob/master/analysis/pipeline.R#L36
It looks like for some reason mtquad_data isn't included in lda_targets once it reaches build_ts_analysis_plan, which causes the map to fail? But that dataset is in lda_targets while in pipeline.R, so I'm confused and hoping this makes sense with a little more knowledge of the codebase.
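One way to pin down which target is missing is to compare the two name vectors from the error message directly. This is a sketch in base R, assuming the `analysis_lda_<dataset>` naming pattern visible in that message; the variable names here are illustrative and not taken from the package.

```r
# Target names copied from the error message above.
data_targets <- c("maizuru_data", "jornada_data", "sgs_data", "portal_data",
                  "bbs_data", "sdl_data", "mtquad_data")
lda_targets <- c("analysis_lda_maizuru_data", "analysis_lda_jornada_data",
                 "analysis_lda_sgs_data", "analysis_lda_portal_data",
                 "analysis_lda_bbs_data", "analysis_lda_sdl_data")

# Assumed naming pattern: each dataset has a matching "analysis_lda_" target.
expected_lda <- paste0("analysis_lda_", data_targets)

# Any expected LDA target that is absent explains the uneven grouping lengths.
missing_lda <- setdiff(expected_lda, lda_targets)
lengths_match <- length(data_targets) == length(lda_targets)
```

Here `missing_lda` contains only the mtquad entry, which matches the 7-versus-6 mismatch reported by map().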
Ah, I updated the package code and the pipeline script a few days ago, so I think you're running the slightly older version of the package with the newer version of the pipeline script.
Can you try re-installing MATSS-LDATS and running again?
That fixed it. Thanks!
We are now successfully running on HiPerGator. Next step is to do a scheduled build.
I installed both MATSS and MATSS-LDATS using devtools. The data installation steps currently appear to be commented out (they are wrapped in an if (FALSE) statement), so I did the installs manually into the analysis/data directory. In the current state of that directory, the appropriate CSV files installed by the retriever are in each of the dataset-named folders.
When I try to run the pipeline.R script, this is what I get.