Closed Aariq closed 1 year ago
Just to clarify, this means that every time get.trait.data() is run it writes new posteriors to BETY. Here's an example of running runModule.get.trait.data() twice in a row with settings$database$bety$write <- FALSE:
The first run registers 'prior.distns.csv', 'prior.distns.Rdata', 'species.csv', 'trait.data.csv', 'trait.data.Rdata' under posterior IDs 9000001246 and 9000001247 (one for each PFT), and the second run registers them again under IDs 9000001248 and 9000001249.
Actually, maybe the above comment is a separate bug? I'm not exactly sure what is supposed to happen here. Any insight @dlebauer?
Ok, tracked this down a bit more. runModule.get.trait.data() is passing settings$meta.analysis$update to the forceupdate argument of get.trait.data(). If settings$meta.analysis$update is TRUE it will write to BETY; if it is anything else (e.g. "AUTO" or "FALSE") it will not. It does not check settings$database$bety$write. Is this the correct behavior?
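To make the behavior described above concrete, here is a minimal sketch of the decision as I understand it. will_write_to_bety() is a hypothetical stand-in written for illustration, not a PEcAn function, and the coercion rule is an assumption based on what is reported in this thread.

```r
# Hypothetical stand-in for the logic described in this thread; not PEcAn code.
will_write_to_bety <- function(settings) {
  update <- settings$meta.analysis$update
  # Only TRUE (or the string "TRUE") survives; "AUTO", "FALSE", NULL, etc.
  # all collapse to FALSE.
  forceupdate <- isTRUE(as.logical(update))
  # Note that settings$database$bety$write is never consulted here.
  forceupdate
}

# The write flag is FALSE in both cases, but only `update` matters.
s_true <- list(meta.analysis = list(update = TRUE),
               database = list(bety = list(write = FALSE)))
s_auto <- list(meta.analysis = list(update = "AUTO"),
               database = list(bety = list(write = FALSE)))
will_write_to_bety(s_true)
will_write_to_bety(s_auto)
```

Under these assumptions the first call reports a write even though the write flag is FALSE, which is the mismatch this issue is about.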
So I think you've hit on a bit of code that's given us trouble for a long time. In terms of desired behavior, the trait query and MA should NOT be running every time the workflow is run. The fact that they tend to has resulted in a massive overproliferation of Posteriors records, huge numbers of which are virtually identical. In the early days of the project, when David had a whole team of folks populating the trait database, it made more sense to update the posteriors more frequently, but at this point it should probably only occur when the user explicitly asks for an update (i.e. the default for forceupdate should be FALSE). The AUTO mode, which aimed to only run the MA when the data has changed, never did this correctly and tended to always run.
Yeah, get.trait.data() doesn't do anything with "AUTO"; it converts anything other than "TRUE" to FALSE. But I'm confused about something: isn't the MA run by runModule.run.meta.analysis()? Why is get.trait.data() using settings$meta.analysis$update at all? Also, get.trait.data.pft() seems to have code to print messages that indicate the MA is getting re-run, but I don't see where there is code in that function to actually run the meta analysis.
IMO a function called get_* shouldn't write anything or do any analysis. None of this behavior is documented, which is partly why this is taking me so long to figure out.
The short-sighted fix is to give get.trait.data.pft() a write argument and have it inherit that from runModule.get.trait.data(settings).
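A rough sketch of what that short-sighted fix might look like. The *_sketch functions, the placeholder data frame, and the name/outdir fields on each PFT are assumptions made for illustration; the real get.trait.data.pft() has a different signature and does much more.

```r
# Hypothetical sketch of threading a `write` flag through; not PEcAn code.
get_trait_data_pft_sketch <- function(pft_name, outdir, write = FALSE) {
  # Placeholder for the real trait query.
  trait_data <- data.frame(pft = pft_name, trait = "SLA", value = NA_real_)
  if (isTRUE(write)) {
    # Only touch the file system (and, in the real code, BETY) when allowed.
    dir.create(outdir, recursive = TRUE, showWarnings = FALSE)
    utils::write.csv(trait_data, file.path(outdir, "trait.data.csv"),
                     row.names = FALSE)
  }
  trait_data
}

run_get_trait_data_sketch <- function(settings) {
  # Read the flag once from settings and pass it down to every PFT.
  write <- isTRUE(as.logical(settings$database$bety$write))
  lapply(settings$pfts, function(pft) {
    get_trait_data_pft_sketch(pft$name, pft$outdir, write = write)
  })
}

demo_dir <- file.path(tempdir(), "pft_write_flag_demo")
demo_settings <- list(
  database = list(bety = list(write = FALSE)),
  pfts = list(pft = list(name = "temperate.deciduous", outdir = demo_dir))
)
demo_result <- run_get_trait_data_sketch(demo_settings)
file.exists(file.path(demo_dir, "trait.data.csv"))
```

The drawback is that every function between the module runner and the eventual write has to carry the extra argument.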
A maybe better solution is to have read.settings() store the write tag as an environment variable or an option and have all the relevant PEcAn.DB functions check for that option / env variable before doing anything.
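The option-based idea could be sketched roughly like this. The option name pecan.bety.write and all three helper functions are hypothetical; nothing here is an existing PEcAn.DB API.

```r
# Hypothetical sketch of a global write guard; not PEcAn code.
set_bety_write_sketch <- function(settings) {
  # Would be called once, e.g. at the end of read.settings().
  options(pecan.bety.write = isTRUE(as.logical(settings$database$bety$write)))
  invisible(NULL)
}

bety_write_allowed_sketch <- function() {
  # Default to FALSE so nothing is written unless explicitly enabled.
  isTRUE(getOption("pecan.bety.write", default = FALSE))
}

register_posterior_sketch <- function(path) {
  if (!bety_write_allowed_sketch()) {
    message("BETY write disabled; skipping registration of ", path)
    return(invisible(FALSE))
  }
  # The real database insert would go here.
  invisible(TRUE)
}

set_bety_write_sketch(list(database = list(bety = list(write = FALSE))))
skipped <- register_posterior_sketch("trait.data.csv")
```

Defaulting the guard to FALSE when the option is unset is a design choice in this sketch, so a missing write tag fails safe rather than writing.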
Bug Description
get.trait.data() seems to write to BETY (not in documentation), but it doesn't seem to check for settings$database$bety$write.
To Reproduce
Run PEcAn workflow with settings$database$bety$write <- FALSE and check to see if file.path(settings$database$dbfiles, "posterior", settings$pfts$pft$posteriorid) exists.
Expected behavior
Nothing should be written to settings$database$dbfiles if settings$database$bety$write == FALSE.
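The check in the reproduction step could be scripted along these lines. posterior_dirs_written() is a hypothetical helper, and it assumes each entry of settings$pfts carries a posteriorid after the workflow has run.

```r
# Hypothetical helper for the reproduction check; not PEcAn code.
posterior_dirs_written <- function(settings) {
  # Collect the posterior ID of every PFT, not just the first one.
  ids <- unname(unlist(lapply(settings$pfts, function(pft) pft$posteriorid)))
  dirs <- file.path(settings$database$dbfiles, "posterior", as.character(ids))
  # TRUE for each posterior directory that exists on disk.
  stats::setNames(dir.exists(dirs), dirs)
}

check_dbfiles <- file.path(tempdir(), "dbfiles_check_demo")
check_settings <- list(
  database = list(dbfiles = check_dbfiles, bety = list(write = FALSE)),
  pfts = list(pft = list(posteriorid = "9000001246"),
              pft = list(posteriorid = "9000001247"))
)
written <- posterior_dirs_written(check_settings)
any(written)
```

With write set to FALSE, the expected behavior is that no entry in the result is TRUE after a workflow run.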