Open briochemc opened 4 years ago
I just ran a model with dust and nepheloid and hydrothermal sources (so at least 3 big 'loads' plus loading ao.mat) which took 0.5 seconds to set up the problem and 9 seconds to solve the matrix inversion. So even with all of that loading it's a relatively small total proportion of the time, right?
Unless I'm missing something and this is a much bigger time-sink than I have understood, I prefer to re-load the data in each function because it makes the functions easier to understand and modify for other uses.
You're right, I don't think it will matter for the current setup, because these are in fact rather "small" loads. But it's a good general practice to not compute or load the same things again and again! 😄
BTW, the changes I'm thinking of should be rather minimal and not obfuscate things at all. I think we could just replace the
load('data/AEROSOLDEP')
type of calls with something like
global AEROSOLDEP
isempty(AEROSOLDEP) && load('data/AEROSOLDEP')
@profseth and @hengdiliang
As I mentioned somewhere else, I think we should remove all the
load
calls from the functions, but I'm not sure what is the best approach, so we should discuss this here (if you even want to change it at all, of course)I propose two alternatives:
Carrying extra argument(s)
In each function to hold these items. If it was all up to me, I think I'd try something like turning all the
into
where
BGC
is just my renameddo
(becausedo
is not self-explanatory) and holds the usual parameters, e.g.,BGC.revscav.on
, and so on.OCIM
holds the transport circulation and the grid information, so basically it'sao
(but instead of loading it many times, we load it once and for all) andwater_transport
.DATA
holds the required external data, e.g., the Mahowald dust 2D map gridded to the OCIM. Again, having it passed as an argument means that it's not loaded every time the computation needs the dust source.DATA
could also hold things like other tracers, like stuff for the cost function, etc. Basically it would hold everything else thanOCIM
andBGC
.Use global variables.
OCIM circulation and grid, BGC flags, and external data don't change at all while we run OCIM models. Following the 1st solution above, we could just use
everywhere, and turn
DATA
,OCIM
, andBGC
into global variables, except for a few "optimizable" parameters that would live inp
. Then we could keep theload
statements in each of these functions, but have theload
statement only trigger if the data is not already available inDATA
,OCIM
, orBGC
. So something like that, for example:I think this would prevent redundant
load
calls, and could save a lot of time.FWIW, I prefer option 2 in theory, but I've been told to avoid globals, so there's that.
What do you guys think?