Open karlmsmith opened 6 years ago
Comment by @AnsleyManke on 18 May 2017 00:32 UTC Ferret has always split up many sorts of calculations in directions other than the one we're doing the transformation on. The new feature is that it can now split along the transformed axis. Here's an example where older versions of Ferret would split the calculation on the Y axis, doing the time average for all-T on subsets in Y and then putting the result back together.
In v7.12, it'll split in T which should be more efficient, splitting in the "slowest" direction of the grid. If we were loading data from different time-step files it would mean opening each file just once instead of multiple times.
Here's a case where the output of SET MODE DIAGNOSTIC is pretty clear. Try this with a newer and older version.
set mode diagnostic
set mem/siz=10
use a.nc
load a[l=@ave]
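The difference between the two splitting strategies can be sketched outside Ferret. The following Python is a conceptual illustration only, not Ferret's implementation: `make_reader`, `time_average_split_t`, and `time_average_split_y` are hypothetical names, and each "time step" is a tiny synthetic field standing in for one time-step file. It shows that both strategies give the same average, but splitting along T reads each time step once, while splitting along Y re-reads every time step for each Y subset.

```python
def make_reader(n_points):
    """Return a fake per-time-step reader plus a counter of how often it was called."""
    reads = {"count": 0}

    def read_step(l):
        # Stand-in for opening one time-step file; the values are synthetic.
        reads["count"] += 1
        return [float(l + i) for i in range(n_points)]

    return read_step, reads


def time_average_split_t(read_step, n_steps, chunk):
    """Time average computed by walking T in chunks; each step is read once."""
    total = None
    for start in range(0, n_steps, chunk):
        for l in range(start, min(start + chunk, n_steps)):
            field = read_step(l)
            if total is None:
                total = [0.0] * len(field)
            for i, value in enumerate(field):
                total[i] += value
    return [s / n_steps for s in total]


def time_average_split_y(read_step, n_steps, n_points, chunk):
    """Time average computed on subsets of grid points; every subset re-reads all steps."""
    result = []
    for start in range(0, n_points, chunk):
        stop = min(start + chunk, n_points)
        sums = [0.0] * (stop - start)
        for l in range(n_steps):
            field = read_step(l)
            for i in range(start, stop):
                sums[i - start] += field[i]
        result.extend(s / n_steps for s in sums)
    return result


if __name__ == "__main__":
    read_t, reads_t = make_reader(6)
    read_y, reads_y = make_reader(6)
    avg_t = time_average_split_t(read_t, n_steps=4, chunk=2)
    avg_y = time_average_split_y(read_y, n_steps=4, n_points=6, chunk=2)
    print(avg_t == avg_y)                      # True
    print(reads_t["count"], reads_y["count"])  # 4 12
```

In this toy setup the Y split needs three passes over the four time steps (12 reads), while the T split needs one pass (4 reads), which is the "opening each file just once" effect described above.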
Comment by steven.c.hankin on 18 May 2017 16:21 UTC Ansley nailed the principle here. If you offer the new Ferret memory management enough room to perform the calculation by splitting along a slower axis, it will utilize that option. The algorithms do not strive for minimum memory usage; they strive to complete the requested operation in the most efficient manner possible.
If you want to experiment with this, try the same command repeatedly with diagnostic mode on, offering successively smaller memory.
Comment by @AndrewWittenberg on 18 May 2017 16:51 UTC I see -- so for my previous example, it was entirely an improvement in efficiency, not capacity.
Then an example of improved capacity would be if we were to transform along all axes. So while the following failed in earlier versions:
NOAA/PMEL TMAP
FERRET v6.9
Linux 2.6.32-431.5.1.el6.x86_64 64-bit - 04/03/14
18-May-17 12:41
yes? set mem/size=10
Cached data cleared from memory
yes? use a.nc
yes? load a[x=@ave,y=@ave,t=@ave]
**ERROR: request exceeds memory setting: 100 Mwords were requested.
*** NOTE: You can use SET MEMORY/SIZE=xxx to increase memory.
*** NOTE: The "Memory use" section of the FERRET Users Guide has further tips.
it succeeds in the new version:
NOAA/PMEL TMAP
FERRET v7.12 (optimized)
Linux 2.6.32-696.1.1.el6.x86_64 64-bit - 05/16/17
18-May-17 12:41
yes? set mem/size=10
yes? use a.nc
yes? load a[x=@ave,y=@ave,t=@ave]
Note that we can't do the following though:
NOAA/PMEL TMAP
FERRET v7.12 (optimized)
Linux 2.6.32-696.1.1.el6.x86_64 64-bit - 05/16/17
18-May-17 12:44
yes? set mem/size=10
yes? let a = x[gx=1:1000:1] + y[gy=1:1000:1] + t[gt=1:100:1]
yes? load a[x=@ave,y=@ave,t=@ave]
**ERROR: request exceeds memory setting
To fulfill this request would exceed the current SET MEMORY/SIZE= limit of 10 megawords
At the moment that the memory limit was reached
memory was committed as follows:
- to objects used in computation: : 1000100 (10%)
The size of the requested object was: : 100000000 (1000%)
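The figures in this error message can be checked with a few lines of arithmetic. This is a small Python sketch, assuming that one megaword is 1,000,000 words (which is consistent with the percentages printed above); `percent_of_limit` is a hypothetical helper, not a Ferret function.

```python
WORDS_PER_MEGAWORD = 1_000_000  # assumption, consistent with the message above


def percent_of_limit(words, limit_megawords):
    """Express a word count as a percentage of the SET MEMORY/SIZE limit."""
    return 100.0 * words / (limit_megawords * WORDS_PER_MEGAWORD)


if __name__ == "__main__":
    requested = 1000 * 1000 * 100  # full X-Y-T extent of "a"
    print(requested)                                # 100000000
    print(round(percent_of_limit(requested, 10)))   # 1000
    print(round(percent_of_limit(1_000_100, 10)))   # 10
```

So the fully instantiated variable is ten times larger than the 10 megaword limit, while the objects already committed account for about a tenth of it.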
Comment by steven.c.hankin on 18 May 2017 17:19 UTC With SET MEMORY/SIZE=xxx we are telling Ferret "xxx is the amount of memory that we'd like you to use", rather than "use as little memory as possible, but under no circumstance exceed xxx".
I'm going to step back from the discussion a little now, because my time is running short and I'm trying to finish something up, but on the example below I'd add this bit of information/insight:
I recall that in the algorithm of SETUP_GATHER I only carried its optimization logic out to two axes -- not three. I think what I did in that situation was to choose the longest two axes ... but I may be remembering wrong; maybe it was the slowest two axes of length greater than 1. In any case, the third axis is not ignored -- 3-axis splitting can still occur -- it's just that I didn't try to find the optimal strategy in a 3-axis analysis. It just got too complicated. But it could be done ...
I'm reluctant to be giving you off-the-cuff, hurried answers, because I know you are a very careful thinker ... I may be missing your point. But here goes anyway ... In your final example -- where the failure occurs for "a" as a LET definition -- there is no way for Ferret to succeed given its internal machinery. The definition of "a" is a product of 3 fixed-length arrays -- 1000x1000x100 = 100 Megawords. It looks to me like this particular "a" cannot be created except by having it fully instantiated. I think (until you prove me wrong) that if you defined an actual T axis of the same length, the memory management might be able to kick in:
define axis/t=1:10000:1 tax ! doesn't matter how long
let a = x[gx=1:1000:1] + y[gy=1:1000:1] + t[gt=tax]
load a[l=1:100@ave]
In this form the XY footprint is fixed (1000x1000), but the T indexing is free for Ferret to play with.
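As a rough illustration of what "free to play with" could mean in numbers, the Python sketch below estimates how many time steps of a fixed XY footprint fit under a memory limit, and how many T chunks that implies. It is a back-of-the-envelope upper bound under stated assumptions (one megaword is 1,000,000 words; storage for the result and any other overhead is ignored). It does not reproduce the logic of SETUP_GATHER, and `t_chunks` is a hypothetical name.

```python
import math


def t_chunks(nx, ny, nt, limit_megawords):
    """Return (time steps per chunk, number of chunks) for a fixed XY footprint."""
    footprint = nx * ny                    # words needed per time step
    limit = limit_megawords * 1_000_000    # assumed words per megaword
    steps_per_chunk = limit // footprint
    if steps_per_chunk == 0:
        raise ValueError("a single time step does not fit in the memory limit")
    return steps_per_chunk, math.ceil(nt / steps_per_chunk)


if __name__ == "__main__":
    # 1000 x 1000 footprint, l=1:100, SET MEMORY/SIZE=10
    print(t_chunks(1000, 1000, 100, 10))  # (10, 10)
```

Under these assumptions the 100-step average would need at least 10 passes of at most 10 time steps each; in practice fewer steps would fit per chunk once the result and intermediate objects are accounted for.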
Comment by @AndrewWittenberg on 18 May 2017 17:44 UTC (No need to respond if you're busy -- this isn't urgent at all.)
Thanks Steve, that was on the right track. It works if ALL of the axes are fixed axes:
NOAA/PMEL TMAP
FERRET v7.12 (optimized)
Linux 2.6.32-696.1.1.el6.x86_64 64-bit - 05/16/17
18-May-17 13:25
yes? set mem/size=10
yes? define axis/x=1:1000:1 xax
yes? define axis/y=1:1000:1 yax
yes? define axis/t=1:100:1 tax
yes? let a = x[gx=xax] + y[gy=yax] + t[gt=tax]
yes? load a[x=@ave,y=@ave,t=@ave]
Earlier Ferret versions run out of memory for that example.
Note that it doesn't work if any of the axes are dynamic, as in your original suggestion:
NOAA/PMEL TMAP
FERRET v7.12 (optimized)
Linux 2.6.32-696.1.1.el6.x86_64 64-bit - 05/16/17
18-May-17 13:38
yes? set mem/size=10
yes? define axis/x=1:1000:1 xax
yes? define axis/t=1:100:1 tax
yes? let a = x[gx=xax] + y[gy=1:1000:1] + t[gt=tax]
yes? load a[x=@ave,y=@ave,t=@ave]
**ERROR: request exceeds memory setting
To fulfill this request would exceed the current SET MEMORY/SIZE= limit of 10 megawords
At the moment that the memory limit was reached
memory was committed as follows:
- to objects used in computation: : 1000100 (10%)
The size of the requested object was: : 100000000 (1000%)
Reported by @AndrewWittenberg on 18 May 2017 00:06 UTC I'm having trouble with the example on page 4 of the dynamic memory management notes at:
https://docs.google.com/document/d/1DiFRaxGYJ2p-HgKYn8w9jNOAj0E9Jdj6wZjjVihTZSw
First create a file containing 1e8 (double precision) numbers:
Then compute a time average:
Why is the cache 7 Mw, when the result should be only 2 Mw (1e6 double-precision numbers)?
And in an older version of Ferret (say, 6.9) without the dynamic splitting, I get:
I thought the old version would've tried to load the whole 200 Mw of data into the 10 Mw of memory, to compute the 2 Mw result. So why was no error issued there?
Migrated-From: http://dunkel.pmel.noaa.gov/trac/ferret/ticket/2535