Open karlmsmith opened 6 years ago
Comment by @AnsleyManke on 3 Nov 2011 19:40 UTC
Recording a side conversation here; it points to possible directions for the future.
Roland comments: I wonder if it is necessary to burden the user with a "/remote" modifier? Maybe if Ferret determines it can be done remotely, do it; if not, start pulling over the data and do it locally. Maybe this is to allow the user to leave off the "/remote" and force it to be done by pulling the data over. /local?
Steve's thoughts: Interesting to think of the time that distributing calculations around will become so routine that the default action should be to do so. I think for now, though, we need to have the server-siding of a calculation be a user-initiated action. Here are two use cases to support this: 1) many LET definitions do not result in a net reduction in data volume, so there is not necessarily a benefit to server-siding; and 2) in general we do not know what jobs are running on what computer for what purpose and how much load is on each computer relative to the computer's power, or potential billing algorithms -- again making it uncertain whether server-siding is a for-sure good idea.
Comment by @AnsleyManke on 20 Feb 2013 17:50 UTC
Having mostly worked through the initial implementation of LET/REMOTE, here are a few questions to be resolved and notes for further development:
1) I have required LET/D with a remote definition, so that every remote variable is attached to a named dataset rather than to whichever dataset happens to be the current default. Is this what we want, or do we want to be able to open several remote datasets and have the LET/REMOTE definitions apply to whichever is the default at the moment?
If we require LET/D=xx/REMOTE, then we could check at the time of defining the variable whether the particular dataset accepts remote definitions. Otherwise, check at the time of loading data, and if it does not accept remote definitions, just continue as a local variable.
2) We have put the climatological axis definitions into Ferret so that remote variable operations can include climatological regridding. This points to the need for a bit of redesign of the way Ferret (running in the F-TDS server) writes the header.xml file. In my first iteration of this, it writes information about all the climatological axes to the header, whether or not those axes are used by any variable in the dataset. The SHOW AXIS/ALL/XML output needs some fine-tuning.
3) F-TDS writes data in single precision, but should be changed to write data in double precision. See LAS ticket http://dunkel.pmel.noaa.gov/trac/las/ticket/1265
4) See ticket #2040 for ideas about getting the dependency of a variable on other variables. This will let us check that the remote dataset has the variables it needs to define a particular variable.
5) For the future, think of ways that the Ferret session could pass more commands to the F-TDS server, such as defining an axis for a ZAXREPLACE, or opening a second dataset on the F-TDS server and regridding our remote variable to the grid of a variable from that second dataset.
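The dependency check mentioned in item 4 could look roughly like the sketch below. It is an illustration in Python, not Ferret's Fortran code and not the approach from ticket #2040: `referenced_names`, `missing_on_remote`, the `FUNCTIONS` set and the regular-expression parsing are all hypothetical, and real Ferret expressions would need a proper parser.

```python
import re

# Hypothetical subset of function names to ignore; a real list would be much longer.
FUNCTIONS = {"SIN", "COS", "EXP"}


def referenced_names(expression):
    """Return the upper-cased identifiers that appear in an expression.

    Assumption: bracketed qualifiers such as [L=1:12@AVE] are dropped
    wholesale before identifiers are collected.
    """
    stripped = re.sub(r"\[[^\]]*\]", "", expression)
    names = re.findall(r"\b[A-Za-z_]\w*\b", stripped)
    return {name.upper() for name in names} - FUNCTIONS


def missing_on_remote(definitions, target, remote_vars):
    """Return the names that `target` needs, directly or through other user
    variables, but that the remote dataset does not provide.

    `definitions` maps upper-cased user-variable names to their expressions;
    `remote_vars` is the set of variable names in the remote dataset.
    """
    missing, seen, stack = set(), set(), [target.upper()]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in definitions:
            # A user variable: follow the names its definition refers to.
            stack.extend(referenced_names(definitions[name]))
        elif name not in remote_vars:
            missing.add(name)
    return missing
```

An empty result would mean the remote dataset has everything the definition needs; a non-empty one names the variables that would block a remote evaluation.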
Reported by @AnsleyManke on 3 Nov 2011 19:37 UTC
For example, at GFDL, users open datasets on their analysis cluster. They would like to do analysis operations where the data resides rather than bringing it back to where the user is running Ferret. So for instance,
would do the analysis where the data lives and bring back to the desktop session only the result.
The proposal for implementing this is to introduce a new variable type, the remote variable or rvar. Places marked by ++ will need more filling in, in consultation with Roland.
New variables:
in XVARIABLES COMMON
in XDSET_COMMON
- INTEGER dset_accepts_remote(maxdsets) ! may be yes, no, or unknown
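A three-state flag like `dset_accepts_remote` lends itself to a lazy check: leave the state unknown until the first time data is loaded, then remember the answer and fall back to local evaluation when the server says no. The Python sketch below illustrates that idea only. `AcceptsRemote`, `MAXDSETS`, `probe_server` and `use_remote` are hypothetical names, and the probe is a stand-in for whatever query the server actually supports.

```python
from enum import Enum


class AcceptsRemote(Enum):
    UNKNOWN = 0
    YES = 1
    NO = 2


MAXDSETS = 4  # hypothetical size, standing in for Ferret's maxdsets

# One entry per dataset slot; nothing is known until a dataset is probed.
dset_accepts_remote = [AcceptsRemote.UNKNOWN] * MAXDSETS


def probe_server(dset):
    """Hypothetical stand-in for asking the server whether it accepts
    remote definitions. Here even-numbered slots pretend to say yes."""
    return dset % 2 == 0


def use_remote(dset, probe=probe_server):
    """Return True if the variable should be evaluated remotely.

    The server is probed only while the state is UNKNOWN; afterwards the
    cached answer is used, and NO means "continue as a local variable".
    """
    state = dset_accepts_remote[dset]
    if state is AcceptsRemote.UNKNOWN:
        state = AcceptsRemote.YES if probe(dset) else AcceptsRemote.NO
        dset_accepts_remote[dset] = state
    return state is AcceptsRemote.YES
```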
A. At "let" time (the LET/REMOTE command):
B. During "get_grid" phase:
C. At completion of "get_grid" phase:
D. At time of variable evaluation (is_algebra.F)
1. We need to intercept the interpretation of a user variable before it starts and substitute a remote read operation instead. The place to do this would be in INTERP_STACK, at the point where IS_ALGEBRA is called to evaluate a user variable.
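The interception described in step D amounts to a branch taken just before the algebra evaluator is called. The Python sketch below shows only the shape of that branch. `UserVar`, `evaluate_user_var`, and the `algebra` and `remote_read` callables are hypothetical stand-ins, not the interfaces of INTERP_STACK or IS_ALGEBRA.

```python
from dataclasses import dataclass


@dataclass
class UserVar:
    name: str
    definition: str
    is_remote: bool = False  # hypothetical flag set by LET/REMOTE


def evaluate_user_var(var, algebra, remote_read):
    """Dispatch a user variable to local or remote evaluation.

    `algebra` stands in for the local evaluator and `remote_read` for a
    read of the already-computed result from the server.
    """
    if var.is_remote:
        # Skip local interpretation entirely and read the result instead.
        return remote_read(var)
    return algebra(var)
```

Keeping the branch at the single point where user variables are dispatched would leave the local evaluation path untouched for every variable that is not remote.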
Other chores:
Migrated-From: http://dunkel.pmel.noaa.gov/trac/ferret/ticket/1891