NOAA-PMEL / Ferret

The Ferret program from NOAA/PMEL
https://ferret.pmel.noaa.gov/Ferret/
The Unlicense

enable distributed analysis for a desktop Ferret user; LET/REMOTE #1163

Open karlmsmith opened 6 years ago

karlmsmith commented 6 years ago

Reported by @AnsleyManke on 3 Nov 2011 19:37 UTC

For example, at GFDL, users open datasets on their analysis cluster. They would like to do analysis operations where the data resides rather than bringing it back to where the user is running Ferret. So for instance,

yes? use "http://server.dataset.nc"
yes? let/REMOTE Ek = u^2 + v^2
yes? let/REMOTE EkAveT = Ek[t=`t1`:`t2`@ave]
yes? fill EkAvet[z=0]

would do the analysis where the data lives and bring back to the desktop session only the result.

The proposal is to implement this by introducing a new variable type, the remote variable or rvar. Places marked with ++ need more filling-in, in consultation with Roland.

New variables:

A. At "let" time (the LET/REMOTE command):

1. Perform a normal LET. Then IF ("/REMOTE") ...
    a. scan for "d=" anywhere in the definition; issue an error if found
    b. set uvar_remote(ivar) = .TRUE.
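
Step A can be sketched in Python as follows. This is an illustration only, not Ferret code: `UserVar`, `define_let`, and `RemoteDefinitionError` are hypothetical names, and the regular expression is an assumption about what counts as a "d=" qualifier. Only the scan itself and the `uvar_remote` flag come from the proposal.

```python
import re
from dataclasses import dataclass


class RemoteDefinitionError(ValueError):
    """Raised when a LET/REMOTE definition names a dataset explicitly."""


@dataclass
class UserVar:
    name: str
    definition: str
    uvar_remote: bool = False  # mirrors uvar_remote(ivar) in the proposal


# Assumption: any standalone "d" followed by "=" is a dataset qualifier.
_DSET_QUALIFIER = re.compile(r"\bd\s*=", re.IGNORECASE)


def define_let(name: str, definition: str, remote: bool = False) -> UserVar:
    """Perform a 'normal LET', then apply the extra /REMOTE checks."""
    var = UserVar(name, definition)
    if remote:
        if _DSET_QUALIFIER.search(definition):
            raise RemoteDefinitionError(
                f"LET/REMOTE {name}: 'd=' is not allowed in a remote definition"
            )
        var.uvar_remote = True
    return var
```

With this sketch, `define_let("Ek", "u^2 + v^2", remote=True)` sets the flag, while a definition such as `u[d=2]` is rejected only when `/REMOTE` is requested.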

B. During "get_grid" phase:

1. Just after the grid of any UVAR has been determined
(where in the code? most likely just before "RETURN 2" in IS_UVAR_GRID, though it could also be at "300" in get_uvar_grid)
==>      IF (uvar_remote) THEN ...
  a. determine if this dataset accepts remote definitions
      1) Is it remote? Is it F-TDS? dset_accepts_remote(dset) should be made to record the suitability of the dataset at this point. An unsuitable dataset is not an "error" per se. [++ Need a new function to determine this suitability] If the dataset is not suitable, set dset_accepts_remote(dset) to .FALSE. and return. Else ...
         a) find a slot for a new remote variable def'n
         b) set rvar_uvar, rvar_dset for this dataset/variable pair
         c) set rvar_on_server(rvar) = .FALSE.
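
The bookkeeping in step B might look like the Python sketch below. It is a hedged illustration: `RemoteVar`, `RemoteTable`, and `register_rvar` are hypothetical, the `accepts_remote` callback stands in for the suitability function still marked ++, and reusing an existing slot for a repeated dataset/variable pair is an assumption not stated in the proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class RemoteVar:
    rvar_uvar: int                # index of the user variable
    rvar_dset: int                # dataset the definition is attached to
    rvar_on_server: bool = False  # server has not been told about it yet


@dataclass
class RemoteTable:
    max_rvars: int
    slots: List[Optional[RemoteVar]] = field(default_factory=list)
    dset_accepts_remote: Dict[int, bool] = field(default_factory=dict)

    def __post_init__(self) -> None:
        self.slots = [None] * self.max_rvars


def register_rvar(table: RemoteTable, uvar: int, dset: int, uvar_remote: bool,
                  accepts_remote: Callable[[int], bool]) -> Optional[int]:
    """Return the rvar slot index, or None if the variable stays local."""
    if not uvar_remote:
        return None
    ok = accepts_remote(dset)  # placeholder for the ++ suitability test
    table.dset_accepts_remote[dset] = ok
    if not ok:
        return None  # not an error: fall back to local evaluation
    for i, rv in enumerate(table.slots):
        # Assumption: a pair that is already registered keeps its slot.
        if rv is not None and (rv.rvar_uvar, rv.rvar_dset) == (uvar, dset):
            return i
    for i, rv in enumerate(table.slots):
        if rv is None:
            table.slots[i] = RemoteVar(uvar, dset)
            return i
    raise RuntimeError("no free slot for a new remote variable definition")
```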

C. At completion of "get_grid" phase:

1. At completion of get_uvar_grid it is time to make sure that the remote server is informed of the LET definitions that it is expected to perform (there can be more than one variable)
  a. scan the rvar list until there is no rvar for which rvar_on_server is .FALSE.
      1) if rvar_on_server is .FALSE. for a given rvar THEN scan the remaining rvar list and ...
         a) find all rvars sharing that rvar's dataset
         b) construct the URL syntax for F-TDS to define the variables [++ Need details of constructing this syntax]
         c) nc_close this dataset
         d) nc_opn the dataset with the F-TDS definitions
         e) if F-TDS alters the order of variables, then re-query the netCDF var_ids from their names and set into cd_varid (SEE QUESTION 3 AT START)
         f) set rvar_on_server = .TRUE. for these rvars
         g) query the varid of this variable name within F-TDS and set into rvar_varid
         h) ask Roland: is a query of the varid a reasonably sufficient test that the remote server understands the definitions? ++
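
The grouping logic of step C can be sketched in Python as below. This is an assumption-laden illustration: `push_pending_rvars` is a hypothetical name, and the `reopen_with_definitions` callback stands in for everything the proposal leaves open (building the F-TDS URL, nc_close, nc_opn, and the varid queries). No F-TDS URL syntax is shown because it is not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RemoteVar:
    name: str
    rvar_dset: int
    rvar_on_server: bool = False
    rvar_varid: int = -1  # unknown until the server has the definition


def push_pending_rvars(
    rvars: List[RemoteVar],
    reopen_with_definitions: Callable[[int, List[str]], Dict[str, int]],
) -> int:
    """Send all not-yet-defined rvars to their servers, one reopen per dataset.

    reopen_with_definitions(dset, names) is a placeholder that closes the
    dataset, reopens it with the remote definitions, and returns a mapping
    from variable name to varid. Returns the number of datasets reopened.
    """
    pending: Dict[int, List[RemoteVar]] = defaultdict(list)
    for rv in rvars:
        if not rv.rvar_on_server:
            pending[rv.rvar_dset].append(rv)

    for dset, group in pending.items():
        varids = reopen_with_definitions(dset, [rv.name for rv in group])
        for rv in group:
            rv.rvar_varid = varids[rv.name]
            rv.rvar_on_server = True
    return len(pending)
```

Grouping by dataset first means each dataset is closed and reopened once, however many remote variables are attached to it.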

D. At time of variable evaluation (is_algebra.F):

1. What we need to do is intercept user variable interpretation before it starts and substitute a remote read operation instead. The place to do this would be in INTERP_STACK, at the point where IS_ALGEBRA is called to evaluate a user variable.

* ... reduce algebraic expression to its components ?
        IF ( cat .EQ. cat_user_var      ) THEN
           CALL IS_ALGEBRA( memory, *10, *2000, isp_base, status )
           GOTO 5000

needs to become something like

* ... reduce algebraic expression to its components ?
        IF ( cat .EQ. cat_user_var      ) THEN
           IF (testForLocalUvar) THEN
              CALL IS_ALGEBRA( memory, *10, *2000, isp_base, status )   ! local evaluation
              GOTO 5000
           ELSE
              CALL IS_READ_REMOTE                                        ! remote evaluation
           ENDIF

IS_READ_REMOTE can be patterned after IS_READ, with major changes.  It takes three steps:  i) set up the current context for a read (cx_category = cat_file_var) ; ii) perform the read;  iii) when done, clean up any clues that the result was once a file variable.   The messy part may be in "ii", since xdset_info lacks information about this remote file variable.  I would look into a "cuckoo's egg" strategy of temporarily creating an xdset_info entry at index maxvars, (e.g. setting cd_varid(maxvars) from rvar_varid) so that CD_READ is unaware it is reading a remote variable. 

Note that IS_READ_REMOTE would also be the place to issue a diagnostic message about remote evaluation when MODE DIAG is on.
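
The "cuckoo's egg" idea can be illustrated with a small Python sketch. Everything here is hypothetical apart from the names borrowed from the text (`cd_varid`, `rvar_varid`, CD_READ): a plain list stands in for the xdset_info table, its last index stands in for maxvars, and `cd_read` is a caller-supplied stand-in for the real read routine.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def borrowed_slot(cd_varid: List[int], slot: int, rvar_varid: int) -> Iterator[int]:
    """Temporarily plant the remote varid in a file-variable slot."""
    saved = cd_varid[slot]
    cd_varid[slot] = rvar_varid
    try:
        yield slot
    finally:
        # Step iii: remove any trace that the slot held a file variable.
        cd_varid[slot] = saved


def is_read_remote(cd_varid: List[int], rvar_varid: int,
                   cd_read: Callable[[int], object]) -> object:
    """Read a remote variable through an unmodified read routine."""
    slot = len(cd_varid) - 1  # stands in for index maxvars
    with borrowed_slot(cd_varid, slot, rvar_varid) as ivar:
        return cd_read(ivar)  # step ii: the reader sees an ordinary variable
```

The context manager restores the slot even if the read raises, which is the part of the cleanup that is easy to miss when done by hand.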

Other chores:

SHOW DATASET - should show remote variables ... and should also show if a dataset allows remote definitions (e.g. is F-TDS)
(Note that if get_grid has not yet been called in evaluating the remote uvars in this dataset, then they will not exist yet as rvars. That's not ideal, but it is OK. It is not safe to assume that uvar_remote and dset_accepts_remote both being TRUE implies that a variable should be shown in a remote dataset: the definition may require variables found in a different dataset.)
SHOW VAR, CANC VAR, CANC DATASET -- routine expected changes
initializing and cleaning the new rvar variables where needed (init_uvar, purge_uvars, ...)

Migrated-From: http://dunkel.pmel.noaa.gov/trac/ferret/ticket/1891

karlmsmith commented 6 years ago

Comment by @AnsleyManke on 3 Nov 2011 19:40 UTC

Recording a side conversation here - points to possible directions for the future.

Roland comments: "I wonder if it is necessary to burden the user with a '/remote' modifier? Maybe if Ferret determines it can be done remotely do it, if not start pulling over the data and do it locally. Maybe this is to allow the user to leave off the '/remote' and force it to be done by pulling the data over. /local ?"

Steve's thoughts: "Interesting to think of the time that distributing calculations around will become so routine that the default action should be to do so. I think for now, though, we need to have the server-siding of a calculation be a user-initiated action. Here are two use cases to support this: 1) many LET definitions do not result in a net reduction in data volume, so there is not necessarily a benefit to server-siding; and 2) in general we do not know what jobs are running on what computer for what purpose and how much load is on each computer relative to the computer's power, or potential billing algorithms -- again making it uncertain whether server-siding is a for-sure good idea."

karlmsmith commented 6 years ago

Comment by @AnsleyManke on 20 Feb 2013 17:50 UTC

Having mostly worked through the initial implementation of LET/REMOTE, here are a few questions to be resolved and notes for further development:

1) I have required LET/D with a remote definition, so that all remote variables are attached to a specific dataset, not just to the current default dataset. Is this what we want, or do we want to be able to open several remote datasets and have the LET/REMOTE definitions apply to whichever is the default at the moment?

If we require LET/D=xx/REMOTE, then we could check at the time of defining the variable whether the particular dataset accepts remote definitions. Otherwise, check at the time of loading data, and if it does not accept remote definitions, just continue as a local variable.

2) We have put the climatological axes definitions into Ferret so that remote variable operations can include climatological regridding. This points to the need for a bit of redesign of the way Ferret (running in the F-TDS server) writes the header.xml file. In my first iteration of this, it writes information about all the climatological axes to the header, whether or not those axes are used by any of the variables in the dataset. The SHOW AXIS/ALL/XML output needs some fine tuning.

3) F-TDS writes data in single precision, but should be changed to write data in double precision. See LAS ticket http://dunkel.pmel.noaa.gov/trac/las/ticket/1265

4) See ticket #2040 for ideas about getting the dependency of a variable on other variables. This will let us check that the remote dataset has the variables it needs to define a particular variable.

5) For the future, think of ways that the Ferret session could pass more commands to the F-TDS server, such as defining an axis for a ZAXREPLACE, or opening a second dataset on the F-TDS server and regridding our remote variable to the grid of a variable from that second dataset.