PERMUTE(VAR, OLD_DIMS, NEW_DIMS)

Reported by @AndrewWittenberg on 20 Dec 2012 23:37 UTC It would be nice to have a function to permute coordinates and all their attributes, to consolidate and extend the TRANSPOSE_*() functions. The idea is the same as those functions, but PERMUTE would allow the user to permute multiple dimensions in one call.

There may be precedent for such a function in the existing Ferret code, since it would behave much like USE/ORDER.

Proposed syntax:

   PERMUTE(VAR,OLD_DIMS,NEW_DIMS)

where OLD_DIMS is an IJKLMN/XYZTEF string indicating the axes to be permuted, and NEW_DIMS is another such string indicating the target reordering.

   PERMUTE(a,'XY','EF')   !X->E, Y->F
   PERMUTE(a,'XYZ','ZXY') !X->Z, Y->X, Z->Y
   PERMUTE(a,'XY','XY')   !does nothing

! all of the following would be equivalent to TRANSPOSE_XY:

   PERMUTE(a,'XY','YX')
   PERMUTE(a,'XYZ','YXZ')
   PERMUTE(a,'ZYX','ZXY')
   PERMUTE(a,'KJX','ZIJ')

The following rules would apply to OLD_DIMS and NEW_DIMS:

contain only the letters IJKLMN/XYZTEF
no repetition of axes within a string (X/I being equivalent, etc.)
OLD_DIMS and NEW_DIMS have the same string length

Those rules seem sufficient. The following would then apply to an XYZ-shaped variable "A":

   PERMUTE(a,'TE','ET')   !does nothing

Slightly fancier would be to also accept "-" signs to reverse coordinates:

   PERMUTE(a,'X','-X')      !same as XREVERSE(a)
   PERMUTE(a,'XYZ','-ZX-Y') !X->reversed Z, Y->X, Z->reversed Y

It'd be great if axis attributes (titles, units) could be preserved by PERMUTE().

Such a PERMUTE function would greatly extend the functionality of Ferret's existing functions and transformations.

Migrated-From: http://dunkel.pmel.noaa.gov/trac/ferret/ticket/2016

Comment by @AnsleyManke on 21 Dec 2012 17:32 UTC Just a comment or two.

USE/ORDER actually makes use of a NetCDF call to read the data in permuted order (nc_get_varm_double, etc.) so this function would be new Ferret code.

The TRANSPOSE functions return data on abstract axes in the transposed dimensions. If axes are to inherit attributes, we need to think how that would go. If it were done wholesale, then a permutation in XY to EF would result in an E axis with attributes such as

                Eaxis:units = "degrees_east" ;
                Eaxis:modulo = " " ;
                Eaxis:point_spacing = "even" ;
                Eaxis:axis = "X" ;
                Eaxis:standard_name = "longitude" ;

At the least we'd change the axis attribute.

Of course any result axis can be redefined in a separate step, but to make the function convenient we can certainly think of the right combination of things to inherit.

Comment by @AndrewWittenberg on 27 Dec 2012 20:42 UTC I'd suggest transferring all the attributes, with the sole exception of "axis". PERMUTE would very often be called twice: once to move a dimension into a position where it can be transformed (e.g. by ZAXREPLACE), and once more to move the dimension back into its prior position. Having the attributes follow the dimensions would save some work in rebuilding the attributes for the permuted dimensions.

An alternative would be to return everything on abstract axes, as the TRANSPOSE functions do, but then allow the user to later copy dimension attributes in a separate step using /LIKE qualifiers (see Trac #1694).

Comment by steven.c.hankin on 3 Jan 2013 23:04 UTC The ordering of internal data arrays as X-Y-Z-T-E-F -- e.g. the fact that the time axis is the 4th axis of the array -- is a fundamental part of Ferret's DNA. It would be a very major code revision (probably with significant associated performance penalties) if the permutation of arrays had to be considered as a part of internal computation. (Note that USE/ORDER is implemented as a part of the IO. Only at the IO level of Ferret is there any awareness that a permutation has been performed.)

To move forward with this ticket I suggest I suggest that we look at use cases: what goals are we trying to achieve when we permute arrays? Then we can think about the range of alternative approaches that might achieve these goals at lower effort.

Comment by @AndrewWittenberg on 4 Jan 2013 00:14 UTC PERMUTE() would just be a composition of calls to TRANSPOSE_() and REVERSE(); the only difference is that it would preserve axis attributes. If the attribute-preservation is impossible to implement, then we can drop that requirement.

That is,

   let b = PERMUTE(a,"XYZ","-ZX-Y")

would be identical to

   let b = ZREVERSE(YREVERSE(TRANSPOSE_XY(TRANSPOSE_XZ(a))))
   set axis/like=`a,r=zaxis` `b,r=xaxis`
   set axis/like=`a,r=xaxis` `b,r=yaxis`
   set axis/like=`a,r=yaxis` `b,r=zaxis`

if only we had SET AXIS/LIKE (see Trac request #1694).

I had mentioned one use case above, namely extending the utility of functions that are only defined for particular axes, like ZAXREPLACE, SAMPLEIJ, SAMPLEXYT, SCAT2GRID_BIN_XY, and many others. Similarly, PERMUTE() could extend the utility of .jnl files (like wavelet transforms and kernel density estimation) that had been hard-coded to work on a particular axis.

Regarding time axes, it's common to have two or more apply to a single dataset. E.g. with the new 6-d Ferret, we can use T for forecast lead time and F for forecast initialization time. E.g. given a TF-shaped variable V, V[F=1-jan-1980] would be the time series of the single forecast initialized on 1-jan-1980, while V[T=12] would be the concatenated time series of 12month-lead forecasts.

Comment by steven.c.hankin on 4 Jan 2013 00:53 UTC I'll give some more thought to this tomorrow. The following quick thoughts on the use cases that you provided at the end to keep the conversation lively for now.

''I had mentioned one use case above, namely extending the utility of functions that are only defined for particular axes, like ZAXREPLACE, SAMPLEIJ, SAMPLEXYT, SCAT2GRID_BIN_XY, and many others. Similarly, PERMUTE() could extend the utility of .jnl files (like wavelet transforms and kernel density estimation) that had been hard-coded to work on a particular axis.''

Here's an alternative approach I'd like to toss into the ring: The examples above only need to be extended, because in their current forms the functions narrowly restrict the geometry of the result in their function names. For example, "ZAXREPLACE" can only replace a Z axis. Suppose that we made a slight extension to the "grid-changing function" framework by allowing one of the arguments to be designated as "dominant" with respect to determining the grid of the result. That would allow us to create a function like

AXREPLACE(v, zvals, zax)

where it replaced the Z axis because the third argument, "zax", was dominant. Equally valid would be

AXREPLACE(v, tvals, tax)

to replace the T axis.

2. ''Regarding time axes, it's common to have two or more apply to a single dataset. E.g. with the new 6-d Ferret, we can use T for forecast lead time and F for forecast initialization time. E.g. given a TF-shaped variable V, V[F=1-jan-1980] would be the time series of the single forecast initialized on 1-jan-1980, while V[T=12] would be the concatenated time series of 12month-lead forecasts.''

This sounds like the bullseye intended use of the F axis. I'm not sure, though, that I see it as a problem of "permutation", so much as a problem of "aggregation" -- i.e. you have a number of 4D model runs that were started at different initialization times, and you want to aggregate them along a 5th dimension that is the initialization time. Is this right? Do you tune into the Unidata/THREDDS email discussions? Our discussion sounds like the same concept that community knows as the "Forecast Model Run Collection Aggregation" (http://www.unidata.ucar.edu/software/netcdf/ncml/v2.2/FmrcAggregation.html) (Scroll down to the colored plot that illustrates the 5D dataset -- a Ferret plot, as it happens).

Comment by steven.c.hankin on 4 Jan 2013 18:15 UTC P.S.: Implementing a PERMUTE function is a smaller task than the strategies I suggested last night. If this is a high priority issue we should consider implementing a PERMUTE function soon, and schedule more ambitious changes to follow. (The idea of a "grid-dominant" argument in a grid-changing external function has arisen independently in planning to improve the ease of access to in situ data collections from Ferret.)

Comment by @AndrewWittenberg on 15 Jan 2013 01:17 UTC Thanks Steve for your comments.

1) I agree that a more general AXREPLACE(), with a new concept of "dominant" axes for internal/external function coding, would be helpful in the long term. For this application and other like it, PERMUTE() would provide a stopgap solution. PERMUTE(), in combination with RESHAPE(), could also facilitate elegant solutions to some complex multi-dimensional algorithms like regridding.

2) I wasn't aware of those discussions; it's good to know they're happening. The Figure 1 you mentioned is one way to represent such data; I was actually thinking of an alternative that we often use, where the reds would all be aligned along the horizontal, so that we'd have "forecast lead time" (rather than "forecast date") as the ordinate, giving a rectangular block instead of a parallelogram. In some ways the rectangle is easier to use for retrospective forecast evaluation, since we can then easily compute a lead-dependent climatology (by averaging along the abscissa), and departures from that climatology (i.e., bias-corrected anomaly forecasts). We can also plot the time series corresponding to a horizontal slice through that rectangle: the time series of purples would be the lead-1 forecasts, which should closely track the obs, and the time series of reds would be the lead-10 forecasts, which would have a lower correlation with obs.

In any case, my remark about the two time axes didn't have much to do with the topic of this thread. It was just an aside to your comment:3 that the time axis as the 4th axis is fundamental to Ferret.

Comment by @AndrewWittenberg on 19 Jan 2013 00:22 UTC As a workaround, here's a Ferret script to build the composition I mentioned above. The permuted axes become ABSTRACT axes, and the axis attributes are not preserved.

\can mode verify
! Usage: go def_permute in_var old_dims new_dims reverse [out_varname]
!                         $1      $2       $3       $4   [    $5     ]
! The script permutes the OLD_DIMS of IN_VAR into the NEW_DIMS of OUT_VAR,
! optionally reversing the data along certain dimensions.  The DIMS are
! specified as strings that are a list of axes (XYZTEF).
!
! For example,
!    go def_permute sst XYT TXY 010 sstp
! defines a variable SSTP with X->T, Y->X (reversed), and T->Y.
!
! Examples:
! yes? use coads_climatology
! yes? show grid sst
! yes? go def_permute sst XYT TXY 010; sho var permute
!
! yes? let a  = x[gx=1:2:1]+y[gy=1:3:1]+z[gz=1:4:1]+t[gt=1:5:1]
! yes? go def_permute a XYZT TXYZ 1100 b; sho var b
!
! NOTE: as of Ferret v6.84 the TRANSPOSE_*{E,F} functions do not exist, so we
! cannot permute axes E & F.
!
! atw 2013jan18

def sym dpt_str $1
let dpt_old_dims = "($2)"
let dpt_new_dims = "($3)"
let dpt_revs = "($4)"

let dpt_n = strlen(dpt_old_dims)

! ensure that dimension strings are the same length
IF `strlen(dpt_new_dims) NE dpt_n` THEN
   say "ERROR in def_permute.jnl: strlen(old_dims) NE strlen(new_dims)"
   exit/prompt
ELIF `strlen(dpt_revs) NE dpt_n` THEN
   say "ERROR in def_permute.jnl: strlen(old_dims) NE strlen(reverse)"
   exit/prompt
ENDIF

let dpt_old_dims_sp = spawn("echo `dpt_old_dims` | sed -e 's/./& /g'")
let dpt_new_dims_sp = spawn("echo `dpt_new_dims` | sed -e 's/./& /g'")
let dpt_swaps = spawn("permute_swaps.csh '`dpt_old_dims_sp`' '`dpt_new_dims_sp`'")

rep/name=dpt_idx/range=1:`dpt_n` (\
   let dpt_rev = substring(dpt_revs,`dpt_idx`,1);\
   let dpt_rev_ax = substring(dpt_old_dims,`dpt_idx`,1);\
   IF `dpt_rev` THEN;\
      def sym dpt_str `dpt_rev_ax`reverse(($dpt_str));\
   ENDIF)

rep/name=dpt_idx/range=1:`dpt_swaps,r=isize` (\
   let dpt_swap = dpt_swaps[i=`dpt_idx`];\
   def sym dpt_str transpose_`dpt_swap`(($dpt_str)))

let $5"permute" = ($dpt_str)

can sym dpt_str
can var dpt_old_dims dpt_new_dims dpt_revs
can var dpt_n dpt_old_dims_sp dpt_new_dims_sp dpt_swaps dpt_rev dpt_rev_ax dpt_swap

set mode/last verify

And here's a shell script that is used in the above Ferret script, to list the swaps (TRANSPOSEs) needed to achieve the permutation:

#!/bin/tcsh -f
#
# permute_swaps.csh
#
# atw 18jan2013

set name = `basename $0`
set usage = "$name old_list new_list"

# ============= START ARGUMENT PROCESSING =============
if ($#argv == 0) then
   cat <<EOF
NAME
   $name - echo the element-swaps needed to permute a list

SYNOPSIS
   $usage

DESCRIPTION
   Displays the ordered list of element-swaps of OLD_LIST that will permute it into
   NEW_LIST.  The OLD_LIST serves as its own list of indices.

   For example, to permute "X Y Z" into "Z X Y", we can first swap the data at the XZ
   positions, and then swap the data at the YZ positions:

   SWAP     RESULT
            X Y Z
     XZ     Z Y X
     YZ     Z X Y

   And to permute "a b c d e f g h i j" into "i c a g j d f b e h", we perform the
   following swaps (with the top row providing the indices):

   SWAP     RESULT
            a b c d e f g h i j
     ai     i b c d e f g h a j
     bc     i c b d e f g h a j
     ci     i c a d e f g h b j
     dg     i c a g e f d h b j
     ej     i c a g j f d h b e
     fg     i c a g j d f h b e
     hi     i c a g j d f b h e
     ij     i c a g j d f b e h

OPTIONS
   old_list  Quoted source list of words.

   new_list  Quoted target list of permuted words.

EXAMPLES
   $name "X Y Z" "Z X Y" | tr "\n" " "
   XZ YZ

   $name "x y z t e f" "e x z f y t" | tr "\n" " "
   xe ye tf

   $name "a b c d e f g h i j" "i c a g j d f b e h" | tr "\n" " "
   ai bc ci dg ej fg hi ij

AUTHOR
   Andrew Wittenberg
EOF
   exit 1
endif

if ($#argv != 2) then
   echo "ERROR in ${name}: require exactly two quoted arguments."; exit 1
endif

set old = ($1)
set new = ($2)

set old_sort = (`echo $old | tr " " "\n" | sort`)
set new_sort = (`echo $new | tr " " "\n" | sort`)
if ("$old_sort" != "$new_sort") then
   echo "ERROR in ${name}: lists contain different elements"; exit 1
endif

set repeats = (`echo $old_sort| tr " " "\n" | uniq -d`)
if ($#repeats) then
   echo "ERROR in ${name}: lists contain repeated elements"; exit 1
endif

@ n = $#old
@ i = 1
while ($i < $n)
   @ j = $i + 1
      while ($j < $n)
         if ($new[$j] == $old[$i]) then
            set new[$j] = $new[$i]
         endif
         @ j++
      end
   @ i++
end

@ i = 1
while ($i < $n)
   # display only self-to-other swaps
   if ($old[$i] != $new[$i]) then
      echo $old[$i]$new[$i]
   endif
   @ i++
end

exit 0

NOAA-PMEL / Ferret

PERMUTE(VAR, OLD_DIMS, NEW_DIMS) #1288