MDSplus / mdsplus

The MDSplus data management system
https://mdsplus.org/
Other
70 stars 43 forks source link

Add the ability to easily make a static snapshot of a signal #2715

Open mwinkel-dev opened 4 months ago

mwinkel-dev commented 4 months ago

Affiliation EPFL submitted by @mwinkel-dev of MIT PSFC on behalf of @merlea of EPFL (as per conversation on Discord)

Description It would be useful to be able to create static snapshots of signals. Either by providing example programs or by building that into the core libraries.

The following quote came from @merlea's post on Discord.

" . . . I am looking for a way to gather all data from a complex MDSplus object. E.g. if the object is a signal BUILD_SIGNAL(BUILD_PARAM(EXPR1, EXPR2, EXPR3),*,EXPR4,EXPR5) then I would like to have the same structure but with BUILD_SIGNAL(BUILD_PARAM(DATA1,DATA2,DATA3),*,DATA4,DATA5). I tried some of MdsSerialize, MdsCompress or MdsCopyDxXd but from what I could tell, these always keep the expressions and do not evaluate the data, but maybe I am doing something wrong or some other function should be used. . . . "

Example For example, the \IP tag in MIT's C-Mod trees involves evaluating 129 node references (to 28 unique nodes). Having a static snapshot of \IP (i.e., the result of evaluating the signal) would eliminate I/O and computational overhead, and also provide constant results (e.g., manually editing a calibration node wouldn't affect the snapshot). These are just conjectured uses for the snapshot capability. For the use case that prompted this feature request, contact @merlea.

Additional context @merlea tried many ways to create the snapshot and had no luck. @mwinkel-dev wrote a Python program (attached below) that creates a snapshot, but the timebase is wrong.

This is a surprisingly complicated task. Am hoping that other MDSplus core developers know of a much easier way to create a static snapshot of a signal.

mwinkel-dev commented 4 months ago

Here is the example Python program that works (but gets the time base wrong).

#
# Example Python program that creates a static snapshot of a signal.
# 
# Scenario:
#    - Tree A has a signal, SIG, that relies on data from many other nodes.
#      Obtaining A's data thus involves evaluation of an extensive
#      expression cascade, and therefore requires a lot of I/O and computation.
#      Furthermore, some of the nodes that SIG indirectly references can
#      change over time (e.g., a user manually edits a calibration value).
#    - Tree B is an isolated tree that will contain a static snapshot
#      of SIG, call it SIG_SNAPSHOT. The goal is to ensure that SIG_SNAPSHOT
#      has the value of evaluating SIG, but has no references to tree A.
#
# Design:
#    - Evaluate SIG and store snapshots of its constituent parts in B.
#    - Then write a "Build_Signal()" expression that references the nodes in B.
#    - Compile that expression and store it in SIG_SNAPSHOT.
# 
# Notes:
#    - The constituent parts of SIG could be stored in nodes anywhere in B.
#      However, this code places the nodes as members of SIG_SNAPSHOT.
#    - This example is incomplete; it just demonstrates the concept.
#    - Thus this example assumes the nodes have already been created in B.
#    - Production code would add nodes, have error checking and be more robust.
#    - The time base for SIG (i.e., the first MDSplus "dimension") is not
#      being evaluated correctly.  Perhaps because of problems accessing some 
#      MDSplus device nodes?
#    - Perhaps there are easier ways to create a snapshot of a signal?
# 

import MDSplus as mds

# Parmeters *1 are tree A in the above explanation, *2 are tree B
def snapshot(tree1, shot1, signal1, tree2, shot2, signal2):

   t1 = mds.Tree(tree1, shot1)
   n1 = t1.getNode(signal1)
   d1 = n1.record.getData()
   # TODO: add code to handle optional "raw" data
   # r1 = n1.record.raw_of()
   # TODO: Add code to handle multiple dimensions.  Also determine why
   # this is not correctly evaluating the time base.  Perhaps use the `n1.execute()` line?
   # Unfortunately, that line did not evaluate; it threw an exception.
   # dim0 = n1.execute(n1.dim_of(), tree=t1)
   dim0 = d1.getDimensionAt(0)
   u1 = n1.units_of()
   t1.close()

   # Save snapshots of SIG's constituent parts
   t2 = mds.Tree(tree2, shot2, "edit")
   n = t2.getNode(signal2 + '.snap_data')
   n.putData(d1)

   # TODO: add code to handle optional "raw" data
   # n = t2.getNode(signal2 + '.snap_raw')
   # n.putData(r1)   

   n = t2.getNode(signal2 + '.snap_dim0_a')
   n.putData(dim0[0])

   n = t2.getNode(signal2 + '.snap_dim0_b')
   n.putData(dim0[1])

   # TODO: third parameter in a dimension is optional
   # n = t2.getNode(signal2 + '.snap_dim0_c')
   # n.putData(dim0[2])

   # TODO: add code to handle additional dimensions

   n = t2.getNode(signal2 + '.snap_units')
   n.putData(u1)

   t2.write()

   # Finally, create the SIG_SNAPSHOT signal
   n = t2.getNode(signal2)
   expr = f'Build_Signal(Build_With_Units({signal2}.snap_data, {signal2}.snap_units), *, Build_Dim(*, {signal2}.snap_dim0_a : {signal2}.snap_dim0_b))'
   n.record = n.compile(expr, tree=t2)

   t2.write()
   t2.close()

# Tree A is a shot from MIT's archive for the C-Mod tokamak, and SIG is
# the plasma current (which indirectly references ~130 nodes).
snapshot('cmod', 1090909009, '\IP', 'tree_b', 100, 'SIG_SNAPSHOT')
mwinkel-dev commented 4 months ago

C-Mod's \IP signal looks like this (as displayed by jTraverser2). image

The above program generates this snapshot signal. image

It appears that the \IP signal has a variable time base (faster sampling during the flattop?). Note that the X axis is in seconds.

The snapshot signal is assuming a constant time interval between data points. And the X axis is surely the array index.

joshStillerman commented 4 months ago

How about:

def preserve_units(thing):
    import MDSplus
    if thing.__class__ == MDSplus.WithUnits:
        return MDSplus.WithUnits(thing.data(), thing.units.data())
    else:
        return thing.data()

def freeze_signal(thing):
    import MDSplus
    if thing.__class__ == MDSplus.Signal:
        descs = thing.descs
        ans = []
        for idx in range(len(descs)):
            if idx != 1:
                ans.append(preserve_units(descs[idx]))
        return MDSplus.Signal(ans[0], None, *ans[1:])
    else:
        return preserve_units(thing)

which when used gives:

$ python
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from MDSplus import Tree
>>> from freeze_signal import freeze_signal
>>> t = Tree('cmod', 1090909009)
>>> s = freeze_signal(t._IP.record)
>>> s
Build_Signal(Build_With_Units(Set_Range(13312,0. /*** etc. ***/), "ampere"), *, Set_Range(13312,-1.9999 /*** etc. ***/))
>>> 
mwinkel-dev commented 4 months ago

Hi @joshStillerman -- Nice! I will give that a try.

joshStillerman commented 4 months ago

When you write it back you should do:

new_node.record = s
mwinkel-dev commented 4 months ago

Hi @joshStillerman,

I retrofitted my program to use your freeze_signal() function. And as expected, it now has the correct time base. The jTraverser2 plot of the snapshot matches that of the original \IP signal.

Glad to see that there is a much easier way to accomplish this task than what I cobbled together. Thanks!

mwinkel-dev commented 4 months ago

Hi @merlea,

I hope the above code examples help you with your task. If you have problems, let us know more about your use case, the APIs that you use and so forth.

mwinkel-dev commented 4 months ago

Using the freeze_signal() function allows the initial program posted above to be simplified.

import MDSplus as mds

# At this spot, define the two functions: preserve_units(thing) and freeze_signal(thing)

# Parmeters *1 are tree A in the above explanation, *2 are tree B
# ToDo: Production code should have error checking, create nodes, etc.
def snapshot(tree1, shot1, signal1, tree2, shot2, signal2):
   t1 = mds.Tree(tree1, shot1)
   n1 = t1.getNode(signal1)
   expr = freeze_signal(n1.record)
   # print(expr)
   t1.close()

   t2 = mds.Tree(tree2, shot2, "edit")
   n = t2.getNode(signal2)
   n.record = expr
   t2.write()
   t2.close()

# Tree A is a shot from MIT's archive for the C-Mod tokamak, and SIG is
# the plasma current (which indirectly references ~130 nodes).
snapshot('cmod', 1090909009, '\IP', 'tree_b', 100, 'SIG_SNAPSHOT')