Python API to the plasma-state

dlg0 commented 9 years ago

More to come ...

elwasif commented 9 years ago

There MAY be an easier path. Long time ago, the guys at Tech-X were interested in coding up a C++ interface top the PS. Alex Pletzer (I think) actually did this, and Doug incorporated the code into the Python script that generates the plasma state. The main idea is to put a simple F77 wrapper around all PS calls, and then put a C and C++ interface around those F77 wrappers .

The code is in the old ORNL swim repo (which now lives in https://cswim.org/svn/cswim/ips/branches/combined-code) in state/src/ccps_lib and state/src/ccps_test. The code in ccps_lib is augmented greatly when the state is actually built (see it in its full glory on Edison in /project/projectdirs/m876/elwasif/trunk/components/state/src/ccps_lib )

Now I managed to build this coder and the associated test code on Edison (I had to fill in some gaps for some missing header files, so it may not be the 100% correct). When I run the test I get some weird output that suggests some MPI functionality is being invoked and I'm not really sure if the test is succeeding or not.

Soooooooo, if we convince ourselves that this C/C++ interface actually works against the current incarnation of the PS (it was done over 5 years ago from some of the dates in the source files), it should be relatively straightforward to build a Python wrapper around it (using swig http://www.swig.org/). For that we will need to code up some meaningful tests that exercise the routines we care about. We may have to deal with some issues related to the use of static libraries as the basis for a Python module, but a quick google search suggests this may not be totally insane.

parkjm commented 9 years ago

ccps_lib is a part of public release of PS. Not sure how often components/state is updated. It might be good idea to start with http://w3.pppl.gov/ntcc/PlasmaState/. @dlg0 already built most kind of stuffs including NTCC, ...

elwasif commented 9 years ago

@parkjm Indeed. I've just rebuilt the PS version from PPPL on edison (I needed to add -fpic flag for Python's sake), and the included ccps_lib builds fine. I plan on using this version to build the Python module.

elwasif commented 9 years ago

The (very alpha) PS python module, based on the ntcc plasma state code is in /project/projectdirs/atom/users/elwasif/plasma_state_build/pyplasmastate on edison. This one uses the Intel build of the NTCC code, but that shouldn't matter (I think).

I can import the module and create a new instance of the PS from Python. Data movement in/out of the underlying Fortran code "may" need more work, especially if we want this to work seamlessly with Numpy (the C++ PS interface uses its own multiarray class).

import pyplasmastate g = pyplasmastate.PlasmaState("Test", 1) dir(g) ['class', 'del', 'delattr', 'dict', 'doc', 'format', 'getattr', 'getattribute', 'hash', 'init', 'module', 'new', 'reduce', 'reduce_ex', 'repr', 'setattr', 'sizeof', 'str', 'subclasshook', 'swig_destroy', 'swig_getmethods', 'swig_setmethods', 'weakref', 'alloc', 'deriveMhdEq', 'finishSpecies', 'getCharSize', 'getDims', 'getHandle', 'getState', 'getVersionId', 'hashDiff', 'interp1d', 'interp2d', 'readMachDescr', 'readShotConfig', 'setData', 'setFusionSpecies', 'setImpuritySpecies', 'setNeutralBeamSpecies', 'setRFMinoritySpecies', 'setThermalSpecies', 'store', 'this', 'updateEquilibrium', 'updateHashCodes', 'verifyMhdEq', 'writeGeqdsk', 'writeMachDescr', 'writeShotConfig', 'writeUpdateFile']

parkjm commented 9 years ago

@elwasif

Great!!!

Could you please make PS module access the plasma state variables (access to ps%ns, ...)?

The component helpers generally look like (for example https://github.com/ORNL-Fusion/ips-fastran/blob/master/src/zgenray.py )

Input helper: ps = pyplasmastate.PlasmaState(ps_filename, 1) zeff = ps["Zeff"] % Copy array ps%Zeff to zeff Ti = ps["Ti"] % ... calculate something using zeff, Ti ... needed for the code input files

Output helper: Ti = ... %extract data from code output ps["Ti"] = Ti % put Ti to plasma state ps.store()

Then perfect, I think!!

elwasif commented 9 years ago

@parkjm Right now, almost everything there is pretty much auto-generated by SWIG. I'm still exploring what can be done using this starting point, and then see how best to modify it and/or build something on top to make it more intuitive.

parkjm commented 9 years ago

@elwasif, hope to accelerate this task.... I have a visitor student from S. Koera, who is working on making ASTRA component, especially for KSTAR. If the python PS API is available, I believe he can make very rapid progress.

elwasif commented 9 years ago

@parkjm Well, I think I can get something in there (Alpha/Beta quality) for get/set functionality. The interface may not be exactly what you have in mind, but we will be able to change that incrementally. Having access to the other routines (interpolation, ..etc) is a different story, but I suspect you're not using them anyway.

parkjm commented 9 years ago

@elwasif Thanks. Yes, we can change the interface incrementally. I'm not a big fan of the PS interpolation API, but many of IPS components in the SWIM repo rely on the interpolation API. I think the interpolation API is one of the key elements. Otherwise, I'm actually using most of the routines in PS (through calling a general utility fortran code -- sure in ugly fashion).

elwasif commented 9 years ago

@bernhold @batchelordb The initial version of the Plasma state Python API is now on Edison in /project/projectdirs/atom/users/elwasif/plasma_state_build/pyplasmastate

Some notes

This is a direct wrapping of the C++ interface defined in the ccps_lib thatis part of the standard PS distribution
This provides a subset of the functionality available in the PS F90 module
This uses it's own multi dimensional array class, and a vector class that wraps the C++ standard template library vector.
Numpy arrays are not currently supported (so you cannot pass a Numpy array into the PS to set a variable, or get a variable back as a Numpy array)
The file teststate.py has a direct translation of the C++ test code in ccps_test/ccps_test.cxx code and it runs (there's some dicrepancy in the standard output, and we will need more testing to make sure the functionality is corect).
The interface is currently (naturally) very C++'ish, we'll probably want to build something more Pythonic on top of this one.
One will need to add /project/projectdirs/atom/users/elwasif/plasma_state_build/pyplasmastate to LD_LIBRARY_PATH and PYTHONPATH for this to work
The list of supported methods is (not all of them tested in the teststate.py file) 'alloc', 'deriveMhdEq', 'finishSpecies','getCharSize', 'getDataDouble', 'getDataDoubleVec', 'getDataInt','getDataIntVec','getDataString','getDataStringVec','getDims', 'getHandle','getState','getVersionId','hashDiff','interp1d','interp2d','readMachDescr','readShotConfig', 'setData', 'setFusionSpecies','setImpuritySpecies','setNeutralBeamSpecies','setRFMinoritySpecies', 'setThermalSpecies','store', 'this', 'updateEquilibrium', 'updateHashCodes', 'verifyMhdEq', 'writeGeqdsk', 'writeMachDescr', 'writeShotConfig', 'writeUpdateFile'

parkjm commented 9 years ago

Cool!! Could you please add getitem and setitem to alias getData() and setData()?

elwasif commented 9 years ago

@parkjm My next step is to try and add a dictionary-like interface on top of this, but I'd prefer if this layer was better tested before we slap another layer on top of it - especially using the multiarray and vector classes to make sure we have correct functionality.

parkjm commented 9 years ago

@elwasif I was able to replace a fortran code pstool by pstool.py using pyplasmastate. The really great thing is that now we can handle plasma state inside of the component wrapper!! The getData and setData are not too bad but having more intuitive data access like ps["Zeff"] will be super great. Thanks.

elwasif commented 9 years ago

@parkjm I'm working on it. I think I have it working for scalars and one-D vectors, and now i'm looking into higher dimension data and a couple of other calls to the PS

dlg0 commented 9 years ago

@parkjm Is the Python PS API base pstool replacement at the point where I can use it? I'm trying to alter the ips-fastran/src/zgenray.py to read the abridged species list rather than the full one, but it seems I'd need to look at the pstool source to do it. When possible, perhaps you could make the pstool replacement available?

parkjm commented 9 years ago

@dlg0 Better to wait a little bit more until @elwasif release the first version of pyplasmastate.py. Then you can do it in minutes :) At that point, I'm going to throw away pstool.f90. (BTW, you can find pstool.f90 in /project/projectdirs/atom/users/parkjm/utils/pstool.)

jcandy commented 9 years ago

Gents. In the FACETS project they were critical of the "native" interface to plasma state data although the file format itself they liked just fine. I think this is an important issue that we may want to discuss as a group soon.

batchelordb commented 9 years ago

What was the criticism of the "native" interface? Was it just that it was in Fortran and not C++?

jcandy commented 9 years ago

:-)

Its been a while but I think it was an issue of portability and/or difficulty to build.

batchelordb commented 9 years ago

That is a legitimate criticism

elwasif commented 9 years ago

A new version of the PS Python interface is now in place. This version is built on top the earlier module and presents a more Pythonic interface (Dictionary-like access, support for native 1-D Python vector-like data strctures, and support for Numpy arrays). The code is on Edison in /project/projectdirs/atom/users/elwasif/plasma_state_build/pyplasmastate/plasmastate.py and a port of the teststate.py to the new interface is in newtest.py .

Some comments

Calls to set a Plasma state 1-D vector can take a list, tuple, or a Numpy 1-D array
Calls to set a PS variable with higher dimension takes only Numy arrays
Calls to get a PS variable ALWAYS return a Numpy array
I couldn't find a test that exercises the 2D or higher dimension PS variables, so any help there would be appreciated.

parkjm commented 9 years ago

:+1:

Observations:

for string, __setitem__ works, while __getitem__ does not
multl-dimensional variable need to be transposed:

For example ps["ts"][0] ==> 0th thermal species temperature ps["ts"][1] ==> 1sth thermal species temperature In the current implementation, ps["ts"].transpose()[0] gives 0th thermal species temperature

In the current implementation, variable access as lower case. We may better choose case-insensitive or case-sensitive.

elwasif commented 9 years ago

for string, setitem works, while getitem does not

This was actually a bug in the PPPL PS C++ code. It's fixed now (which begs another question, how do we contribute this bug fix and/or the whole Python interface back to PPPL?)

multl-dimensional variable need to be transposed:

I think I've fixed this now

In the current implementation, variable access as lower case. We may better choose case-insensitive or case-sensitive.

So the lower case is hard-wired in the f77 code, we can make the Python layer case insensitive

parkjm commented 9 years ago

Still multi-dimensional variable needs to be transposed.
ps["rho"] = numpy.linspace(0.0,1.0,nrho) raises message rho is not a Plasma State scalar

@dlg0 you have access to TRANSP repo. How do you report any bug? I guess correction in TRANSP repo populates to the NTCC library?

elwasif commented 9 years ago

@parkjm

Still multi-dimensional variable needs to be transposed.

Can you give me a simple example to use for testing

ps["rho"] = numpy.linspace(0.0,1.0,nrho) raises message rho is not a Plasma State scalar

I think you're using an older version of the module, this was an informational/debug message to keep track of variable lookup in the internal dictionaries, and it has been suppressed in the latest version. It should still work (even if you get this message). I wonder if you're also not getting the latest fix for the transpose issue

parkjm commented 9 years ago

O.K, no more diagnostic output. Fort test of muil-dimensional variable, print ps["Ts"][0] in your newtest.py that should output an array with length of 101.

parkjm commented 9 years ago

@elwasif So the routine below

from plasmastate import  *
from numpy import *

ps = PlasmaState("test", 1)
ps["nrho"] = 5
ps.alloc()
ps["rho"] = linspace(0.0,1.0,ps["nrho"])
ps["nspec_th"] = 2
ps.setThermalSpecies(-1, -1, 0)
ps.setThermalSpecies(1, 1, 2)
ps.finishSpecies()

print 'Access'
print ps["Ts"]
print ps["Ts"][0]

print 'Set all'
ps["Ts"] = array([[2.,2.,2.,2.],[1.,1.,1.,1.]])
print ps["Ts"]

print 'Set one'
ps["Ts"][0] = array([3.,3.,3.,3.])
print ps["Ts"]

is supposed to generate screen output

Access
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
[ 0.  0.  0.  0.]
Set all
[[ 2.  2.  2.  2.]
 [ 1.  1.  1.  1.]]
Set one
[[ 3.  3.  3.  3.]
 [ 1.  1.  1.  1.]]

The output is, however,

Access
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]
[ 0.  0.]
Set all
[[ 2.  2.]
 [ 2.  2.]
 [ 2.  2.]
 [ 1.  1.]]
Set one
Traceback (most recent call last):
  File "check.py", line 23, in <module>
    ps["Ts"][0] = array([3.,3.,3.,3.])
ValueError: could not broadcast input array from shape (4) into shape (2)

elwasif commented 9 years ago

So I may need some help figuring out what's going on here (maybe @batchelordb knows). So the C++ interface to get() any high dimension (>= 2) arrays returns a vector of the whole array content, which will need to be reshaped according to a call to ps.getDim(XYZ). The call to getDim() simply returns the result of calling the F90 SHAPE internisic on the array. The Python module uses this information to build the Numpy array handed back to the caller.

When I look at the cdicrf entry in the PS file cur_state.cdf I find the following in the output from ncdump

   double cdicrf(dim_nicrf_src, dm1_nrho_icrf) ;
            cdicrf:long_name = "ICRF current drive" ;
            cdicrf:units = "A" ;
            cdicrf:component = "IC" ;
            cdicrf:section = "STATE_PROFILES" ;
            cdicrf:specification = "R|units=A|step*dA   cdicrf(~nrho_icrf,nicrf_src)" ;

When I load cur_state.cdf using the low level Python interface in pyplasmastate

n [19]: from pyplasmastate import *

In [20]: cdicrf = ps.getDataDoubleVec("cdicrf")

In [21]: from pyplasmastate import *                                          

In [22]: ps = PlasmaState("test", 1)                                        

In [23]: ps.getState("cur_state.cdf")                                          

In [24]: ps.getDims("cdicrf") 
Out[24]: (60L, 2L) 

In [25]: ps.getDataInt("nrho_icrf")                                                                                                       
Out[25]: 61

In [26]: ps.getDataInt("nicrf_src")                                                                                                 
Out[26]: 2

So it looks like (at least in this case) the PS stores the transpose of the variable as defined in the netcdf file. I can assume this is always the case and simply transpose the arrays going in/out of the PS to match what the Netcdf file says, but I'm not totally sure this is the right thing to do.

parkjm commented 9 years ago

The representation of dimension ordering for the netcdf API itself differs between fortran and C (IDL = fortran ordering, python = C ordering, if I remember correctly). In other word, when you call get_dim type of API, you will get (60, 2) or (2, 60) depending on which language API of netcdf you are using. So, we just need to transpose the array, since it's clear that the dimension ordering is fortran-like.

elwasif commented 9 years ago

So the set/get functionality seems to work now. It required reversing the dim vector going in/out of the PS and using C-ordering. The test @parkjm posted above works (except for this one

print 'Set one'
ps["Ts"][0] = array([3.,3.,3.,3.])
print ps["Ts"]

since accessing ps["t0"] gets you a copy of the PS entry, and changing that copy doesn't propagate back to the original PS data.

I would still like more tests to ensure that the in-memory F90 data is stored properly (and if /when it's used by the F90 numerical code beyond simple get/set that I'm not messing up the data layout.)

parkjm commented 9 years ago

@elwasif

Do you envision that 'Set one' type operation is supported eventually? Otherwise, updating electron temperature Te in helper code will look like

Te = ....
Ts = ps["Ts"]
Ts[0] = Te
ps["Ts"] = Ts

rather than

Te = ....
ps["Ts"][0] = Te

, less readable, though it's a new great tool as it is.

elwasif commented 9 years ago

@parkjm Getting this to work (I think) will require some serious code to explicitly capture all (full or partial) changes to all PS data and propagate them as needed to the underlying data structure. it maybe possible to do that by using a subclass of numpy.array that is aware of the PS, but I'll have to think more about that.

parkjm commented 9 years ago

@elwasif Thanks! I believe now it's time to throw away my pstool.f90. Where are you going to make the python plasma state repo? Under ips-atom/?

elwasif commented 9 years ago

@parkjm Probably under ips-atom. Will have to discuss with @dlg0 and package it and put some build system around it.

parkjm commented 9 years ago

@elwasif perfect!

dlg0 commented 9 years ago

@elwasif @parkjm

So yeah, under ips-atom sounds great.

As for the bug you found in the C++ part of the plasma state code, we probably want to have that committed to the PPPL svn repo. I'll see if I have write access and get back to you.

Is what you've done dependent in any way on the code-generated source in the plasma-state code base? If not, perfect. If so, that may require some further thought.

elwasif commented 9 years ago

@dlg0 Well, it depends on the code in sense of making calls to hard wired function names that are autogenerated from the PS specification. This is done via the C++ layer which is itself (partly) autogenerated too.

dlg0 commented 9 years ago

@elwasif But so you don't envision having to change your py-plasma-state module code if someone adds a variable to the plasma state?

elwasif commented 9 years ago

@dlg0 Yes, that new variable will have to be explicitly added to the list of variable the Python module knows about. I assume that the PS code gen will generate the necessary functions for this to be accessible from the C++ layer (and consequently from Python too)

dlg0 commented 9 years ago

OK, excellent.

elwasif commented 9 years ago

@parkjm So I've implemented a simple caching mechanism that should solve the issue with this code

print 'Set one'
ps["Ts"][0] = array([3.,3.,3.,3.])
print ps["Ts"]

The idea is to keep an internal cached reference to the object returned to the calling module. This local reference is then used whenever _getitem__ is called again on the same variable, guaranteeing that the object being accessed is the same object that was returned in the first call to getitem. This cache is flushed to the underlying PS before any method (other than getitem and setitem) is called to make sure the underlying data in the F90 layer is up-to-date. Calls to setitem deletes the corresponding item from the cache (it it exists) to make sure future get() calls doesn't retreive stale values.

I've tested this a bit and it seems to work, I may have not thought of all corner cases, especially when doing complex manipulations in the calling module, but I think it should ."just work".

parkjm commented 9 years ago

:+1: :+1:

Let me replace zplamastate.py and pstool.f90 by the new python API and run a typical fastran modeling that may be good (enough?) verification test. Thanks alot!.

dlg0 commented 9 years ago

@elwasif Thanks for all the work on this, it is much appreciated.

But of course I'm going to ask for more ;) Is it possible to provide access to the plasma-state interpolation routines as well?

Only asking because I've just run across this issue of density data being provided as cell, rather than node values, with the end points being specified separately; meaning we have to hack up some interpolation that is apparently done correctly by the API.

@batchelordb

elwasif commented 9 years ago

@dlg0 Thou ask and Thou shall get :-) interp1d() and interp2d are already there


interp1d(self, nm, deriv, xs, index = None)
'''
 * Interpolate 1d state multi-element profile 
 * @param nm name of multi-data
 * @param deriv 0 for value, 1 for d/dx, 2 for d^2/dx^2
 * @param xs target grid points
 * @return interpolated values
 * @param index ID index in the multi-element profile
'''

interp2d(self, nm, deriv1, deriv12, x1s, x2s, index = None)
'''
* @param nm name of multi-data
 * @param deriv1 0 for value, 1 for d/dx1, 2 for d^2/d[x1]^2
 * @param deriv2 0 for value, 1 for d/dx2, 2 for d^2/d[x2]^2
 * @param x1s target grid points
 * @param x2s target grid points (same size vector as above)
 * @param index ID index in the multi-element profile
 * @return interpolated values
'''

Not tested much (read at all) naturally. If what you need is something else, then it's probably not in the C++ wrapper, and may need to be ported from the F90 implementation in the PS F90 module

dlg0 commented 9 years ago

@elwasif I'm about to try using this. Is there a release I can install in the public area yet, or should I just work from your dev version at the moment?

Thanks, David.

elwasif commented 9 years ago

No release yet. Just use the onNo

From: David L Green notifications@github.com Date: May 29, 2015 at 10:11:06 AM EDT To: ORNL-Fusion/ips-atom ips-atom@noreply.github.com Cc: Elwasif, Wael R. elwasifwr@ornl.gov Subject: Re: [ips-atom] Python API to the plasma-state (#2)

@elwasifhttps://github.com/elwasif I'm about to try using this. Is there a release I can install in the public area yet, or should I just work from your dev version at the moment?

Thanks, David.

Reply to this email directly or view it on GitHubhttps://github.com/ORNL-Fusion/ips-atom/issues/2#issuecomment-106818941.

elwasif commented 9 years ago

No release yet. Just use the development version. Note that it is built against Intel Luba and may require Intel modules

WE

From: David L Green notifications@github.com Date: May 29, 2015 at 10:11:06 AM EDT To: ORNL-Fusion/ips-atom ips-atom@noreply.github.com Cc: Elwasif, Wael R. elwasifwr@ornl.gov Subject: Re: [ips-atom] Python API to the plasma-state (#2)

@elwasifhttps://github.com/elwasif I'm about to try using this. Is there a release I can install in the public area yet, or should I just work from your dev version at the moment?

Thanks, David.

Reply to this email directly or view it on GitHubhttps://github.com/ORNL-Fusion/ips-atom/issues/2#issuecomment-106818941.

dlg0 commented 9 years ago

I'm trying to get the interp1d() to work. Below is how ... any help here would be great.

On Edison ...

cd /project/projectdirs/atom/users/greendl1/diem_tsc_jm_pstest
source env.ips.edison
cd work/nb__nubeam_6/
python testps.py
  ps_init_tag: Plasma State v2.041 f90 module initialization.
[ 0.    0.02  0.04  0.06  0.08  0.1   0.12  0.14  0.16  0.18  0.2   0.22
  0.24  0.26  0.28  0.3   0.32  0.34  0.36  0.38  0.4   0.42  0.44  0.46
  0.48  0.5   0.52  0.54  0.56  0.58  0.6   0.62  0.64  0.66  0.68  0.7
  0.72  0.74  0.76  0.78  0.8   0.82  0.84  0.86  0.88  0.9   0.92  0.94
  0.96  0.98  1.  ]
[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
   0.  0.]]
PlasmaState::get: ERROR at line 330 invalid nm=id_nbeami
terminate called after throwing an instance of 'PlasmaStateError'
Aborted

the contents of testps.py are ...

#!/usr/bin/env python

from plasmastate import *

ps = PlasmaState("ips",1)
ps.read("ips-state.nc")
rho = ps["rho"]
print rho
nbeami = ps["nbeami"]
print nbeami
nbeami_at_rho = ps.interp1d("nbeami",0,rho)
print nbeam_at_rho

elwasif commented 9 years ago

Hmmmm, I'm not sure what's going on, but I wonder if the error comes from the code in the C++ layer that prepends the string "id_" to the variable name before looking it up (this is in intel-hopper/ccps_lib/cxxps_getset.cpp

std::string idName = std::string("id_") + nm;
int id = this->getData<int>(idName);

I thought this is a built-in PS "feature", but maybe it's a bug???

elwasif commented 9 years ago

I'm rebuilding now using nm instead of idName in the call to getData. This pattern seems to exist only in the interp* routines, so maybe it IS a bug.

ORNL-Fusion / ips-wrappers

Python API to the plasma-state #2