ornladios / ADIOS

The old ADIOS 1.x code repository. Look for ADIOS2 for new repo
https://csmd.ornl.gov/adios

Numpy: OOP Wrapper & h5py #56

Open ax3l opened 8 years ago

ax3l commented 8 years ago

@yyalli I am splitting this thread from issue #53 since I accidentally went OT there.

It would also be great if we could bring the general interface closer to h5py. This would make many scripts easily portable and open ADIOS to the large community of people already using h5py, who would benefit from a very easy interface (that ADIOS can extend where necessary, e.g., for transport methods).

Some points:

Here is an example that creates objects for individual (sub-)groups and variables, which makes programming much easier for users:

import adios as ad

f = ad.file("adios_test.bp")
t = f["temperature"] # type: adios.var

# here t could have a member dictionary attr(s) again
# which looks up all attributes starting with t.name
# now match: f.attrs["/temperature/description"]
print( t.attrs["description"] )

# the same should be possible for groups
g = f["someSubGroup/anOtherGroup/"] # type: adios.group
# now match: f.attrs["/someSubGroup/anOtherGroup/anOtherAttribute"]
print( g.attrs["anOtherAttribute"] )
# now match: f["/someSubGroup/anOtherGroup/anOtherVariable"]
print( g["anOtherVariable"] )

I know that attributes are currently stored as a flat list and that the full path is the identifier in bp. Nevertheless, the numpy API can abstract and "visualize" these objects by simple string matching.
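To illustrate the string-matching idea, here is a minimal sketch in plain Python. It does not use the ADIOS wrapper at all: `ScopedAttrs` is a hypothetical class, `flat` stands in for the file-level attribute table keyed by full path, and the attribute values are made up.

```python
from collections.abc import Mapping


class ScopedAttrs(Mapping):
    """Read-only view of the attributes stored directly below one path.

    Hypothetical helper, not part of the ADIOS numpy wrapper.
    """

    def __init__(self, flat_attrs, owner_path):
        self._flat = flat_attrs
        # Normalize to "/a/b/" so "temperature" and "/temperature/" agree.
        stripped = owner_path.strip("/")
        self._prefix = "/" + stripped + "/" if stripped else "/"

    def _local_names(self):
        for full in self._flat:
            path = "/" + full.lstrip("/")
            if not path.startswith(self._prefix):
                continue
            rest = path[len(self._prefix):]
            # Keep direct children only: no further "/" after the prefix.
            if rest and "/" not in rest:
                yield rest, full

    def __getitem__(self, name):
        for local, full in self._local_names():
            if local == name:
                return self._flat[full]
        raise KeyError(name)

    def __iter__(self):
        return (local for local, _ in self._local_names())

    def __len__(self):
        return sum(1 for _ in self._local_names())


# Made-up attribute table, keyed by full path as in a bp file.
flat = {
    "/temperature/description": "sample description",
    "/someSubGroup/anOtherGroup/anOtherAttribute": 42,
}
t_attrs = ScopedAttrs(flat, "temperature")
g_attrs = ScopedAttrs(flat, "someSubGroup/anOtherGroup/")
print(t_attrs["description"])
print(g_attrs["anOtherAttribute"])
```

Because the view only filters keys on access, the underlying flat storage would not need to change.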

I am writing this because I urgently need our openPMD validator scripts to work with ADIOS, too. To keep them maintainable, it would be extremely helpful to have the ADIOS numpy wrapper in a clean OOP style such as that of h5py (also on GitHub).

We are also using the numpy wrapper in our project pyDive, which is already struggling with the lack of OOP in the wrapper. We would really like to include the wrapper in openPMD-viewer with the LBNL and DESY folks, but here, too, more OOP as in the example above would help a lot.

When working with the Python community we need to be aware that Python is not C, so interfaces need to be object-centric, with object methods rather than function calls, and well documented via docstrings (PEP 8, PEP 257). Python users work interactively via IPython and query docstrings, e.g., with object? and object.method?. Python developers like abstracting things into modules, so changing an import h5py as io to import adios as io should be all that is necessary for a first serial reader and writer to become "ADIOS". Also, documentation can be auto-generated via Sphinx/ReadTheDocs, and we could add examples/tests and, if you like, Travis CI tests during development. It might also be a good idea to split the wrappers from the main repo and include adios as a git submodule to keep the changing APIs and updates controllable.

Last but not least, HDF Compass could get a very simple backend implementation with ADIOS as soon as the wrapper is a bit more object oriented. This would automatically "equip" ADIOS with a GUI such as users know it from HDFView.

ax3l commented 8 years ago

Moved from https://github.com/ornladios/ADIOS/issues/53#issuecomment-164554174:

Hi, Axel,

Sorry for my late reply.

I partially implemented attaching attrs to variables. Recent examples can be found
in wrapper/numpy/tests/test_adios_attribute.py

However, I haven't implemented the group concept yet. Please try the example and
let me know if this update works for you or not.

ax3l commented 8 years ago

@yyalli I am sorry for the late answer: yes, the .attrs attribute on adios.var objects works great for me in your example.

Unfortunately, it does not work with the example I provided above for the file wrappers/numpy/tests/adios_test.bp:

import adios as ad

f = ad.file("adios_test.bp")
t = f["temperature"]

f.var.keys()
# ['NX',
# '/__adios__/timer_labels_1',
# '/__adios__/timers_1',
# 'temperature',
# 'size']

f.attrs.keys()
# ['/temperature/description']
t.attrs.keys()
# [] # should not be empty

jychoi-hpc commented 8 years ago

The leading '/' caused the error. I have now added a routine that handles names both with and without the leading '/' character.
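For readers following along, a lookup that tolerates both spellings could look roughly like the sketch below. This is not the routine that was added to the wrapper; `lookup` is a hypothetical helper and the tables are made-up stand-ins for f.var and f.attrs.

```python
def lookup(table, name):
    """Return table[name], ignoring whether the key has a leading '/'.

    Hypothetical helper; the actual wrapper routine may differ.
    """
    bare = name.lstrip("/")
    # Try the name as given, then without and with a leading slash.
    for candidate in (name, bare, "/" + bare):
        if candidate in table:
            return table[candidate]
    raise KeyError(name)


# Made-up stand-ins for the variable and attribute tables of a file.
variables = {"temperature": [1.0, 2.0], "NX": 10}
file_attrs = {"/temperature/description": "sample description"}

print(lookup(variables, "/temperature"))
print(lookup(file_attrs, "temperature/description"))
```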

ax3l commented 8 years ago

yes exactly. Fantastic solution, thank you! :sparkles:

ax3l commented 8 years ago

Oh wait, the groups are still missing. Sorry for the clumsy issue.

jychoi-hpc commented 8 years ago

Sorry for the late update. I have added the "group" feature. Can you check again?

ax3l commented 8 years ago

Oh wonderful! I tested the reader with a not-too-complex example and it looks great!

The adios.writer seems to segfault on fileObject.declare_group("myGroup") right now.

# gdb

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff690800d in adios_common_select_method_by_group_id (priority=0, method=0x7ffff2e64744 "POSIX1", parameters=0x7ffff7f8452c "", group_id=14649920, base_path=0x7ffff6b76ef0 <__pyx_k__7> "", iters=0)
    at core/adios_internals_mxml.c:2516
2516                    && adios_transports [new_method->m].adios_init_fn

# backtrace
#0  0x00007ffff698800d in adios_common_select_method_by_group_id (priority=0, method=0x7ffff7eb03e4 "POSIX1", parameters=0x7ffff7f8452c "", group_id=13197984, base_path=0x7ffff6bf6ef0 <__pyx_k__7> "", iters=0)
    at core/adios_internals_mxml.c:2516
#1  0x00007ffff69604df in adios_select_method (group=13197984, method=0x7ffff7eb03e4 "POSIX1", parameters=0x7ffff7f8452c "", base_path=0x7ffff6bf6ef0 <__pyx_k__7> "") at core/adios.c:379
#2  0x00007ffff692db1b in __pyx_f_5adios_select_method (__pyx_skip_dispatch=0, __pyx_optional_args=<synthetic pointer>, __pyx_v_method=0x7ffff7eb03e4 "POSIX1", __pyx_v_group=<optimized out>) at adios.cpp:6281
#3  __pyx_pf_5adios_6writer_2declare_group (__pyx_v_method_params=<optimized out>, __pyx_v_method=0x7ffff6bf0fa1 <__pyx_k_POSIX1> "POSIX1", __pyx_v_gname=<optimized out>, __pyx_v_self=0x7ffff7eeabd0)
    at adios.cpp:20921
#4  __pyx_pw_5adios_6writer_3declare_group (__pyx_v_self=<adios.writer at remote 0x7ffff7eeabd0>, __pyx_args=<optimized out>, __pyx_kwds=<optimized out>) at adios.cpp:20830
#5  0x00000000004c9e05 in call_function (oparg=<optimized out>, pp_stack=<optimized out>) at ../Python/ceval.c:4033
#6  PyEval_EvalFrameEx () at ../Python/ceval.c:2679
#7  0x00000000004c87a1 in PyEval_EvalCodeEx () at ../Python/ceval.c:3265
#8  0x00000000005030ef in PyEval_EvalCode (
    locals={'ad': <module at remote 0x7ffff7eaac90>, 'f': <adios.writer at remote 0x7ffff7eeabd0>, '__builtins__': <module at remote 0x7ffff7f7eb08>, '__file__': 'createGroup.py', '__package__': None, '__name__': '__main__', '__doc__': None}, 
    globals={'ad': <module at remote 0x7ffff7eaac90>, 'f': <adios.writer at remote 0x7ffff7eeabd0>, '__builtins__': <module at remote 0x7ffff7f7eb08>, '__file__': 'createGroup.py', '__package__': None, '__name__': '__main__', '__doc__': None}, co=0x7ffff7eeaab0) at ../Python/ceval.c:667
#9  run_mod.lto_priv () at ../Python/pythonrun.c:1371
#10 0x00000000004f8c72 in PyRun_FileExFlags () at ../Python/pythonrun.c:1357
#11 0x00000000004f7d77 in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:949
#12 0x00000000004982f2 in Py_Main () at ../Modules/main.c:640
#13 0x00007ffff6f12b45 in __libc_start_main (main=0x497d80 <main>, argc=2, argv=0x7fffffffdfe8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdfd8)
    at libc-start.c:287
#14 0x0000000000497ca0 in _start ()

Currently, newly created groups are not detected in the writer:

import adios as ad

f = ad.Writer("example.bp")

# segfaults:
# f.declare_group("group")

# create group implicitly
f.attrs["/group/testAttr"] = 3
g = f["/group"] # or "group" or "/group/"
# KeyError: '/group'

Also, fileObj.close() does not seem to work:

import adios as ad

f = ad.Writer("example.bp")
f.close()
# adios.pyx in adios.writer.close (adios.cpp:21719)()
# TypeError: expected string or Unicode object, NoneType found

ax3l commented 8 years ago

There seem to be some problems with reading nested groups. Example:

$ bpls -a example.bp
  real                /data/0/fields/FieldE/x                       {10240, 8320}
  real                /data/0/fields/FieldE/y                       {10240, 8320}
[...]
  unsigned long long  /data/0/particles/e/particles_info            {40}
  unsigned long long  /data/0/particles/i/particles_info            {40}
  real                /data/0/particles/e/position/x                {5646994}
  real                /data/0/particles/e/position/y                {5646994}
  real                /data/0/particles/e/momentum/x                {5646994}
[...]
  integer             /data/0/particles/i/globalCellIdx/x           {5646994}
  integer             /data/0/particles/i/globalCellIdx/y           {5646994}
  double              /data/0/fields/FieldE/x/sim_unit              attr
  double              /data/0/fields/FieldE/y/sim_unit              attr
[...]
  unsigned integer    /data/0/iteration                             attr
  unsigned integer    /data/0/sim_slides                            attr
  real                /data/0/delta_t                               attr
  real                /data/0/cell_width                            attr
  real                /data/0/cell_height                           attr
[...]

import adios as ad

f = ad.file("example.bp")

# OK:
g = f["data/0"]

# all these fail:
g = f["data"]
g = f["/data"]
g = f["/data/"]
g = f["data/0/"]
g = f["/data/0"]
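What I would expect is that every ancestor of a stored path counts as a group, and that leading or trailing slashes in the key do not matter. A minimal sketch of that behavior in plain Python, with hypothetical helper names and a shortened path list taken from the bpls output above (this is not how the wrapper is implemented):

```python
def group_names(paths):
    """Return every ancestor group implied by a list of flat paths."""
    groups = set()
    for path in paths:
        parts = path.strip("/").split("/")
        # Every proper prefix of a path is a group; the last part is the leaf.
        for depth in range(1, len(parts)):
            groups.add("/".join(parts[:depth]))
    return groups


def resolve_group(paths, key):
    """Map spellings such as '/data/' or 'data/0' to one canonical name."""
    name = key.strip("/")
    if name not in group_names(paths):
        raise KeyError(key)
    return name


paths = [
    "/data/0/fields/FieldE/x",
    "/data/0/particles/e/position/x",
    "/data/0/iteration",
]
for key in ["data", "/data", "/data/", "data/0/", "/data/0"]:
    print(key, "->", resolve_group(paths, key))
```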

jychoi-hpc commented 8 years ago

I have fixed for the nested groups. Can you try again?

jychoi-hpc commented 8 years ago

I have also fixed the segfault error with declare_group and the NoneType error on closing.

ax3l commented 8 years ago

> I have also fixed the segfault error with declare_group and the NoneType error on closing.

declare_group and close() work now, thank you!

> I have fixed for the nested groups.

The "data/0" example above works now, thanks!

> Can you try again?

The following example with the Writer class still does not work:

import adios as ad

f = ad.Writer("example.bp")

f.declare_group("group1")
g1 = f["group1"]
# KeyError: 'group1'

# create group implicitly
f.attrs["/group2/testAttr"] = 3
g2 = f["/group2"] # or "group2" or "/group2/"
# KeyError: '/group2'

# OK
a = f.attrs["/group2/testAttr"]
# fails
b = f.attrs["group2/testAttr"]
# KeyError: 'group2/testAttr'
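
To make the expected writer behavior concrete, here is a small stand-alone sketch of the bookkeeping I have in mind: groups are registered both explicitly and implicitly when an attribute is set, and all keys are normalized. `GroupRegistry` and its methods are hypothetical and only mimic the calls above; this is not the wrapper's code.

```python
class GroupRegistry:
    """Hypothetical bookkeeping for a writer; not the ADIOS wrapper's code."""

    def __init__(self):
        self.groups = set()
        self.attrs = {}

    @staticmethod
    def _norm(path):
        # "/group2/", "group2" and "/group2" all become "group2".
        return path.strip("/")

    def declare_group(self, name):
        self.groups.add(self._norm(name))

    def set_attr(self, path, value):
        key = self._norm(path)
        self.attrs[key] = value
        # Everything before the last "/" is an implicitly created group,
        # and so is each of its ancestors.
        parent = key.rpartition("/")[0]
        while parent:
            self.groups.add(parent)
            parent = parent.rpartition("/")[0]

    def get_attr(self, path):
        return self.attrs[self._norm(path)]

    def has_group(self, name):
        return self._norm(name) in self.groups


reg = GroupRegistry()
reg.declare_group("group1")
reg.set_attr("/group2/testAttr", 3)
print(reg.has_group("group1"), reg.has_group("/group2/"))
print(reg.get_attr("group2/testAttr"))
```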