ornladios / ADIOS

The old ADIOS 1.x code repository. Look for ADIOS2 for new repo
https://csmd.ornl.gov/adios
Other
54 stars 40 forks source link

Numpy: Slicing is missing the stepping #197

Open n01r opened 5 years ago

n01r commented 5 years ago

Dear all,

I noticed that when accessing an adios variable with [] for slicing in numpy style the stepping functionality is missing.

For numpy arrays you can access every Nth entry in a dimension by doing array[::N]. However, if I do the same with an adios variable (which I want to do before I load it into RAM and thus before it is converted to a numpy array) it seems that the following is happening:

adios_var[start:stop:step]
# seems to become the same as
adios_var[start:stop // step]

The usage of a third (step) variable is numpy-like but doesn't follow the numpy behaviour. Currently, neither documentation nor error message indicates a misuse.

n01r commented 5 years ago

cc'ing @PrometheusPi, @ax3l

ax3l commented 5 years ago

You really do not want to stride during reads anyway for performance. Read all into a temporary var and stride in RAM :)

wmles commented 5 years ago

Minimal "working" example, e.g. with the file from ax3l's issue #69

import adios as ad
f = ad.file("data_compressed.bp")
f["xy"][0, :6:2] == f["xy"][0, :3]

gives [True, True, True].

I think this should be considered a bug. The proper way of handling this should be raising a ValueError or a NotImplementedError (if it's possible that adios will support stepping at some time - which could be useful if you do not think about performance but the array does not fit into RAM - which was the case for Marco). The current implementation masks the problem, since the shape of the returned array is as expected, so in the worst case a user just doesn't notice that he got the wrong data!

A fix in python (the wrapper is not written in python, is it? I will post some pseudocode anyway^^) would be:

def __getitem__(self, key):
    if isinstance(key, slice) and slice.step is not None:
        raise NotImplementedError
    return super().__getitem__(key)
williamfgc commented 5 years ago

@n01r @ax3l @wmles we are currently supporting in ADIOS2 the Python low-level and high-level APIs. You can see then in action using Jupyter Notebooks running on MyBinder (see ReadMe): https://github.com/ornladios/ADIOS2-Jupyter

A few things about high-level APIs:

  1. We don't have slicing/stride numpy syntax, high-level APIs are write/read based and are selection/random access (np_array = read("var", start, count, step_start, step_count)) and/or step-by-step based (using for step in stream). As in the spirit of Python we heavily overload these 2 functions, but read always returns a numpy array. For examples see: https://github.com/ornladios/ADIOS2/blob/master/testing/adios2/bindings/python/TestBPWriteTypesHighLevelAPI.py
  2. We are maintaining these APIs and potentially adding more features. e.g. compression: https://github.com/ornladios/ADIOS2/blob/master/testing/adios2/bindings/python/TestBPZfpHighLevelAPI.py

Let me know if this is something of interest, ADIOS2 readers should be able to handle ADIOS1.x generated files. Thanks!