Closed stevenrbrandt closed 1 year ago
Hey @stevenrbrandt,
do I understand you correctly that you intend not to load the entire dataset with load_chunk()
, but just a small part of it?
For this, load_chunk
takes two optional parameters, the offset
and the extent
of the specific block to load, e.g. data = uu[f"{thorn}_{gf_name}"].load_chunk([64,64], [32,32])
for loading a block of size [32, 32]
starting at [64, 64]
.
The examples in our documentation might also help, in there you can find a description of how to use Python slices as an alternative to load_chunk()
:
chunk_data = E_x[1:3, 1:3, 1:2]
# print("Queued the loading of a single chunk from disk, "
# "ready to execute")
series.flush()
For more detailed questions on mesh refinement, I will have to defer to @ax3l
@franzpoeschel thanks, that gets me halfway there. Now how do I find the portion of the dataset that's actual data? I mean, I can see it after it's graphed, but I'd like to get it programmatically.
Do I assume correctly that you are working with .bp
files? (They are from the ADIOS2 backend which is the only one to properly support sparsely written datasets).
If yes, then E_x.available_chunks()
should give you the chunks that were actually written.
@ax3l Is this correct? Is that the only way to retrieve the written chunks? I don't see anything further in https://github.com/openPMD/openPMD-standard/pull/252/files.
Caution: I don't know which version of the openPMD-api you are using (you can find out via openpmd-ls --version
). If you use 0.15, then everything is fine, before that, there was a bug that might make E_x.available_chunks()
return no results, depending on your workflow.
Okay @franzpoeschel and @ax3l , I have put together a small tarball with a bp5 data file, a simple python3 script and a cmd.sh to run the script. It generates a png file. For some reason, openpmd-api thinks that the domain of the problem is 0-40 in x and y, when it's really 10-30 or so. I'm trying to figure out if (a) I'm doing something wrong (b) there's some functionality missing in the python version or (c) the file is generated incorrectly... or something else. https://cct.lsu.edu/~sbrandt/prob.tgz
@stevenrbrandt I am not able to run this script which you use for data processing.
Traceback (most recent call last):
File "/home/franz/Downloads/prob/plot-one.py", line 49, in <module>
vmin = np.min(data[data_index])
NameError: name 'data' is not defined
However I notice that the variable chunks
defined as chunks = data_raw.available_chunks()
is used nowhere in the script, but instead you still load the entire datasets by data = data_raw.load_chunk()
, including undefined regions.
The chunks seem to be properly available in the dataset that you uploaded:
> bpls -D wave2dx.it00000004.bp5/
double /data/4/meshes/carpetx_regrid_error_lev00/carpetx_regrid_error {2, 83, 83}
step 0:
block 0: [0:0, 1:80, 1:80]
double /data/4/meshes/carpetx_regrid_error_lev01/carpetx_regrid_error {3, 163, 163}
step 0:
block 0: [0:1, 33:128, 33:128]
double /data/4/meshes/wavetoynrpy_anagf_lev00/wavetoynrpy_anagf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:80]
double /data/4/meshes/wavetoynrpy_anagf_lev01/wavetoynrpy_anagf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:128]
double /data/4/meshes/wavetoynrpy_regrid_errorgf_lev00/wavetoynrpy_regrid_errorgf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:81]
double /data/4/meshes/wavetoynrpy_regrid_errorgf_lev01/wavetoynrpy_regrid_errorgf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:129]
double /data/4/meshes/wavetoynrpy_rhs_uugf_lev00/wavetoynrpy_rhs_uugf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:80]
double /data/4/meshes/wavetoynrpy_rhs_uugf_lev01/wavetoynrpy_rhs_uugf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:128]
double /data/4/meshes/wavetoynrpy_rhs_vvgf_lev00/wavetoynrpy_rhs_vvgf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:80]
double /data/4/meshes/wavetoynrpy_rhs_vvgf_lev01/wavetoynrpy_rhs_vvgf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:128]
double /data/4/meshes/wavetoynrpy_uugf_lev00/wavetoynrpy_uugf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:80]
double /data/4/meshes/wavetoynrpy_uugf_lev01/wavetoynrpy_uugf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:128]
double /data/4/meshes/wavetoynrpy_vvgf_lev00/wavetoynrpy_vvgf {2, 83, 83}
step 0:
block 0: [0:1, 1:81, 1:80]
double /data/4/meshes/wavetoynrpy_vvgf_lev01/wavetoynrpy_vvgf {3, 163, 163}
step 0:
block 0: [0:2, 33:129, 33:128]
I am able to retrieve this information with the openPMD-api:
>>> import openpmd_api as io
>>> s = io.Series("wave2dx.it00000004.bp5", io.Access.read_only)
>>> component = s.iterations[4].meshes['wavetoynrpy_uugf_lev01']['wavetoynrpy_uugf']
>>> component.available_chunks()
[<openPMD.WrittenChunkInfo of dimensionality 3'>]
>>> first_chunk = component.available_chunks()[0]
>>> first_chunk.offset
[0, 33, 33]
>>> first_chunk.extent
[3, 97, 96]
>>> data = component.load_chunk(first_chunk.offset, first_chunk.extent)
>>> s.flush()
>>> print(data)
[[[ 0.92511896 0.9339786 0.94348059 ... -0.94348059 -0.9339786
-0.92511896]
[ 0.93284628 0.94377486 0.95335251 ... -0.95335251 -0.94377486
-0.93284628]
[ 0.94010273 0.95221255 0.96128479 ... -0.96128479 -0.95221255
-0.94010273]
...
[-0.94010273 -0.95221255 -0.96128479 ... 0.96128479 0.95221255
0.94010273]
[-0.93284628 -0.94377486 -0.95335251 ... 0.95335251 0.94377486
0.93284628]
[-0.92511896 -0.9339786 -0.94348059 ... 0.94348059 0.9339786
0.92511896]]
[[ 0.92511896 0.9339786 0.94348059 ... -0.94348059 -0.9339786
-0.92511896]
[ 0.93284628 0.94377486 0.95335251 ... -0.95335251 -0.94377486
-0.93284628]
[ 0.94010273 0.95221255 0.96128479 ... -0.96128479 -0.95221255
-0.94010273]
...
[-0.94010273 -0.95221255 -0.96128479 ... 0.96128479 0.95221255
0.94010273]
[-0.93284628 -0.94377486 -0.95335251 ... 0.95335251 0.94377486
0.93284628]
[-0.92511896 -0.9339786 -0.94348059 ... 0.94348059 0.9339786
0.92511896]]
[[ 0.92511896 0.9339786 0.94348059 ... -0.94348059 -0.9339786
-0.92511896]
[ 0.93284628 0.94377486 0.95335251 ... -0.95335251 -0.94377486
-0.93284628]
[ 0.94010273 0.95221255 0.96128479 ... -0.96128479 -0.95221255
-0.94010273]
...
[-0.94010273 -0.95221255 -0.96128479 ... 0.96128479 0.95221255
0.94010273]
[-0.93284628 -0.94377486 -0.95335251 ... 0.95335251 0.94377486
0.93284628]
[-0.92511896 -0.9339786 -0.94348059 ... 0.94348059 0.9339786
0.92511896]]]
@franzpoeschel my code would have run if you'd used the cmd.sh script to do it... as it was, you tripped over an unimportant bug. In any event, your answer helped me figure out what I needed to do, although it took a little experimentation. If you want to see the updated version of the script, it's here: https://gist.githubusercontent.com/stevenrbrandt/ef3718d52680d8a4e6accbad198a0b42/raw/757aebb76ea49df472abaa1aac11f5ec49a5c916/plot-zslice.py
What do you want to achieve, please describe.
I have a dataset with mesh refinement. It's just a 2D wave equation test. The grid goes from 0 to 40 in both x and y.
Even though the refined part of the grid (lev01) is a small box around (x,y)=(20,20), the offset and spacing in the chunk describe the entire grid from 0 to 40 in x and y. The valid data (the small box) is clearly visible in the middle of the screen, and I see random noise everywhere else. I'd like to get the data from just the small box. Thanks.
Python:
or C++
You can also add images such as screenshots :-)
Software Environment: Have you already installed openPMD-api? yes If so, please tell us which version of openPMD-api your question is about: