LLNL / Silo

Mesh and Field I/O Library and Scientific Database
https://silo.llnl.gov
Other
30 stars 23 forks source link

min_extents and max_extents mangled with Python interface #378

Open gregfi opened 5 months ago

gregfi commented 5 months ago

I'm attempting to use the Python interface to write a new Silo file that contains modifications to an existing Silo file. I'm using Python 3.11 and Silo 4.11.1. My code looks something like the following:

importedSilo = Silo.Open('input.silo', Silo.DB_READ)
outputSilo = Silo.Create('output.silo', 'test_output', Silo.DB_HDF5, Silo.DB_CLOBBER)   

mesh = importedSilo.GetVarInfo('mesh', 1)
outputSilo.WriteObject('mesh', mesh)

importedSilo.Close()
outputSilo.Close()                                                             

When I inspect the input and output data sets, for the min_extents and max_extents metadata, I see:

INPUT SILO:
  min_extents: (0.0, 0.0, 0.0)
  max_extents: (126.88, 6.283185307179586, 443.0)

OUTPUT SILO:
  min_extents1: 0.0
  min_extents2: 0.0
  min_extents3: 0.0
  max_extents1: 126.88
  max_extents2: 5.79923737063e-313
  max_extents3: 1.230757558913e-312

Is there a way to preserve min_extents and max_extents data?

markcmiller86 commented 5 months ago

Hmmm...I am unable to duplicate on python 2 or 3 using globe.silo example data. Can you try duplicating with that dataset?

globe.silo.gz

If it is practical, can you attach your input.silo here?

gregfi commented 5 months ago

Your example seems to work fine for me. Do you have anything with a DBquadmesh?

I've attached a small example that reproduces my issue.

box_in_a_box.silo.gz

(Interestingly, instead of a tuple for min_extents and max_extents, my small example seems to have min_extents1, min_extents2, etc.)

When I perform the conversion and inspect with my example, I see:

INPUT SILO:
  min_extents1: -4.0
  min_extents2: -4.0
  min_extents3: -4.0
  max_extents1: 4.0
  max_extents2: 4.0
  max_extents3: 4.0

OUTPUT SILO:
  min_extents1: -4.0
  min_extents2: 0.0
  min_extents3: 0.0
  max_extents1: 4.0
  max_extents2: 0.0
  max_extents3: 0.0
markcmiller86 commented 5 months ago

Do you have anything with a DBquadmesh?

Sure. Thats a good hint. Lemme go try a database that has a quadmesh.

markcmiller86 commented 5 months ago

Ok, thanks. That was it. I am duplicating the issue now. Looks like a bug in the python driver code.

As a work-around, you can write the extents as raw data to the file and after you read back the mesh itself (with the bad extents), you can then overwrite those values with the raw data extents you wrote to the file separately. Does that make sense? Its non-ideal but this is a clear bug that won't be fixed until you have the next release of Silo...or the patch I commit that you can build from.

gregfi commented 5 months ago

I think I understand. Basically, doing:

mesh = importedSilo.GetVarInfo('mesh', 1)
mesh['min_extents'] = (-4.0, -4.0, -4.0)
mesh['max_extents'] = (4.0, 4.0, 4.0)
mesh.pop('min_extents1', None)
mesh.pop('min_extents2', None)
mesh.pop('min_extents3', None)
mesh.pop('max_extents1', None)
mesh.pop('max_extents2', None)
mesh.pop('max_extents3', None)

Something like that?

markcmiller86 commented 5 months ago

I don't think I mean that quite. What I mean is use...

outputSilo.Write("/mesh_min_extents", mesh['min_extents'])
outputSilo.Write("/mesh_min_extents", mesh['min_extents'])

Then on readback, do this...

mesh = outputSilo.GetVarInfo("mesh",1)
mesh['min_extents'] = outputSilo.GetVar("/mesh_min_extents")
mesh['max_extents'] = outputSilo.GetVar("/mesh_max_extents")

So, on read back, you initially get bad data back in the mesh for min/max extents but then you go read the data you wrote to the file to correct that. Of course, outputSilo in my read-back code is at that point the file you created and want to read back somewhere.

gregfi commented 5 months ago

Ah, ok, but in this case there's a downstream code that is going to be reading the Silo file and looking at the min_extents and max_extents metadata. If that's the case, the work-around above won't work unless we modify the downstream code, correct?

markcmiller86 commented 5 months ago

Ah, ok, but in this case there's a downstream code that is going to be reading the Silo file and looking at the min_extents and max_extents metadata. If that's the case, the work-around above won't work unless we modify the downstream code, correct?

Well, yeah, that is correct. I looked and the problem is in the python writer code. So, the file itself has garbage data for these values. If you are reading through the C interface in the downstream code, the caller will get garbage data for those extents. If you are reading throug the python interface in the downstream code, its a couple of extra lines to "patch up" the problem.

Lets see...I can think of another approach that would fix the written file...using h5py to overwrite the values stored in the file. That way, anything downstream will be none the wiser.

Again, this is all just a work-around before I get the Silo code fixed. I can probably get you a patch you could try by the end of the week.

markcmiller86 commented 5 months ago

Lets see...I can think of another approach that would fix the written file...using h5py to overwrite the values stored in the file. That way, anything downstream will be none the wiser.

You could also use Silo's browser tool to do that...its probably easier that way...have the python code turn around and exec a shell command using browser that fixes those extents in the written file before you give it to anyone. I will demonstrate with an example shortly.

markcmiller86 commented 5 months ago

Sorry..I have been delayed with other fires. Hope to reply by tomorrow.

markcmiller86 commented 5 months ago

I am noticing if you switch to the PDB driver, the issue is not happening. Can you do that?

markcmiller86 commented 5 months ago

Ok, here is the code I was trying to make work. I generalized it a bit from what you had. But, this is how you could correct the created file using browser. I was running this in a lib of a Silo install. That said, it looks like HDF driver is still having trouble getting those extents correct. I think the only solution (apart from a fix to silo) is to fix via lower-level such as h5py to correct the written file.

import Silo
import subprocess

# Open original file and get mesh and extents
db = Silo.Open("/Users/miller86/visit/visit/data/silo_pdb_test_data/rect3d.silo")
print(db.GetToc())
mesh = db.GetVarInfo("quadmesh3d",1)
original_mins = mesh['min_extents']
original_maxs = mesh['max_extents']
print("original_mins", original_mins)
print("original_maxs", original_maxs)

# Open output file and write the mesh to it (it will wind up corrupting extents)
driver = Silo.DB_HDF5
outputSilo = Silo.Create('output.silo', 'test_output', driver, Silo.DB_CLOBBER)   
outputSilo.WriteObject('mesh', mesh)
db.Close()
outputSilo.Close()

# Open file and read mesh and confirm corrupted extents
db = Silo.Open("output.silo")
mesh = db.GetVarInfo("mesh",1)
if driver == Silo.DB_HDF5:
    print(mesh['min_extents1'], mesh['min_extents2'], mesh['min_extents3'])
    print(mesh['max_extents1'], mesh['max_extents2'], mesh['max_extents3'])
else:
    print(mesh['min_extents'])
    print(mesh['max_extents'])
db.Close()

# Use Silo's browser to fix corrupted textents
command = ["../bin/browser"]
command += ["-W"] # open file for write
if driver == Silo.DB_HDF5:
    command += ["-l", "1"] # set low-level variable to level 1
    command += ["-e", "mesh.min_extents1=%g"%original_mins[0]] # overwrite min1
    command += ["-e", "mesh.min_extents2=%g"%original_mins[1]] # overwrite min2
    command += ["-e", "mesh.min_extents3=%g"%original_mins[2]] # overwrite min2
    command += ["-e", "mesh.max_extents1=%g"%original_maxs[0]] # overwrite min1
    command += ["-e", "mesh.max_extents2=%g"%original_maxs[1]] # overwrite min2
    command += ["-e", "mesh.max_extents3=%g"%original_maxs[2]] # overwrite min2
    command += ["output.silo"]
else:
    command += ["-e", "mesh_min_extents={%g,%g,%g}"%original_mins] # overwrite mins
    command += ["-e", "mesh_max_extents={%g,%g,%g}"%original_maxs] # overwrite mins
print(command)
result = subprocess.run(command, capture_output=True, text=True)
print("Return code:", result.returncode)
print("Output:", result.stdout)
print("Error (if any):", result.stderr)

# Open file and read mesh and confirm correct extents
db = Silo.Open("output.silo")
mesh = db.GetVarInfo("mesh",1)
if driver == Silo.DB_HDF5:
    print(mesh['min_extents1'], mesh['min_extents2'], mesh['min_extents3'])
    print(mesh['max_extents1'], mesh['max_extents2'], mesh['max_extents3'])
else:
    print(mesh['min_extents'])
    print(mesh['max_extents'])
db.Close()
markcmiller86 commented 5 months ago

Ok, I figured out how to do it same with h5py...

import Silo
import h5py

# Open original file and get mesh and extents
db = Silo.Open("/Users/miller86/visit/visit/data/silo_pdb_test_data/rect3d.silo")
print(db.GetToc())
mesh = db.GetVarInfo("quadmesh3d",1)
original_mins = mesh['min_extents']
original_maxs = mesh['max_extents']
print("original_mins", original_mins)
print("original_maxs", original_maxs)

# Open output file and write the mesh to it (it will wind up corrupting extents)
outputSilo = Silo.Create('output.silo', 'test_output', Silo.DB_HDF5, Silo.DB_CLOBBER)   
outputSilo.WriteObject('mesh', mesh)
db.Close()
outputSilo.Close()

# Open file and read mesh and confirm corrupted extents
db = Silo.Open("output.silo")
mesh = db.GetVarInfo("mesh",1)
print(mesh['min_extents1'], mesh['min_extents2'], mesh['min_extents3'])
print(mesh['max_extents1'], mesh['max_extents2'], mesh['max_extents3'])
db.Close()

# Open the HDF5 file in read-write mode with h5py
file_path = 'output.silo'
with h5py.File(file_path, 'r+') as hdf5_file:

    # Access the dataset or group containing the "mesh" datatype
    mesh_group = hdf5_file['mesh']

    # Read the "silo" attribute
    silo_scalar = mesh_group.attrs['silo']
    print(f"Original silo scalar value: {silo_scalar}")
    modified_silo_scalar = silo_scalar
    modified_silo_scalar[4] = original_maxs

    # Write the modified "silo scalar" attribute back to the file
    mesh_group.attrs['silo'] = modified_silo_scalar
    print(f"Modified silo scalar value: {mesh_group.attrs['silo']}")

# The file is automatically closed when exiting the with block

# Open file and read mesh and confirm correct extents
db = Silo.Open("output.silo")
mesh = db.GetVarInfo("mesh",1)
print(mesh['min_extents1'], mesh['min_extents2'], mesh['min_extents3'])
print(mesh['max_extents1'], mesh['max_extents2'], mesh['max_extents3'])
db.Close()