Closed dreamer2368 closed 9 months ago
@dylan-copeland , After looking throughout the code, I'd like to suggest that we work on a separate PR for removing the concept of time intervals.
Even though the multiple time intervals are never used in practice, num_time_intervals
or d_num_intervals_written
are widely used in many classes and dataset names in hdf5 format. Since this concept of time interval has been so engrained into the code, it will take quite some time to remove it while keeping the consistency of the code. We will also have to discuss how we change the format of dataset name, which will have quite a big impact in backward compatibility.
@dylan-copeland , per our discussion, I added a commit that spits an error message whenever a user attempts to increase time interval, directly or indirectly through BasisGenerator
or SVD
class. Technically it still returns an error without this message, but this message will guide the user to increase the maximum number of samples.
Now unit_tests/test_SVD.cpp
is removed from ci workflow. This test checks nothing but the concept of time interval, which is obsolete from now on.
I thought about the data format of basis classes, and ended up keeping the current implementation, allowing both csv and hdf5 format. Currently both data formats are compatible, and no reason to remove a functionality that we can support without any complication.
With that, I think this PR is ready for review and merge.
@dylan-copeland ,
In my work for PR #274 , I ended up removing the concept of time interval. I moved those changes to here since it fits more to this PR.
While the inside of all functions do not use the concept of time intervals, some function signatures remain in order to keep the backward compatibility:
BasisGenerator::takeSample
used double time
only for determining the time interval. Now time
is a dummy variable and not used in the function.
BasisReader
used double time
for determining the time interval, but now time
is a dummy variable.
getSpatialBasis
getTemporalBasis
getSingularValues
getSnapshotMatrix
Not exactly sure if we'd want to change these function signatures right away, which will impact all applications using libROM. Need more discussion about this @chldkdtn @ckendrick
Dataset names for basis are now updated, removing the trailing 6 digits of integer that used to indicate time interval.
Any out-dated snapshot/basis hdf5 files can be updated using scripts/data/update_basis_formats.py
. The syntax for using this python script is:
python3 update_basis_format.py filename
This basis/snapshot format update is tested with unit_tests/test_basis_conversion.cpp
. In .github/workflows/run_tests/action.yml
, outdated basis/snapshot files are first converted with the python script, and checked if the BasisReader
class can read the updated files.
@ebchin ,
After some discussion throughout this PR, we decided to change the dataset names within snapshot/basis hdf5 files. I heard from @dylan-copeland that you might have data files for your production and your simulation can be impacted by this update. I prepared a python script scripts/data/update_basis_formats.py
, which can update the dataset names of the out-dated hdf5 snapshot/basis files. This script is also tested in the ci workflow. Please feel free to bring up any suggestion/concerns.
Closing files at the end of
BasisWriter::writeBasis
This PR revamps #150 . Now closing the database file is not associated with the destruction of
BasisWriter
. This enables writing both the basis and the snapshot at a given moment. As far as I believe, this should keep the backward compatibility as well, but it needs other people's review as we do not have a regression test to check this.Database
class prints out the message when creating/opening a file.BasisWriter
class initializes/destroys the memberDatabase* d_database
andDatabase* d_snap_database
at its initialization/destruction, respectively. This, however, does not involve a file creation/opening/close.putInteger
operation in~BasisWriter
is moved intoBasisWriter::writeBasis
.BasisWriter::writeBasis
now closes the file at the end of the function call.Removal of time interval
The concept of time interval is obsolete, and it hasn't been used in practice. This is removed, addressing #267 . While the inside of all functions do not use the concept of time intervals, some function signatures remain in order to keep the backward compatibility:
BasisGenerator::takeSample
useddouble time
only for determining the time interval. Nowtime
is a dummy variable and not used in the function.BasisReader
useddouble time
for determining the time interval, but nowtime
is a dummy variable.getSpatialBasis
getTemporalBasis
getSingularValues
getSnapshotMatrix
The concept of time interval is obsolete, and it hasn't been used in practice. This will be removed in future (#267 ), and for now we enforce the number of interval not to be more than 1.Options
class will return error ifmax_time_intervals > 1
. This will be kept until we actually remove the time interval concept.BasisWriter::writeBasis
assumes to newly create the file every time it is called.HDF5Database::putIntegerArray
does not overwrite the dataset. If the datasetnum_time_intervals
is attempted to be re-written again, then an error will be returned.