Closed benchislett closed 5 years ago
This PR removes the Makefile completely for a pure CMake build, right? Right.
I haven't had much time to look at the changes in here, but from what I have seen things look good to me. As James said earlier, ensuring backwards compatibility is a good idea, and an additional script for converting the ASCII data to HDF5 is possible (something in Python might be quick and easy for this).
As for performance, AFAIK we don't have any regression testing anywhere (CI/CD is hard for CUDA after all, and expensive). But have there been any runs of the original code and this version matching apples to apples simulation performance?
I am getting a roughly 25% speedup. The more IO we do, the faster this is.
I am currently trying to replicate some 3D results, but I am having some trouble. Not sure if this is on my end or if there is something going wrong with the output file. I'm going to look into this more tomorrow.
I think having a script to allow for the user to easily generate the H5 input is essential. I'll also look at this tomorrow. It shouldn't be too hard.
So I tested inputting an incomplete set of data (that might be created from a separate program) by writing the following python script:
import h5py
import numpy as np


def write_wfc(data_im, data_real, xDim, yDim, zDim, gstate,
              wfc_idx, i, output_filename="../data/output.h5"):
    # Load the real and imaginary parts from their ASCII files.
    lines_real = np.loadtxt(data_real)
    lines_im = np.loadtxt(data_im)
    print(len(lines_real))
    wfc_real = np.reshape(lines_real, (xDim, yDim, zDim))
    wfc_im = np.reshape(lines_im, (xDim, yDim, zDim))
    wfc = wfc_real + 1j * wfc_im
    # "w" truncates any existing file; the context manager closes it on exit.
    with h5py.File(output_filename, "w") as f:
        f["/WFC/{}/{}".format("CONST" if gstate else "EV", i)] = wfc


write_wfc("../data_2D_example/wfc_evi_0", "../data_2D_example/wfc_ev_0",
          512, 512, 1, False, 0, 0,
          output_filename="../data_2D_example/output.h5")
After running ./gpue -x 512 -y 512 -g 10001 -p 10000 -e 1 -W -r "data_2D_example/output.h5", I got the following error:
Argument for x is given as 512
Argument for y is given as 512
Argument for Groundsteps is given as 1.000100E+04
Argument for Printout is given as 10000
Argument for EvSteps is given as 1.000000E+00
Writing out
Reading data from file.
Start: Wed Aug 21 15:09:15 2019
Loading attribute gSize with value 65536
Loading attribute plan_1d with value 2
Loading attribute plan_2d with value 1
Loading attribute plan_3d with value 4
Loading attribute plan_dim2 with value 1585669764
Loading attribute plan_dim3 with value 22038
Loading attribute plan_other2d with value 3
Loading attribute zDim with value 1
ERROR: could not find string gstate in Grid::param_bool.
Segmentation fault
I guess this is because we don't default to imaginary-time evolution? Most other defaults seem to be set correctly.
Ok, I am happy to merge this now. I think everyone has had time to have their say.
The Goal
Have file output be done in a way that is both compact, and easy to use. Storing data as strings is suboptimal, and writing out many files per timestep is very inefficient.
The Proposal
By using HDF5, we can efficiently store and even compress the data on disk. Further, there are bindings for I/O in many popular languages, so ease of access will not be greatly reduced.
The Method
At compile time, pull down hdf5 and install it automatically within the repo. Then, link to the binaries with CMake. Finally, rewrite the FileIO namespace to write all data into a single output file, which is closed when the program terminates.
The Implementation
CMake and Refactor
@mlxd has refactored the file structure and CMake build process in the branch cmake_build. These changes were merged into this branch very early on, as they were essential to the hdf5 build process.
HDF5 Installation
The build script is a simple bash script stored in the bin folder. It downloads the HDF5 archive with wget and installs it within the bin folder. The directory structure is directly linked in CMakeLists.txt and HDF5 is added with find_package.
FileIO Restructuring
Output
Instead of writing out data with generically typed writeOut{Int,Bool,Double,...} functions, we create a generic helper function to write a dataset to the output file and wrap it in functions for each element we want to write. That is, write1d writes a generic dataset from contiguous memory, writeNd reshapes a vector of pointer arrays into a contiguous block and defers to write1d, and writeOutWfc and the like all wrap writeNd.
The structure of the output file is similar to that of a directory system, with Groups analogous to folders, and DataSets to files.
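The helper layering described above (write1d, then writeNd, then element-specific wrappers) can be sketched roughly as follows. This is only an illustration in Python, not the C++ code from this PR: the dict standing in for the output file, the snake_case function names, and the signatures are all assumptions. The /WFC/CONST and /WFC/EV paths are taken from the conversion script posted earlier in this thread.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-in for the HDF5 output file: maps a dataset path to
# (shape, flat data). The real code writes through the HDF5 library instead.
OutputFile = Dict[str, Tuple[Tuple[int, ...], List[complex]]]


def write_1d(out: OutputFile, path: str, data: Sequence[complex],
             shape: Tuple[int, ...]) -> None:
    """Write one dataset from an already contiguous block of values."""
    expected = 1
    for extent in shape:
        expected *= extent
    if len(data) != expected:
        raise ValueError(f"{path}: got {len(data)} values, shape needs {expected}")
    out[path] = (shape, list(data))


def write_nd(out: OutputFile, path: str,
             blocks: Sequence[Sequence[complex]]) -> None:
    """Pack several equally sized arrays into one block, then defer to write_1d."""
    if not blocks:
        raise ValueError(f"{path}: nothing to write")
    flat = [value for block in blocks for value in block]
    write_1d(out, path, flat, (len(blocks), len(blocks[0])))


def write_out_wfc(out: OutputFile, wfc: Sequence[Sequence[complex]],
                  gstate: bool, i: int) -> None:
    """Element-specific wrapper: it only decides where the wavefunction goes."""
    group = "CONST" if gstate else "EV"
    write_nd(out, f"/WFC/{group}/{i}", wfc)


out: OutputFile = {}
write_out_wfc(out, [[1 + 0j, 2 + 0j], [3 + 0j, 4 + 0j]], gstate=False, i=0)
```

With this layering, the wrappers stay one-liners that only choose a path, and all shape checking and writing happens in one place.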
There are also attributes stored at root and in the ENERGY group, representing the par object and the energy at a given timestep respectively.
Input
In order to continue simulations from an output file, we load the hdf5 data instead of creating it. Groups specifically are loaded from the file initially, along with the par attributes and the most recent wfc. Later on, the A series (Ax, Ay, Az) are loaded when needed.
When writing to a DataSet, we now need a way to know whether we should open it from the file (when loading) or create a new one. This is determined by checking whether the DataSet exists in the file, and opening or creating it accordingly. For simplicity, we also store an unordered_map of the DataSets so that we don't have to open/close them each time.
Unchanged functionality
Some vortex manipulation still writes data directly to a file, untouched by this feature. This is because of the dragons which guard the code, and the legacy functionality. Only the edges are currently written to the hdf5 output file, though this may change in future releases.
Additional Changes
In order for the loading to work correctly, some changes were made to the Grid class. Namely, we now store a default mapping for the default values of some parameters, which are checked iff the value does not exist in the central map. This way, we can implement a store_only method which stores a value iff it has not previously been set. With this, we can load the parameters from file after parsing without fear of overwriting explicitly set values.
Also, gstate has been changed from an integer to a boolean, and the resulting fallout has been fixed so that the new representation (true=imaginary, false=real) is used.
A separate iteration counter has been added for each evolution type, tracking that type's most recent iteration number. This is used primarily for continuation of previous simulations via the -r flag, but is also a handy marker to know where we are in the simulation, and could be used in the future for monitoring execution and performance. Because of this, and the new way of continuing simulations, step_offset has been removed entirely.
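To make the defaults and store_only behaviour concrete, here is a minimal Python sketch of the lookup order. It is not the Grid implementation: the ParamStore class, its method names other than store_only, and the default value chosen for gstate are all hypothetical and only serve to show that explicitly set values win over values loaded from file, which in turn win over defaults.

```python
class ParamStore:
    """Hypothetical, simplified stand-in for the parameter maps in Grid."""

    def __init__(self, defaults):
        self._defaults = dict(defaults)  # consulted only as a fallback
        self._values = {}                # the "central map" of set values

    def store(self, key, value):
        """Always overwrite: used for explicitly given values such as CLI flags."""
        self._values[key] = value

    def store_only(self, key, value):
        """Store only if the key has not been set yet; never overwrite."""
        if key not in self._values:
            self._values[key] = value

    def get(self, key):
        """Return the set value, else the default, else fail loudly."""
        if key in self._values:
            return self._values[key]
        if key in self._defaults:
            return self._defaults[key]
        raise KeyError(f"could not find {key!r}")


# The default below is arbitrary and chosen only for illustration.
params = ParamStore(defaults={"gstate": False})
params.store("xDim", 512)  # parsed from the command line first

# Attributes read from the file afterwards must not clobber explicit values.
for key, value in {"xDim": 256, "zDim": 1}.items():
    params.store_only(key, value)
```

Here xDim keeps the explicitly parsed value, zDim comes from the file, and gstate falls back to the default because neither source set it.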