netket / netket

Machine learning algorithms for many-body quantum systems
https://www.netket.org
Apache License 2.0

Add support for a single-file HDF5 log #1200

Closed: femtobit closed this issue 2 years ago

femtobit commented 2 years ago

[This issue is part of UnitaryHack and comes with a bounty of $75]

Context

NetKet simulation drivers support output of the current state of an optimization as well as expectation values of observables and custom data via the classes provided in netket.logging.

Currently, NetKet has two main logging implementations:

  1. JsonLog, which is the standard logger and writes log data to a JSON file (and also saves a regularly overwritten snapshot of the current network parameters as a MessagePack file).
  2. StateLog, which saves intermediate network parameters as a separate file for each step (1.mpack, 2.mpack, 3.mpack, etc.) to a folder or ZIP file.

While these work, writing simulation output to a single file in the commonly used HDF5 format (via h5py) would make data handling easier and improve interoperability with other tools.

Implementation notes

To resolve this issue, the following should be implemented:

  1. A new logger class netket.logging.HDF5Log which writes both the information currently contained in JsonLog and the network parameters (at each step or every fixed number of steps, configurable as in StateLog) to an HDF5 file specified by the user.
  2. The logger needs to be compatible with the current NetKet logging interface, so that it can be used in place of JsonLog and StateLog in NetKet drivers. (For this PR this means compatibility with the current use of the loggers in netket.driver.AbstractVariationalDriver.) Specifically:
    • The logger needs to implement __call__(self, step, log_data, state), which is called at each optimization step and writes the provided data to the HDF5 log file.
    • Since the number of steps (i.e., number of calls to __call__) is not known before the start of the simulation, the log must support appending data at every step (therefore, the datasets within the HDF5 file need to be resized as necessary).
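
The append-and-resize requirement can be sketched with h5py's resizable datasets. This is only a minimal illustration, not the intended HDF5Log implementation: the helper name append_row, the file name sketch.h5, and the in-memory "core" driver are assumptions made for the example.

```python
import h5py
import numpy as np


def append_row(group, name, value):
    """Append `value` as a new row of `group[name]`, creating the dataset if needed.

    Hypothetical helper for illustration; not part of NetKet or h5py.
    """
    value = np.asarray(value)
    if name not in group:
        # maxshape=(None, ...) leaves the first axis unlimited so it can grow.
        group.create_dataset(
            name,
            shape=(0,) + value.shape,
            maxshape=(None,) + value.shape,
            dtype=value.dtype,
            chunks=True,  # resizable datasets must use chunked storage
        )
    dset = group[name]
    n = dset.shape[0]
    dset.resize(n + 1, axis=0)  # grow by one row along the step axis
    dset[n] = value


# In-memory file (nothing is written to disk), used only for this demonstration.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    acceptance = f.create_group("data/acceptance")
    for step in range(5):
        append_row(acceptance, "iters", step)
        append_row(acceptance, "values", 0.5 + 0.01 * step)
    iters = acceptance["iters"][...]
    values = acceptance["values"][...]
```

Resizing by one row per call keeps the sketch short; a real logger would probably buffer several steps and flush them together to reduce the number of resize operations.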

Note that log_data is a dictionary mapping a name to a specific logged quantity. The value can be of several different types. The HDF5Log should support scalar numbers, NumPy/JAX arrays, and netket.stats.Stats objects.
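
One possible way to treat the different value types uniformly is to map every entry of log_data to a dictionary of named arrays before writing. The sketch below is an assumption about how this could look: FakeStats is a hypothetical stand-in for netket.stats.Stats (so the example runs without NetKet), to_fields is a made-up helper, and the field names are taken from the example layout in this issue.

```python
from dataclasses import dataclass

import numpy as np

# Field names taken from the example layout in this issue.
STATS_FIELDS = ("mean", "error_of_mean", "variance", "tau_corr", "R_hat")


@dataclass
class FakeStats:
    """Hypothetical stand-in for netket.stats.Stats, for illustration only."""

    mean: complex
    error_of_mean: float
    variance: float
    tau_corr: float
    R_hat: float


def to_fields(value):
    """Map one logged value to a dict {dataset name: NumPy array}."""
    # Duck-typing: anything exposing all Stats fields is treated as a Stats object.
    if all(hasattr(value, name) for name in STATS_FIELDS):
        return {name: np.asarray(getattr(value, name)) for name in STATS_FIELDS}
    # Scalars and arrays end up in a single dataset called "values".
    return {"values": np.asarray(value)}


log_data = {
    "Energy": FakeStats(-1.5 + 0j, 0.01, 0.2, 0.5, 1.001),
    "acceptance": 0.43,
}
fields = {key: to_fields(val) for key, val in log_data.items()}
```

Each entry of fields[key] would then correspond to one dataset under /data/<key>/ in the file.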

NetKet stores network parameters as JAX pytrees with leaves being complex-valued or real-valued arrays. The HDF5Log should store a flattened version (as returned by netket.jax.tree_ravel) in a single dataset of shape (n_steps, n_parameters).
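
To illustrate the target shape, here is a NumPy-only sketch of flattening a nested dictionary of arrays into one vector. It merely mimics the idea and is not netket.jax.tree_ravel itself; the helper ravel_tree, the parameter names, and the sorted-key ordering are assumptions of this example.

```python
import numpy as np


def ravel_tree(tree):
    """Flatten a nested dict of arrays into one 1D vector.

    Hypothetical helper; dict keys are visited in sorted order.
    """
    if isinstance(tree, dict):
        leaves = [ravel_tree(tree[key]) for key in sorted(tree)]
        return np.concatenate(leaves) if leaves else np.zeros(0)
    return np.ravel(np.asarray(tree))


# Made-up parameters: a (4, 2) kernel and a length-2 bias, 10 numbers in total.
params = {
    "Dense": {
        "kernel": np.ones((4, 2), dtype=complex),
        "bias": np.zeros(2, dtype=complex),
    },
}
flat = ravel_tree(params)

# Stacking one flat vector per logged step gives shape (n_steps, n_parameters).
history = np.stack([flat, 2 * flat])
```

With this layout, row i of /parameters/values holds the full parameter vector at the step recorded in /parameters/iters[i].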

Here is an example layout, showing what a resulting HDF5 log file should contain after 1001 logging steps for a network with 256 variational parameters:

Name                            Data type       Shape
# network parameters:
/parameters/iters               int             (1001,)
/parameters/values              complex128      (1001, 256)
# an entry of `log_data` called `Energy` of type `netket.stats.Stats`
/data/Energy/iters              int             (1001,)
/data/Energy/mean               complex128      (1001,)
/data/Energy/error_of_mean      float64         (1001,)
/data/Energy/variance           float64         (1001,)
/data/Energy/tau_corr           float64         (1001,)
/data/Energy/R_hat              float64         (1001,)
# an entry of `log_data` called `S_x` of type `netket.stats.Stats`
/data/S_x/iters                 int             (1001,)
/data/S_x/mean                  complex128      (1001,)
/data/S_x/error_of_mean         float64         (1001,)
/data/S_x/variance              float64         (1001,)
/data/S_x/tau_corr              float64         (1001,)
/data/S_x/R_hat                 float64         (1001,)
# an entry of `log_data` called `acceptance` of type `float64`
/data/acceptance/iters          int             (1001,)
/data/acceptance/values         float64         (1001,)

Note that for a normal netket.VMC run, there is a lot of redundancy in the .../iters arrays (as they will all be equal and of the form [0, 1, ..., n_steps - 1]). We accept this overhead both for compatibility with the existing JsonLog and for the added flexibility it provides for custom logging at subsets of steps.
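
As an illustration of that flexibility, per-quantity iters arrays make it possible to align two quantities that were logged at different subsets of steps. The arrays below are made-up data standing in for what would be read from the file.

```python
import numpy as np

# Hypothetical logged data: Energy at every step, S_x only at every other step.
energy_iters = np.arange(6)
energy_mean = np.linspace(-1.0, -2.0, 6)
sx_iters = np.arange(0, 6, 2)
sx_mean = np.array([0.10, 0.20, 0.30])

# Find the steps present in both logs and the matching positions in each array.
common, idx_e, idx_s = np.intersect1d(energy_iters, sx_iters, return_indices=True)
energy_at_common = energy_mean[idx_e]
sx_at_common = sx_mean[idx_s]
```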

maxbortone commented 2 years ago

I've just signed up for UnitaryHack and saw this issue. As it's something I've wanted to implement myself for a while, I already have a basic HDF5 logger working. I would need to clean it up a little, make sure it does what the description above requires, and add some logic so that data gets flushed after a certain number of steps, as is done in the JsonLog. Is it ok if I do this and push a PR once it's ready?

gcarleo commented 2 years ago

that would be great, @PhilipVinc will tell you if you need to register specifically for this bounty and "reserve" it for you

PhilipVinc commented 2 years ago

Hi @maxbortone, yes please pick up the bounty! Let me know if you need any guidance on it.

If you open a PR earlier rather than later it's easier for us to keep an eye on it, but feel free to work on it as you prefer.

maxbortone commented 2 years ago

All right, thanks! I'll prepare a PR today