lettucecfd / lettuce

Computational Fluid Dynamics based on PyTorch and the Lattice Boltzmann Method
MIT License

HDF5 Writer limits grid resolution #119

Closed · robert-DL closed this issue 3 months ago

robert-DL commented 2 years ago

Hi,

the current implementation of the HDF5 writer saves the pickled flow object as metadata. However, this limits the grid size, since an HDF5 attribute should not exceed 64 KB (if I remember correctly). For example, at large resolutions of the obstacle flow, the mask pushes the metadata past this limit (if my assumption is correct). A quick fix would be to save the mask into a separate dataset and set it to None before pickling the flow, as sketched below. For that, the assertion test has to be removed from the mask setter. :)
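A minimal sketch of that workaround, assuming an h5py-based writer and that `flow.mask` is a NumPy boolean array (the function and its internals are illustrative, not lettuce's actual code):

```python
import pickle

import h5py
import numpy as np


def save_flow(filename, flow):
    with h5py.File(filename, "w") as f:
        # Store the potentially huge boolean mask as a regular dataset ...
        f.create_dataset("mask", data=np.asarray(flow.mask))
        # ... and pickle the flow without it, keeping the attribute
        # below the ~64 KB HDF5 attribute limit.
        mask, flow.mask = flow.mask, None  # needs the setter assertion removed
        try:
            f.attrs["flow"] = np.void(pickle.dumps(flow))
        finally:
            flow.mask = mask  # restore the in-memory object
```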

Best, Robert

Olllom commented 2 years ago

Oh no, my bad! Thanks for pointing this out. As a short-term fix, let's make the saving and loading of the flow optional and disable it by default.

As a long-term solution, let's think about how to define a proper (de-)serialization of flows and put it in a separate class.
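One possible shape for such a class, as a rough sketch (the `FlowSerializer` name and the assumption that a flow's state lives in its `__dict__` are mine, not lettuce's):

```python
import pickle

import h5py
import numpy as np


class FlowSerializer:
    """Round-trips a flow through HDF5: large arrays become datasets,
    the remaining (small) state is pickled into an attribute."""

    def save(self, filename, flow):
        with h5py.File(filename, "w") as f:
            f.create_dataset("mask", data=np.asarray(flow.mask))
            small_state = {k: v for k, v in vars(flow).items() if k != "mask"}
            f.attrs["state"] = np.void(pickle.dumps(small_state))

    def load(self, filename, flow_cls):
        with h5py.File(filename, "r") as f:
            flow = flow_cls.__new__(flow_cls)  # bypass __init__
            vars(flow).update(pickle.loads(f.attrs["state"].tobytes()))
            flow.mask = f["mask"][...].astype(bool)
            return flow
```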

robert-DL commented 2 years ago

I think it would be convenient to store the mask by saving only the indices where it is True. Since the shape of the data is known, one can easily recreate the mask. This reduces the overhead a lot, and one does not need a switch attribute like `save_mask=True`. Also, it is just a minor modification: `flow.mask = np.nonzero(flow.mask)`.
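To illustrate the round trip (a standalone NumPy sketch, independent of lettuce):

```python
import numpy as np

mask = np.zeros((2048, 2048), dtype=bool)
mask[100:200, 300:400] = True          # e.g. an obstacle region

indices = np.nonzero(mask)             # tuple of index arrays, one per axis
# store `indices` (and mask.shape) as plain datasets instead of the full array

restored = np.zeros(mask.shape, dtype=bool)
restored[indices] = True               # rebuild the mask losslessly
assert np.array_equal(mask, restored)
```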

EDIT: I would not store any object in the HDF5 file. Instead, I would write only metadata and datasets. The stencil, streaming, etc. can be referenced by name. The stencil velocities can be stored as metadata, as can the weights and cs. The same goes for simulation parameters such as the Reynolds number (although this information could also be dumped to a text file). One can recover the simulation setup completely from this information without storing the class. But this is probably a matter of taste.
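A sketch of that layout, with illustrative attribute names and the standard D2Q9 constants (this is not an existing lettuce file format):

```python
import h5py
import numpy as np

with h5py.File("checkpoint.h5", "w") as f:
    # components recoverable by name
    f.attrs["stencil"] = "D2Q9"
    f.attrs["streaming"] = "StandardStreaming"
    # stencil constants stored as plain metadata
    f.attrs["weights"] = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)
    f.attrs["cs"] = 1.0 / np.sqrt(3.0)
    # simulation parameters
    f.attrs["reynolds_number"] = 1600.0
    # large arrays (populations, mask indices, ...) go into datasets,
    # not attributes, so the 64 KB limit never applies
```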

PhiSpel commented 4 months ago

@Olllom this refers to the checkpoint, right? Is it necessary to use pickle? If I understand correctly, we do not dump the whole Simulation object (where pickle may make sense), so we may as well use `torch.save(f, filename)` and `torch.load(filename, map_location=lattice.device)`, casting the result to `lattice.dtype`. Would this solve the issue?
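For example, a checkpoint along those lines might look like this (a sketch; treating the distribution function `f` as a plain tensor is an assumption about the flow state):

```python
import torch


def write_checkpoint(filename, f):
    # torch.save pickles the tensor under the hood
    torch.save(f, filename)


def read_checkpoint(filename, lattice):
    # map_location restores onto the target device; cast to the lattice dtype
    f = torch.load(filename, map_location=lattice.device)
    return f.to(dtype=lattice.dtype)
```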

Olllom commented 4 months ago

Yes. The serialization method discussed here was definitely a bad idea.

Writing checkpoints through `torch.save` makes sense (it uses pickle under the hood).