DedalusProject / dedalus

A flexible framework for solving PDEs with modern spectral methods.
http://dedalus-project.org/
GNU General Public License v3.0

[d3] Merging virtual files into single files #182

Closed evanhanders closed 2 years ago

evanhanders commented 2 years ago

Virtual files (and the partial files they read) are starting to cause me issues with my filesystem quotas on Pleiades. I've created a little script based on d2's merging logic that merges a virtual file into a single, merged dataset (attached here, in .txt format: merge_virtual_files.py.txt).

If there's interest, I can improve this and work towards a pull request to put this into d3's tools/post.py file.
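For readers who have not seen this kind of merge, here is a minimal sketch of the general idea using h5py. It is not the attached script and not d2's merging logic: the function name `merge_virtual_file` is hypothetical, and the sketch assumes the merged data fits in memory one dataset at a time. It also skips the dimension-scale bookkeeping (the `DIMENSION_LIST` and `REFERENCE_LIST` attributes), which is likely where much of the real boilerplate would come from.

```python
import h5py

# HDF5 dimension-scale attributes hold object references into the source
# file, so this sketch skips them instead of copying them blindly.
SKIPPED_ATTRS = ("DIMENSION_LIST", "REFERENCE_LIST")


def merge_virtual_file(virtual_path, merged_path):
    """Copy all groups and datasets into a new file with real (non-virtual) data.

    Hypothetical helper: assumes each dataset fits in memory and that
    dimension scales do not need to be re-attached in the output.
    """
    with h5py.File(virtual_path, "r") as src, h5py.File(merged_path, "w") as dst:
        # Copy attributes stored on the root group.
        for key, value in src.attrs.items():
            dst.attrs[key] = value

        def copy_item(name, obj):
            if isinstance(obj, h5py.Group):
                new = dst.require_group(name)
            else:
                # Reading the full dataset resolves any virtual mappings,
                # so the output holds an ordinary contiguous copy.
                new = dst.create_dataset(name, data=obj[()])
            for key, value in obj.attrs.items():
                if key not in SKIPPED_ATTRS:
                    new.attrs[key] = value

        # Walk every group and dataset below the root.
        src.visititems(copy_item)
```

Calling it would look something like `merge_virtual_file("virtual.h5", "merged.h5")`, where both paths are placeholders. Once the merged file is verified, the virtual file and the partial files it reads should no longer be needed, which is what would relieve the file-count quota.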

kburns commented 2 years ago

Yeah, it would definitely be nice to have some flavor of this. It's just a shame that it (and the current FileHandler in general) requires so much boilerplate; as far as I've seen, there's no simpler copy mechanism that can merge together virtual datasets. I hate merging in general, so it would be really nice to get parallel HDF5 or another parallel-write system like zarr working, maybe as separate FileHandler subclasses.

evanhanders commented 2 years ago

I agree, it would be great to have parallel file writes (and if I understood what the issues with getting that working have been, I would probably be willing to lead the effort on implementing that over the next few months).

But in the short term, I guess I'm wondering whether I should clean this up a bit for integration into d3? I think a bit of the boilerplate can be removed or streamlined. Otherwise, I can just post this .py script as a tool to the user group for people who are interested. I'm just trying to figure out the right way to move forward and recover this functionality for people who need a tool like this.

kburns commented 2 years ago

Yeah it would be great to have a PR for adding this, thanks!

evanhanders commented 2 years ago

Sounds good! I'll work on getting a preliminary PR together this week or so.