aiidateam / aiida-wannier90

AiiDA plugin for the Wannier90 code
https://aiida-wannier90.readthedocs.io

Storage of mmn file (and other large files) #101

Closed greschd closed 4 years ago

greschd commented 4 years ago

I find myself needing to store the .mmn file, which can become quite large - and other large files, but that is the main culprit.

A quick test showed that converting to a numpy file (as ArrayData would do) reduces the file size by roughly a factor of 2.
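To make the conversion concrete, here is a minimal sketch of a parser and a matching writer. It assumes the usual .mmn text layout: a comment line, a line with `num_bands num_kpts nntot`, then one block per (k, b) pair made of five integers followed by `num_bands**2` lines of real and imaginary parts, with the first band index running fastest. The names `parse_mmn` and `format_mmn` are hypothetical and not part of aiida-wannier90. Under that layout each complex entry costs 37 bytes as text (two 18-character fields plus a newline) against 16 bytes as `complex128`, which is in line with the factor of roughly 2.

```python
import numpy as np


def parse_mmn(text):
    """Parse text in the assumed .mmn layout into NumPy arrays."""
    lines = text.splitlines()
    # Line 0 is a free-form comment, line 1 holds the three dimensions.
    num_bands, num_kpts, nntot = (int(x) for x in lines[1].split())
    num_blocks = num_kpts * nntot
    block_len = 1 + num_bands**2
    neighbours = np.empty((num_blocks, 5), dtype=np.int64)
    overlaps = np.empty((num_blocks, num_bands, num_bands), dtype=np.complex128)
    for i in range(num_blocks):
        start = 2 + i * block_len
        # Block header: ik, ikb and the three G-vector integers.
        neighbours[i] = [int(x) for x in lines[start].split()]
        pairs = np.array(
            [[float(x) for x in line.split()]
             for line in lines[start + 1:start + block_len]]
        )
        flat = pairs[:, 0] + 1j * pairs[:, 1]
        # First band index runs fastest, i.e. column-major order.
        overlaps[i] = flat.reshape((num_bands, num_bands), order="F")
    return nntot, neighbours, overlaps


def format_mmn(nntot, neighbours, overlaps, comment="written by format_mmn"):
    """Write the arrays back out as text in the same assumed layout."""
    num_blocks, num_bands, _ = overlaps.shape
    out = [comment, f"{num_bands:12d}{num_blocks // nntot:12d}{nntot:12d}"]
    for header, matrix in zip(neighbours, overlaps):
        out.append("".join(f"{int(x):5d}" for x in header))
        for value in matrix.flatten(order="F"):
            out.append(f"{value.real:18.12f}{value.imag:18.12f}")
    return "\n".join(out) + "\n"
```

The two arrays could then be saved with `np.savez` or stored in an ArrayData node, and `format_mmn` would regenerate the text file for the next calculation. The text format only keeps 12 decimal places, so a round trip is exact only to that precision.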

Since the Mmn files are also redundant (M^(k,k+b) is the adjoint of M^(k+b,k)), another factor of two could be saved by clever conversion. Similarly, the _hr.dat is redundant because H[R] is the adjoint of H[-R].
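As a sketch of what such a conversion could look like for the simpler _hr.dat case, the snippet below keeps only one member of each (R, -R) pair and rebuilds the other as the conjugate transpose. It assumes the matrices have already been parsed into a dict keyed by integer lattice vectors; the helper names `keep_half` and `restore_full` are made up for illustration. The .mmn case would work analogously, but the partner of a (k, b) block has to be looked up through the neighbour index and G-vector stored in each block header.

```python
import numpy as np


def keep_half(hr):
    """Keep one representative of every (R, -R) pair.

    `hr` maps lattice vectors (tuples of ints) to square complex matrices.
    """
    kept = {}
    for R, matrix in hr.items():
        minus_R = tuple(-x for x in R)
        # Lexicographic comparison selects one member of each pair;
        # R = (0, 0, 0) is its own partner and is always kept.
        # An R whose partner is missing from the input is kept as well.
        if R >= minus_R or minus_R not in hr:
            kept[R] = matrix
    return kept


def restore_full(kept):
    """Rebuild the dropped entries using H[-R] = H[R]^dagger."""
    full = dict(kept)
    for R, matrix in kept.items():
        minus_R = tuple(-x for x in R)
        if minus_R not in full:
            full[minus_R] = matrix.conj().T
    return full
```

This only saves space if the input really satisfies the adjoint relation to the stored precision, so a check such as `np.allclose(hr[R], hr[minus_R].conj().T)` before dropping an entry would be a sensible safeguard.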

@giovannipizzi what's your opinion on this? Would it be desirable to always parse these files when we retrieve them? Of course that would mean we also need a method to write them to pass to the next calculation.

This all would probably complicate the Wannier90Calculation quite a bit. I could also see this being a worthwhile project for Wannier90 itself - but I definitely don't have enough knowledge of the code to know if this makes sense.

giovannipizzi commented 4 years ago

I have mixed feelings - I fear the risk of introducing bugs is large, for only a small advantage to the user. In particular: if you just zip the file, you most probably get a larger gain. My guess is a factor of 3 or 4 reduction, so in this case the simplest thing would be to just zip the file (or wait for the new AiiDA repository, which could come with zipping ;-) Very happy if you want to participate in the discussions about that!)
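For reference, zipping a retrieved file needs nothing beyond the Python standard library. The following is a minimal sketch, not how aiida-wannier90 or the AiiDA repository handle it; the helper names `gzip_file` and `gunzip_file` are hypothetical. The `compresslevel` argument trades speed for size (1 is fastest, 9 is smallest).

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(source, compresslevel=6):
    """Compress `source` to `<source>.gz` and return the new path."""
    source = Path(source)
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, \
            gzip.open(target, "wb", compresslevel=compresslevel) as dst:
        # Stream in chunks so large files never sit fully in memory.
        shutil.copyfileobj(src, dst)
    return target


def gunzip_file(source):
    """Decompress `<name>.gz` back to `<name>` and return the new path."""
    source = Path(source)
    target = source.with_suffix("")  # drops the trailing ".gz"
    with gzip.open(source, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Since the file is restored byte for byte, the next calculation can consume it unchanged, which avoids the parse-and-rewrite step entirely.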

greschd commented 4 years ago

Yeah, without crazy compression levels (which are slow) that ends up being roughly a factor of 3. The zipped file is significantly bigger (~1.5x) than an HDF5 file that takes advantage of the structure of the matrix, but not egregiously so.

I think I agree on the conclusion (not worth it) - what I do wonder is whether it would make sense to use the symmetry of Mmn in pw2wannier90.x. In large systems, calculating the overlaps can actually take quite a while for me. Not really dominating in terms of the whole workflow, but still...

Anyway, I think I can close this.

giovannipizzi commented 4 years ago

Re symmetry: yes, we are starting to look into that (symmetry in pw2wannier90). There are various things that can be done, but they require a bit of work. Contact me in private if you want to contribute!