Closed: greschd closed this 4 years ago
I have mixed feelings: I fear that the risk of introducing bugs is large compared with the small advantage for the user. In particular, if you just zip the file, most probably you get a larger gain? My guess is a reduction by a factor of 3 or 4, so in this case the simplest thing would be to just zip the file (or wait for the new AiiDA repository, which could come with zipping ;-) Very happy if you want to participate in the discussions about that!)
Yeah, without crazy compression levels (which are slow) that ends up being roughly a factor of 3. The zipped file is significantly bigger (~1.5x) than an HDF5 file that takes advantage of the structure of the matrix, but not egregiously so.
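A minimal sketch of this kind of size comparison, using only the Python standard library. The data here is synthetic (random numbers, not a real .mmn file), and the fixed-width text layout is an assumption chosen to resemble a text file of complex matrix elements. Real overlap matrices may compress better or worse than random data, so the ratios are only indicative.

```python
import random
import struct
import zlib

random.seed(0)  # reproducible synthetic data, NOT a real .mmn file

# Assumption: each complex matrix element is written as two fixed-width
# decimals (real and imaginary part) on one line of text.
values = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(20000)]
text = "".join(f"{real:18.12f}{imag:18.12f}\n" for real, imag in values).encode("ascii")

# Moderate compression level, i.e. no slow "crazy" settings.
compressed = zlib.compress(text, 6)

# Plain binary storage: two little-endian doubles (16 bytes) per element.
binary = b"".join(struct.pack("<dd", real, imag) for real, imag in values)

text_to_zip = len(text) / len(compressed)
text_to_binary = len(text) / len(binary)
print(f"text: {len(text)} B, zlib: {len(compressed)} B, binary: {len(binary)} B")
print(f"text/zlib = {text_to_zip:.2f}, text/binary = {text_to_binary:.2f}")
```

With this layout, each text line takes 37 bytes against 16 bytes in binary, which is where a gain of roughly a factor of 2 from binary conversion comes from. The compression ratio depends on the actual data.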
I think I agree with the conclusion (not worth it) - what I do wonder is whether it would make sense to use the symmetry of Mmn in pw2wannier90.x. In large systems, calculating the overlaps can actually take quite a while for me. Not really dominating in terms of the whole workflow, but still...
Anyway, I think I can close this.
Re symmetry: yes, we are starting to look into that (symmetry in pw2wannier90). There are various things that can be done, they require a bit of work though. Contact me in private if you want to contribute!
I find myself needing to store the .mmn file, which can become quite large - and other large files, but that is the main culprit.
A quick test showed that converting it to a numpy file (as ArrayData would do) reduces the file size by roughly a factor of 2.
Since the Mmn files are also redundant (M^(k,k+b) is the adjoint of M^(k+b,k)), another factor of two could be saved by clever conversion. Similarly, the _hr.dat file is redundant because H[R] is the adjoint of H[-R].
@giovannipizzi what's your opinion on this? Would it be desirable to always parse these files when we retrieve them? Of course that would mean we also need a method to write them to pass to the next calculation.
This all would probably complicate the Wannier90Calculation quite a bit. I could also see this being a worthwhile project for Wannier90 itself - but I definitely don't have enough knowledge of the code to know if this makes sense.