ProjectQ-Framework / FermiLib

FermiLib: Open source software for analyzing fermionic quantum simulation algorithms
https://projectq.ch/
Apache License 2.0

Data format update #66

Closed jarrodmcc closed 7 years ago

jarrodmcc commented 7 years ago

Updates the molecular data storage in a few small ways.

  1. All data is now kept in the HDF5 format and loaded on demand, rather than being kept in auxiliary files.
  2. The ordering of the CCSD amplitudes was changed to make more sense.
  3. The data in the data directory can now be generated automatically, and more easily, with the associated plugins.

Tests were updated to accommodate these changes. This change should be accepted together with a change to the plugins that write to this format.

jarrodmcc commented 7 years ago

It seems a particular string comparison is not Python 3 compatible. I'll fix that and add a test to raise the coverage. Also marking this as relevant to #23.

jarrodmcc commented 7 years ago

The difficulty of using strings in HDF5 with Python 3 is apparently fairly well known and irritating. Values are always returned as bytes that must be manually decoded into a text string, using an ASCII-compatible encoding such as UTF-8 (which is a superset of ASCII). I found a workable solution, though the optimal one seems to be a subject of debate. For example, see: https://github.com/PyTables/PyTables/issues/499 https://github.com/h5py/h5py/pull/871 http://docs.h5py.org/en/latest/strings.html
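
To illustrate the kind of issue involved, here is a minimal sketch, not the actual fix in this PR. It assumes a string read back from an HDF5 file arrives as a bytes object under Python 3; the helper name to_text and the sample value are hypothetical, and a plain bytes literal stands in for the loaded value so that no HDF5 library is needed.

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding it first if it is bytes.

    Hypothetical helper for illustration; not part of FermiLib or h5py.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-in (assumption) for a string value loaded from an HDF5 file.
raw = b"example"

# In Python 3, bytes and str never compare equal, so this is False
# even though the characters look identical.
naive_match = raw == "example"

# Decoding first makes the comparison work as intended.
decoded_match = to_text(raw) == "example"
```

Routing every loaded string through one helper like this keeps the decoding in a single place, and values that are already text pass through unchanged.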

damiansteiger commented 7 years ago

Thanks for the update that everything is now saved in HDF5 👍

The string business is a bit annoying.

When originally implementing it, I had similar issues with strings:

The problem was creating data in Python 2.7, writing it to HDF5, and loading it again in Python 3.3+. Therefore, I used from __future__ import unicode_literals (see a discussion here http://python-future.org/unicode_literals.html) to globally change the Python 2 str to unicode, so that we have the same string type in Python 2 and Python 3 in this one file. It is still like this. I hope it won't create any unintended problems...