The QCSchema object from a PySCFDriver object is not JSON serializable, i.e. the to_json() method used on a QCSchema object from a PySCFDriver object results in the error TypeError: Object of type int64 is not JSON serializable.
How can we reproduce the issue?
The following code results in the error TypeError: Object of type int64 is not JSON serializable:
from qiskit_nature.units import DistanceUnit
from qiskit_nature.second_q.drivers import PySCFDriver

driver = PySCFDriver(
    atom="H 0 0 0; H 0 0 0.735",
    basis="sto3g",
    charge=0,
    spin=0,
    unit=DistanceUnit.ANGSTROM,
)
problem = driver.run()
schema = driver.to_qcschema()

# Trying to convert QCSchema to JSON
schema.to_json()
Output:
Traceback (most recent call last):
  File "pyscf_json.py", line 17, in <module>
    schema.to_json()
  File "../qiskit-nature/qiskit_nature/second_q/formats/qcschema/qc_base.py", line 67, in to_json
    return json.dumps(self.to_dict(), indent=2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
          ^^^^^^^^^^^
  File "/usr/lib/python3.12/json/encoder.py", line 202, in encode
    chunks = list(chunks)
             ^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/encoder.py", line 432, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.12/json/encoder.py", line 406, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.12/json/encoder.py", line 406, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.12/json/encoder.py", line 326, in _iterencode_list
    yield from chunks
  File "/usr/lib/python3.12/json/encoder.py", line 439, in _iterencode
    o = _default(o)
        ^^^^^^^^^^^
  File "/usr/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type int64 is not JSON serializable
What should happen?
I would expect the QCSchema object produced by the PySCFDriver to be JSON serializable, so the code above should run without errors.
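Until the driver itself is fixed, one possible user-side stopgap is to serialize the dictionary form of the schema with a custom `default` hook for `json.dumps`. The traceback above shows that `to_json()` calls `json.dumps(self.to_dict(), indent=2)`, so the same call can be made by hand. The sketch below uses a plain dictionary with made-up keys as a stand-in for `schema.to_dict()`; the helper name `numpy_default` is hypothetical and not part of qiskit-nature.

```python
import json

import numpy as np


def numpy_default(obj):
    """Fallback for json.dumps: convert numpy values to native Python types.

    Hypothetical helper, not part of qiskit-nature.
    """
    if isinstance(obj, np.generic):
        # numpy scalar (e.g. numpy.int64) -> int, float, bool, ...
        return obj.item()
    if isinstance(obj, np.ndarray):
        # numpy array -> (nested) list of native Python values
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Stand-in for schema.to_dict(); the keys here are illustrative assumptions.
data = {
    "model": {"basis": "sto3g"},
    "properties": {"nbasis": np.int64(2)},
    "coefficients": np.array([[1.0, 0.0], [0.0, 1.0]]),
}

text = json.dumps(data, indent=2, default=numpy_default)
restored = json.loads(text)
```

With a real schema, the call would presumably read `json.dumps(schema.to_dict(), indent=2, default=numpy_default)`. This only works around the symptom; converting the values inside the driver remains the cleaner fix.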
Any suggestions?
PySCF stores some properties as numpy scalars, e.g. the Mole.nao property, which is extracted in pyscfdriver.py and is of type numpy.int64. These numpy scalars are not converted to native Python types (such as int, float, etc.), neither in electronic_structure_driver.py nor in pyscfdriver.py. As a result, some properties in the QCSchema object remain numpy scalars and are not JSON serializable.
The PySCF property atom_mass_list is also converted to a Python list incorrectly: the conversion yields a list of numpy scalars instead of a list of native Python types. Since atom_mass_list is a numpy ndarray, the tolist method should be used in my opinion.
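The behaviour described above can be reproduced with numpy alone. The following is a minimal sketch, not the actual driver code: `nao` and `masses` are stand-in values, and `to_native` is a hypothetical helper illustrating the kind of conversion the driver could apply.

```python
import json

import numpy as np


def to_native(value):
    """Return a native Python object for numpy scalars and arrays.

    Hypothetical helper; other values are passed through unchanged.
    """
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    return value


# Stand-in for a value such as Mole.nao, which is a numpy.int64.
nao = np.int64(2)

try:
    json.dumps({"nao": nao})
    failed = False
except TypeError:
    # numpy.int64 is not a subclass of int, so the json encoder rejects it.
    failed = True

encoded = json.dumps({"nao": to_native(nao)})

# Stand-in for atom_mass_list: list() keeps numpy scalars, tolist() does not.
masses = np.array([1.008, 1.008])
via_list = list(masses)
via_tolist = masses.tolist()
```

Here `type(via_list[0])` is `numpy.float64`, while `type(via_tolist[0])` is the built-in `float`, which is the difference the suggested `tolist` change relies on.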
Further, it seems reasonable to me to add a unittest testing the to_json and to_hdf5 methods of the PySCFDriver. I am thinking about something like this in test_driver_pyscf.py:
# Assumes the test module imports h5py, pathlib.Path and tempfile.TemporaryDirectory.
def test_to_json(self):
    """Check JSON-serializability of the driver"""
    driver = PySCFDriver(
        atom="H .0 .0 .0; H .0 .0 0.735",
        unit=DistanceUnit.ANGSTROM,
        charge=0,
        spin=0,
        basis="sto3g",
    )
    _driver_result = driver.run()
    schema = driver.to_qcschema()
    # Raises TypeError if the schema still contains numpy scalars
    schema.to_json()

def test_to_hdf5(self):
    """Check HDF5-serializability of the driver"""
    driver = PySCFDriver(
        atom="H .0 .0 .0; H .0 .0 0.735",
        unit=DistanceUnit.ANGSTROM,
        charge=0,
        spin=0,
        basis="sto3g",
    )
    _driver_result = driver.run()
    schema = driver.to_qcschema()
    # Write into a temporary file that is cleaned up after the test
    with TemporaryDirectory() as tmp_dir:
        file_path = Path(tmp_dir) / "tmp.hdf5"
        with h5py.File(file_path, "w") as file:
            schema.to_hdf5(file)
Please tell me your opinion on the suggested changes and I can prepare a pull request to resolve this issue, if you wish.
Concretely, I suggest changes to the files electronic_structure_driver.py and pyscfdriver.py at the following locations: pyscfdriver.py line 558; electronic_structure_driver.py line 227 onward; electronic_structure_driver.py line 335.