MolSSI / QCFractal

A distributed compute and database platform for quantum chemistry.
https://molssi.github.io/QCFractal/
BSD 3-Clause "New" or "Revised" License
148 stars 48 forks source link

Dataset: fix bug revealed by pandas 1.0 #561

Closed mattwelborn closed 4 years ago

mattwelborn commented 4 years ago

Description

CI is failing. This seems to be caused by df = df.groupby(["name"])["return_result"].sum(skipna=True). It appears that skipna was never a valid kwarg of groupby().sum(), but that it was not checked until pandas 1.0. This PR removes skipna=True; the default behavior is to skip NaNs already.

Changelog description

Fixed a bug that caused errors with pandas v1.0.

Status

dgasmith commented 4 years ago

Can we bump minimum versions?

mattwelborn commented 4 years ago

Can we bump minimum versions?

Of pandas? This PR should work with pre-1.0 pandas just fine. skipna was silently ignored in pre-1.0 pandas, so the behavior is the same.

mattwelborn commented 4 years ago

@dgasmith Any idea what might be going on here:

_____________________________ test_mol_pagination ______________________________
storage_socket = <qcfractal.storage_sockets.sqlalchemy_socket.SQLAlchemySocket object at 0x7f5a1daab518>
    def test_mol_pagination(storage_socket):
        """
            Test Molecule pagination
        """

        assert len(storage_socket.get_molecules()["data"]) == 0
        mol_names = [
            "water_dimer_minima.psimol",
            "water_dimer_stretch.psimol",
            "water_dimer_stretch2.psimol",
            "neon_tetramer.psimol",
        ]

        total = len(mol_names)
        molecules = []
        for mol_name in mol_names:
            mol = ptl.data.get_molecule(mol_name)
            molecules.append(mol)

        inserted = storage_socket.add_molecules(molecules)

        assert inserted["meta"]["n_inserted"] == total

>       ret = storage_socket.get_molecules(skip=1)
qcfractal/tests/test_storage.py:1198: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
qcfractal/storage_sockets/sqlalchemy_socket.py:732: in get_molecules
    MoleculeORM, query, limit=limit, skip=skip, exclude=["molecule_hash", "molecular_formula"]
qcfractal/storage_sockets/sqlalchemy_socket.py:382: in get_query_projection
    rdata = [dict(zip(_projection, row)) for row in data]
qcfractal/storage_sockets/sqlalchemy_socket.py:382: in <listcomp>
    rdata = [dict(zip(_projection, row)) for row in data]
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:101: in instances
    util.raise_from_cause(err)
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/util/compat.py:398: in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/util/compat.py:153: in reraise
    raise value
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:85: in instances
    for row in fetch
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:85: in <listcomp>
    for row in fetch
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/orm/loading.py:84: in <listcomp>
    keyed_tuple([proc(row) for proc in process])
../../../miniconda/envs/test/lib/python3.6/site-packages/sqlalchemy/sql/type_api.py:1266: in process
    return process_value(impl_processor(value), dialect)
qcfractal/storage_sockets/models/sql_base.py:24: in process_result_value
    return msgpackext_loads(value)
../../../miniconda/envs/test/lib/python3.6/site-packages/qcelemental/util/serialization.py:113: in msgpackext_loads
    return msgpack.loads(data, object_hook=msgpackext_decode, raw=False)
msgpack/_unpacker.pyx:184: in msgpack._cmsgpack.unpackb
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
>   ???
E   TypeError: a bytes-like object is required, not 'NoneType'
msgpack/_unpacker.pyx:135: TypeError
dgasmith commented 4 years ago

@mattwelborn Yea, sparse molecule bug.

@doaa-altarawy What do you think of the fix?

mattwelborn commented 4 years ago

@dgasmith @doaa-altarawy I think we're back to database bugs now.

codecov[bot] commented 4 years ago

Codecov Report

Merging #561 into master will decrease coverage by 3.16%. The diff coverage is 94.11%.

dgasmith commented 4 years ago

@mattwelborn Can you reproduce the single failure? I cannot seem to manage it locally.