Here's a first pass at a script for submitting datasets (not yet tested).
from qcportal import PortalClient
from qcportal.singlepoint import QCSpecification, SinglepointDatasetNewEntry
from qcportal.molecules import Molecule
from openmm.unit import nanometer, bohr
import openff.toolkit
import openff.units
import numpy as np
import h5py
import sys

dataset_name = sys.argv[1]
filename = sys.argv[2]
input_file = h5py.File(filename)
client = PortalClient.from_file()

# Conformations are stored in nanometers; QCArchive wants bohr.
scale = (1*nanometer).value_in_unit(bohr)
keywords = {'maxiter': 200,
            'scf_properties': ['dipole', 'quadrupole', 'wiberg lowdin indices', 'mayer indices', 'mbis charges', 'mbis dipoles', 'mbis quadrupoles', 'mbis octupoles'],
            'wcombine': False}
spec = QCSpecification(program='psi4', driver='gradient', method='wb97m-d3bj', basis='def2-tzvppd', keywords=keywords)
dataset = client.add_dataset('singlepoint', dataset_name)
dataset.add_specification('wb97m-d3bj/def2-tzvppd', spec)

# Each HDF5 group holds one molecule: a mapped SMILES plus its conformations.
for group in input_file:
    smiles = input_file[group]['smiles'].asstr()[0]
    conformations = np.array(input_file[group]['conformations'])*scale
    ffmol = openff.toolkit.topology.Molecule.from_mapped_smiles(smiles, allow_undefined_stereo=True)
    symbols = [atom.symbol for atom in ffmol.atoms]
    total_charge = sum(atom.formal_charge/openff.units.unit.elementary_charge for atom in ffmol.atoms)
    for conformation in conformations:
        molecule = Molecule(symbols=symbols, geometry=conformation.flatten(), molecular_charge=total_charge, canonical_isomeric_explicit_hydrogen_mapped_smiles=smiles)
        dataset.add_entry(group, molecule)

dataset.submit(tag='spice-psi4-181')
Does that look generally correct? A first test will be to resubmit the DES monomers dataset with it. If everything is working correctly, QCArchive should recognize that all the samples are identical to ones that already exist and skip calculating anything. Assuming that works, the next test would be to force it to recompute them by adding a small offset to the positions and see if the results match what we got before.
I edited the above script with a few changes.
Checking this out by analogy to `OFFMol.to_qcschema`, it looks like the two things this doesn't have are:

- `connectivity`, which is probably not important (I think it was an early attempt to capture what ended up being CMILES, but which didn't fly because it didn't have formal charge and we weren't sure if it should hold other things like stereo and aromaticity)
- `multiplicity`, which you probably need

For how we set multiplicity, I'm not totally sure since I'm not a QM person. One hint is that the multiplicity set by `OFFMol.to_qcschema` is 1. So I think for "reasonable" molecules we're always treating multiplicity as 1; for radicals other values may be necessary.

I think the CMILES may need to go somewhere else (maybe there's now an actual "cmiles" attribute, but I vaguely recall mention of a `properties` dict or an `identifiers` attribute).
To test this without risking problems on the central QCArchive, I'd recommend spinning up a QCFractal "snowflake" (a mini server living in a local process) with a really cheap psi4 method, and then making sure that the outputs look reasonable (especially that the CMILES makes it through and the atom ordering seems sane).
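As a rough illustration, a minimal local test might look like the sketch below (hedged: the dataset and spec names are placeholders, and it assumes `FractalSnowflake.await_results` is available as in QCFractal's testing utilities):

from qcfractal.snowflake import FractalSnowflake
from qcportal.molecules import Molecule

# Spin up a throwaway server plus compute worker inside this process.
snowflake = FractalSnowflake()
client = snowflake.client()

# One trivial molecule carrying a CMILES, and a deliberately cheap spec.
mol = Molecule(symbols=['H', 'H'], geometry=[0, 0, 0, 0, 0, 1.4],
               identifiers={'canonical_isomeric_explicit_hydrogen_mapped_smiles': '[H:1][H:2]'})
ds = client.add_dataset('singlepoint', 'snowflake test')
ds.add_entry('h2', molecule=mol)
ds.add_specification('cheap', {'program': 'psi4', 'driver': 'energy',
                               'method': 'hf', 'basis': 'sto-3g'})
ds.submit()
snowflake.await_results()  # block until the local worker finishes

rec = ds.get_record('h2', 'cheap')
print(rec.status, rec.molecule.identifiers)  # check the CMILES made it through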
I've tried a few different methods of getting it to store the SMILES, including `identifiers = Identifiers(canonical_isomeric_explicit_hydrogen_mapped_smiles=smiles)` and `extras = {'canonical_isomeric_explicit_hydrogen_mapped_smiles': smiles}`. Neither one works. The property gets set on the Molecule object before I call `add_entry()`. But when I query the dataset for either entries or records, it's not there anymore.
Using a Snowflake I can create the dataset, but no calculations get run. The status of the record just stays as 'waiting'. Do I need to do something else to make it run calculations?
@bennybp do you have any idea what I'm doing wrong that's causing the above problems?
In QCSubmit's tests, we use the `fulltest_client` testing fixture from QCFractal, which makes a snowflake with some local compute workers. That fixture is distributed in the `qcarchivetesting` conda package. You may be able to use it verbatim to get a snowflake that can run a few local calculations.
I haven't looked too deeply, but my hunch is that the molecule already exists on the server without the identifiers, and `add_entry` won't overwrite that. You can test this by translating the molecule a little and seeing what happens.
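A hypothetical version of that check (standalone numbers here, purely for illustration): shift every coordinate slightly so the molecule hashes differently and gets stored fresh.

from qcportal.molecules import Molecule
import numpy as np

# Nudge all coordinates by 0.01 bohr; the hash changes, so the server
# should store this as a brand-new molecule, identifiers included.
conformation = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])
shifted = Molecule(symbols=['H', 'H'],
                   geometry=(conformation + 0.01).flatten(),
                   identifiers={'canonical_isomeric_explicit_hydrogen_mapped_smiles': '[H:1][H:2]'})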
If that is the case, you can keep the old molecule and change the identifiers afterwards with the PortalClient. (https://github.com/MolSSI/QCFractal/blob/e8d9cba50b1e59bf8ff85992cd9dc8f94158fe1b/qcportal/qcportal/client.py#L473).
For the snowflake issue, you might not have the QM package installed in the backend. Something like this conda env should work (assuming you are using psi4): https://github.com/MolSSI/QCFractal/blob/main/qcarchivetesting/conda-envs/fulltest_snowflake.yaml
As far as I can tell all the dependencies are there. Is there a way to get an error message saying why it isn't running anything?
my hunch is that the molecule already exists on the server without the identifiers
This is with a snowflake. There's nothing on the server.
Here's an error message:
Process ForkProcess-2:
Traceback (most recent call last):
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractal/snowflake.py", line 60, in _compute_process
compute = ComputeManager(compute_config)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/compute_manager.py", line 133, in __init__
self.app_manager = AppManager(self.manager_config)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/apps/app_manager.py", line 108, in __init__
qcengine_functions = discover_programs_conda(None)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/apps/app_manager.py", line 34, in discover_programs_conda
result = subprocess.check_output(cmd, universal_newlines=True, cwd=tmpdir)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python3', '/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/run_scripts/qcengine_list.py']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/run_scripts/qcengine_list.py", line 12, in <module>
progs = {x: qcengine.get_program(x).get_version() for x in qcengine.list_available_programs()}
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcfractalcompute/run_scripts/qcengine_list.py", line 12, in <dictcomp>
progs = {x: qcengine.get_program(x).get_version() for x in qcengine.list_available_programs()}
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcengine/programs/psi4.py", line 91, in get_version
self.version_cache[which_prog] = safe_version(exc["stdout"].split()[-1])
IndexError: list index out of range
The error happens while querying psi4. The code
for x in qcengine.list_available_programs():
    print(x)
    print(qcengine.get_program(x).get_version())
prints
rdkit
2023.3.1
xtb
20.2
openmm
8.0.0
psi4
Here's the psi4 version in my conda environment:
psi4 1.8.1 py39haabd4ea_2 conda-forge
what is `psi4 --version`, please? could there be multiple psi4's around? and check if qcengine is at 0.28.1
I think the conda package for psi4 on Mac is broken. Just running `psi4` produces a Python exception. I tried a different computer running Linux, and there it's able to correctly enumerate the programs.
Running on that computer, the status is briefly listed as `running` and `top` shows that psi4 is running. Then it stops and the status switches to `error`. How can I find out what the error was? The documentation says, "The error and possibly the stdout/stderr properties may have more details about the error." But the record object has no attribute called `error`, `stdout`, or `stderr`.
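For reference, the pattern the documentation seems to describe would be something like this sketch (property names taken from the quoted docs, not verified against this qcportal version, which is exactly the problem being reported):

# Sketch only: `error` and `stdout` are the documented property names;
# they may differ or be absent in a given qcportal release.
rec = ds.get_record('test_entry', 'test_spec')  # ds: any singlepoint dataset
if rec.status == 'error':
    print(rec.error)   # per the docs, details of the failure
    print(rec.stdout)  # program output, if it was captured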
What is the correct way to store the SMILES? When I create the Molecule I specify `extras = {'canonical_isomeric_explicit_hydrogen_mapped_smiles': smiles}`. If I then immediately print out `molecule.extras` it prints

{'canonical_isomeric_explicit_hydrogen_mapped_smiles': '[Br:1][C:4]([Br:2])([Br:3])[H:5]'}

But when I query the record from the dataset, it lists `extras=None`.
and check if qcengine is at 0.28.1
It's 0.27.0. Conda doesn't find anything newer. It's because psi4 pins the version. If I try to force `qcengine=0.28.1` I get:
Encountered problems while solving:
- package psi4-1.8.1-py311hedf2024_2 requires qcengine >=0.27.0,<0.28.0a0, but none of the providers can be installed
As to why a record is waiting, I was inspired to make that a feature (which isn't available yet, but will be a nice addition: https://github.com/MolSSI/QCFractal/pull/759)
For the identifiers part: It seems to work for me. The identifiers are attached to the molecule:
from qcfractal.snowflake import FractalSnowflake
from qcportal.molecules import Molecule

s = FractalSnowflake()
c = s.client()

m = Molecule(symbols=['h', 'h'],
             geometry=[0, 0, 0, 0, 0, 1],
             identifiers={'canonical_isomeric_explicit_hydrogen_mapped_smiles': "abc123"})

ds = c.add_dataset('singlepoint', 'test dataset')
ds.add_entry('test_entry', molecule=m)

# Re-get the dataset
ds = c.get_dataset('singlepoint', 'test dataset')
entry = ds.get_entry('test_entry')
print(entry.molecule.identifiers)

ds.add_specification('test_spec', {'program': 'psi4', 'driver': 'energy', 'method': 'b3lyp', 'basis': '6-31g'})
ds.submit()

rec = ds.get_record('test_entry', 'test_spec')
print(rec.molecule.identifiers)
The value for `canonical_isomeric_explicit_hydrogen_mapped_smiles` appears in both. You could add them to the `entry.attributes` as well, but I think the `molecule` makes sense.
Perhaps I was making things too complicated by trying to create an Identifiers object. I take it the correct usage is just to pass a dict?
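If so, a minimal sketch of the dict form (throwaway values, assuming pydantic coerces the dict into the Identifiers model):

from qcportal.molecules import Molecule

# Pass a plain dict; it should be coerced into the Identifiers model.
m = Molecule(symbols=['H', 'H'],
             geometry=[0, 0, 0, 0, 0, 1],
             identifiers={'canonical_isomeric_explicit_hydrogen_mapped_smiles': 'abc123'})
print(m.identifiers.canonical_isomeric_explicit_hydrogen_mapped_smiles)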
I want to create the new datasets in a way that's consistent with the existing ones. I'm looking at the existing 'SPICE DES Monomers Single Points Dataset v1.1' dataset. For the molecules in that dataset, the canonical SMILES is not present in `identifiers`:

Identifiers(molecule_hash='e11edc2979035fc70f58366ce13b6bb707adaf18', molecular_formula='C2H3N', smiles=None, inchi=None, inchikey=None, canonical_explicit_hydrogen_smiles=None, canonical_isomeric_explicit_hydrogen_mapped_smiles=None, canonical_isomeric_explicit_hydrogen_smiles=None, canonical_isomeric_smiles=None, canonical_smiles=None, pubchem_cid=None, pubchem_sid=None, pubchem_conformerid=None)

Instead it's in `extras`:

{'canonical_isomeric_explicit_hydrogen_mapped_smiles': '[H:4][C:3]([H:5])([H:6])[C:2]#[N:1]'}

For consistency I think we should continue to put it in `extras`. But presumably we should also put it in `identifiers`, since that's now the recommended place for it?
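Concretely, that would mean building each molecule with the same dict in both fields, along these lines (a hypothetical H2 example; the thread above only confirms the `identifiers` route round-trips):

from qcportal.molecules import Molecule

# Store the CMILES in both places: extras for consistency with the v1
# datasets, identifiers as the now-recommended location.
smiles = '[H:1][H:2]'
cmiles = {'canonical_isomeric_explicit_hydrogen_mapped_smiles': smiles}
molecule = Molecule(symbols=['H', 'H'],
                    geometry=[0, 0, 0, 0, 0, 1.4],
                    molecular_charge=0,
                    identifiers=cmiles,
                    extras=cmiles)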
I'm not clear on how to connect up records and entries. I try to loop over everything in the dataset like this:
for s in ds.specification_names:
    for e in ds.iterate_entries():
        print(s, e.name)
        print(ds.get_record(e.name, s))
But `get_record()` never finds anything:
spec_1 cc#n-3
None
spec_1 cnc-2
None
spec_1 nccco-16
None
Am I not calling it correctly? The API documentation says the first argument is the entry name and the second is the specification name.
It's because psi4 pins the version
of course it does: bad, self. I'll step up the v1.8.2 release, which relaxes the pin. I haven't heard other reports of the Mac psi4 being broken, though.
For consistency I think we should continue to put it in `extras`. But presumably we should also put it in `identifiers`, since that's now the recommended place for it?

Either is OK, and can be added/modified later (although right now `extras` on molecules are not modifiable; it is on my to-do list).
Am I not calling it correctly? The API documentation says the first argument is the entry name and the second is the specification name.
You are calling it correctly, but something is up with that dataset. There are no calculations submitted for `spec_1`, only for `spec_2`, `spec_4`, and `spec_6`.
r = ds.get_record('cc#n-3', 'spec_2')
print(r.id, r.status)
111567635 RecordStatusEnum.complete
This must be something to do with how specifications were created when the data was converted to the new format? There are actually six specifications. Here are their descriptions:
name='spec_1' specification=QCSpecification(program='psi4', driver=<SinglepointDriver.gradient: 'gradient'>, method='b3lyp', basis='dzvp', keywords={'maxiter': 200, 'scf_properties': ['dipole', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices', 'mbis_charges']}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.orbitals_and_eigenvalues: 'orbitals_and_eigenvalues'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
name='spec_2' specification=QCSpecification(program='psi4', driver=<SinglepointDriver.gradient: 'gradient'>, method='b3lyp', basis='dzvp', keywords={'maxiter': 200, 'scf_properties': ['dipole', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices', 'mbis_charges']}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.none: 'none'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
name='spec_3' specification=QCSpecification(program='psi4', driver=<SinglepointDriver.gradient: 'gradient'>, method='wb97m-d3bj', basis='def2-tzvppd', keywords={'maxiter': 200, 'wcombine': False, 'scf_properties': ['dipole', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices', 'mbis_charges']}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.none: 'none'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
name='spec_4' specification=QCSpecification(program='psi4', driver=<SinglepointDriver.gradient: 'gradient'>, method='wb97m-d3bj', basis='def2-tzvppd', keywords={'maxiter': 200, 'wcombine': False, 'scf_properties': ['dipole', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices', 'mbis_charges']}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.orbitals_and_eigenvalues: 'orbitals_and_eigenvalues'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
name='spec_5' specification=QCSpecification(program='dftd3', driver=<SinglepointDriver.gradient: 'gradient'>, method='b3lyp-d3bj', basis=None, keywords={}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.orbitals_and_eigenvalues: 'orbitals_and_eigenvalues'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
name='spec_6' specification=QCSpecification(program='dftd3', driver=<SinglepointDriver.gradient: 'gradient'>, method='b3lyp-d3bj', basis=None, keywords={}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.none: 'none'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)) description=''
They come in pairs that are identical except for the value of `wavefunction`. One (which has no records) has it set to `orbitals_and_eigenvalues`, while the other (which has records for all samples) has it set to `none`.
`spec_2` and `spec_4` are the OpenFF and SPICE levels of theory, respectively. Those were the only two specifications I expected to be present. I have no idea where `spec_6` came from. It has `program='dftd3'`???
@loriab here is the exception I get when running psi4 on the Mac. I assume it's unrelated to the problem I'm seeing on Linux, where the record status is reported as `error` with no error message.
Traceback (most recent call last):
File "/Users/peastman/miniconda3/envs/qcportal/bin/psi4", line 213, in <module>
import psi4 # isort:skip
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/psi4/__init__.py", line 90, in <module>
from .driver import endorsed_plugins
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/psi4/driver/__init__.py", line 56, in <module>
from psi4.driver import gaussian_n
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/psi4/driver/gaussian_n.py", line 31, in <module>
from psi4.driver import driver
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/psi4/driver/driver.py", line 49, in <module>
from psi4.driver import driver_nbody
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/psi4/driver/driver_nbody.py", line 830, in <module>
class ManyBodyComputer(BaseComputer):
File "pydantic/main.py", line 204, in pydantic.main.ModelMetaclass.__new__
File "pydantic/fields.py", line 488, in pydantic.fields.ModelField.infer
File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 728, in pydantic.fields.ModelField._type_analysis
File "pydantic/fields.py", line 778, in pydantic.fields.ModelField._create_sub_type
File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 728, in pydantic.fields.ModelField._type_analysis
File "pydantic/fields.py", line 778, in pydantic.fields.ModelField._create_sub_type
File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 633, in pydantic.fields.ModelField._type_analysis
File "pydantic/fields.py", line 778, in pydantic.fields.ModelField._create_sub_type
File "pydantic/fields.py", line 419, in pydantic.fields.ModelField.__init__
File "pydantic/fields.py", line 534, in pydantic.fields.ModelField.prepare
File "pydantic/fields.py", line 638, in pydantic.fields.ModelField._type_analysis
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/typing.py", line 851, in __subclasscheck__
return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class
This must be something to do with how specifications were created when the data was converted to the new format? There are actually six specifications. [...] They come in pairs that are identical except for the value of `wavefunction`. One (which has no records) has it set to `orbitals_and_eigenvalues`, while the other (which has records for all samples) has it set to `none`.
@peastman it's not clear to me where you're seeing this. Can you show us what you are running that gives these specs?
I recall that when we first ran SPICE, we used specs that preserved wavefunctions (`orbitals_and_eigenvalues`), and this ended up resulting in a massive amount of data filling up the old server. We later chose to run the same specs without preserving wavefunctions (`none`), and those calculations we kept.
I generated that output with this code:
client = PortalClient('https://ml.qcarchive.molssi.org')
ds = client.get_dataset('singlepoint', 'SPICE DES Monomers Single Points Dataset v1.1')
for s in ds.specifications:
    print(ds.specifications[s])
    print()
@bennybp I have my script working when using a snowflake. Now I'm trying to test it on the real server by resubmitting the DES monomers dataset. That's one of the smaller ones: 374 molecules, all of them very small, 18,700 conformations total. It initially seems to be working, but creating the entries is really slow, about five per second. At that rate, the larger datasets would take over a day to submit. But after the first 100 molecules (about 20 minutes), it crashes with an exception. I tried twice and got the same exception both times.
Traceback (most recent call last):
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/http/client.py", line 1377, in getresponse
response.begin()
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/http/client.py", line 320, in begin
version, status, reason = self._read_status()
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/http/client.py", line 281, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/requests/adapters.py", line 440, in send
resp = conn.urlopen(
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 785, in urlopen
retries = retries.increment(
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/util/retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 451, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/urllib3/connectionpool.py", line 340, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='ml.qcarchive.molssi.org', port=443): Read timed out. (read timeout=60)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/peastman/workspace/spice-dataset/submission/submit.py", line 33, in <module>
dataset.add_entry(f'{group}-{i}', molecule)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/singlepoint/dataset_models.py", line 120, in add_entry
return self.add_entries(ent)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/singlepoint/dataset_models.py", line 88, in add_entries
ret = self._client.make_request(
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/client_base.py", line 358, in make_request
r = self._request(method, endpoint, body=serialized_body, url_params=parsed_url_params)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/client_base.py", line 297, in _request
r = self._req_session.send(prep_req, verify=self._verify, timeout=self._timeout)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/requests/sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/requests/adapters.py", line 532, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='ml.qcarchive.molssi.org', port=443): Read timed out. (read timeout=60)
I do see similar behavior (although not quite as bad on my end). I think there's some inefficiency in the `add_entry` code on the server that only shows up with larger datasets. I am investigating.

An alternative which will likely be much faster is to use the bulk `add_entries` function. It's a little clunky, but let me see if I can polish it up quick.
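For reference, a sketch of the bulk path applied to the submission script above (hedged: it assumes `SinglepointDatasetNewEntry` accepts `name` and `molecule`, matching the import at the top of the thread):

from qcportal.molecules import Molecule
from qcportal.singlepoint import SinglepointDatasetNewEntry

# Placeholder molecules; in the real script these come from the HDF5 file.
molecules = [Molecule(symbols=['H', 'H'], geometry=[0, 0, 0, 0, 0, 1.4 + 0.1 * i])
             for i in range(3)]

# Build every entry up front, then send them in a single request instead
# of one round trip per conformation.
entries = [SinglepointDatasetNewEntry(name=f'h2-{i}', molecule=mol)
           for i, mol in enumerate(molecules)]
dataset.add_entries(entries)  # `dataset` as created with client.add_dataset above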
Thanks! I didn't realize there was a bulk version. I'll try that.
The above behavior was running the script at home. I managed to work around it by running it on a cluster with a faster internet connection. It was still slow, but at least it didn't time out.
When I resubmitted the DES monomers dataset, it didn't recognize any of the records as duplicates of existing ones. I let it rerun the whole dataset, and the results agree well with the existing ones.
Much better! It successfully submitted the whole dataset in just over a minute. And it recognized all the records as duplicates of the ones computed yesterday.
The script is in #85. I think this means we can finally start running calculations!
When I tried to submit a larger dataset (the PubChem boron silicon set, 174,450 conformations total), it still failed even with the bulk creation. Running from home it fails with this error:
Traceback (most recent call last):
File "/Users/peastman/workspace/spice-dataset/submission/submit.py", line 34, in <module>
dataset.add_entries(entries)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/singlepoint/dataset_models.py", line 88, in add_entries
ret = self._client.make_request(
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/client_base.py", line 358, in make_request
r = self._request(method, endpoint, body=serialized_body, url_params=parsed_url_params)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/client_base.py", line 311, in _request
return self._request(method, endpoint, body=body, url_params=url_params, retry=False)
File "/Users/peastman/miniconda3/envs/qcportal/lib/python3.9/site-packages/qcportal/client_base.py", line 323, in _request
raise PortalRequestError(f"Request failed: {details['msg']}", r.status_code, details)
qcportal.client_base.PortalRequestError: Request failed: Token has expired (HTTP status 401)
Running on the cluster it gets a different error:
Traceback (most recent call last):
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connection.py", line 461, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/http/client.py", line 1378, in getresponse
response.begin()
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/http/client.py", line 318, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/http/client.py", line 279, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/socket.py", line 706, in readinto
return self._sock.recv_into(b)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/ssl.py", line 1311, in recv_into
return self.read(nbytes, buffer)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/ssl.py", line 1167, in read
return self._sslobj.read(len, buffer)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/util/retry.py", line 470, in increment
raise reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/util/util.py", line 39, in reraise
raise value
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connectionpool.py", line 538, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/urllib3/connectionpool.py", line 370, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='ml.qcarchive.molssi.org', port=443): Read timed out. (read timeout=60)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/groups/tem26/peastman/workspace/spice-dataset/submission/submit.py", line 34, in <module>
dataset.add_entries(entries)
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/qcportal/singlepoint/dataset_models.py", line 88, in add_entries
ret = self._client.make_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/qcportal/client_base.py", line 358, in make_request
r = self._request(method, endpoint, body=serialized_body, url_params=parsed_url_params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/qcportal/client_base.py", line 297, in _request
r = self._req_session.send(prep_req, verify=self._verify, timeout=self._timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/users/peastman/miniconda3/envs/qcfractalcompute/lib/python3.11/site-packages/requests/adapters.py", line 532, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='ml.qcarchive.molssi.org', port=443): Read timed out. (read timeout=60)
I tried twice on each computer and got the same result both times.
I think it's because qcportal hardcodes the timeout to 60 seconds. The server is taking longer than that to reply, causing the request to fail.
There is a time limit on the client, but that is changeable. I should make it not a private variable, but you can set `client._timeout = 120`. The server also has a timeout, though, which is around 1-2 minutes, so it might not help.

I did make some changes to the server to make it faster, but that needs a new release (tentatively later this week). I will also add automatic batching to `add_entries` with a progress bar, which would hopefully stop this particular error once and for all. `ds.submit` is another story, though.
@bennybp: What's the plan to deal with the server timeout? Will that be substantially increased? Can calculations be split into separate requests that append?
Even if the client timeout is increased and the server is sped up a bit, it seems we are still ultimately going to run into that timeout.
I tried setting `client._timeout = 120`, but I still got the same error.

I tried again, this time setting `client._timeout = None` to completely disable the client-side timeout. Still the same result.
The real problem may be something different. The error message is `Token has expired (HTTP status 401)`; 401 is an authentication failure. I see that `_request()` calls `_refresh_JWT_token()` to try to automatically renew the token. I gather that isn't working. I don't know how the token lifetime is set.
@bennybp: What's the plan to deal with the server timeout? Will that be substantially increased? Can calculations be split into separate requests that append?
Two pieces to the plan. Short term is to do batching client side. This can be done already. This is from memory:
from qcportal.utils import chunk_iterable

for entry_batch in chunk_iterable(new_entries, 1000):
    ds.add_entries(entry_batch)
I want to implement this automatically in the client, with progress bars.
Long term we could exploit the server-side job queue for this, but that will take some time.
Thanks, that worked!
The scripts are in #85.
We need to figure out how we're going to submit calculations for SPICE 2. For version 1 we used the https://github.com/openforcefield/qca-dataset-submission repository. It uses github automation, so that you submit calculations by creating pull requests. Status updates are automatically posted to the PR, and it handles error cycling. The automation scripts use QCSubmit, which is a layer on top of the QCPortal API to provide additional useful features.
Due to a major redesign of QCPortal, QCSubmit is currently not functional. It's being updated, but there isn't a firm date for when the new version will be ready. Also, the new version of QCPortal provides many of the features it was created to add. According to @j-wags, QCSubmit still provides useful features for optimization datasets. For single point datasets, which is what we're using, the benefits are much smaller.
So we have a few questions to decide.
It looks to me like creating submissions should be about equally simple either way. Having status updates automatically posted to github is convenient but probably not that important; it's easy to write a script that you run to check the status (a sketch below). Error cycling might be more complicated, although I'm hoping it's no longer as important: newer versions of psi4 seem to be much better about not producing the sort of intermittent errors that error cycling was needed for. And while generating v1, we wasted a lot of computation time on error cycling, repeating the same failing calculations over and over.
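Such a status-check script could be as small as the following sketch (hedged: it assumes `iterate_records` yields (entry name, spec name, record) tuples, by analogy with `iterate_entries` used earlier in this thread):

from collections import Counter
from qcportal import PortalClient

# Tally record statuses for one dataset.
client = PortalClient('https://ml.qcarchive.molssi.org')
ds = client.get_dataset('singlepoint', 'SPICE DES Monomers Single Points Dataset v1.1')
counts = Counter()
for entry_name, spec_name, record in ds.iterate_records():
    counts[record.status] += 1
print(dict(counts))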
Another benefit of the github approach is that it provides a record of the submission, but that too may be unnecessary for our purposes. All our submissions will be the same type of dataset, and they'll all use exactly the same level of theory and other settings. I believe the only inputs needed for a submission will be the dataset name and the HDF5 file containing the conformations (which is already in the repository).
cc @dotsdl