kjappelbaum closed 10 months ago
remaining ran through
Still waiting for rdkit in the SMILES split, but otherwise this seems to run.
One we need to check a bit more carefully is odd_one_out.
Since I need to redownload, I'll call it a day now while it continues running
odd_one_out also ran successfully. IUPAC names is another large and slow dataset.
Updates:
try/except to not chunk those datasets (this would probably also have caused issues with the chunked CSV via pandas setup)
data_clean.csv when running transform.py
meta.yaml
promised partitions
try/except it pythonically
Also, datasets that are more difficult to parse now do something meaningful
zinc, which has been causing issues for MicPie, also works
The LocalCluster from one of the commits here is not strictly needed, but it helps with debugging.
Couldn't test on HPC as I still get quota issues after deleting many files.
need to verify that we have all dependencies: openpyxl, pymatgen, givemeconformer, dask
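
One quick way to do the dependency check is a small script that reports which of these packages cannot be imported in the current environment. This is only a sketch: the helper name `missing_modules` is made up for illustration, and it assumes each package is importable under the same name it is listed with above (in particular for givemeconformer, which I have not verified).

```python
import importlib.util

# Names taken from the dependency list above. Assumption: the import name
# of each package matches the name used in the list.
REQUIRED = ["openpyxl", "pymatgen", "givemeconformer", "dask"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None for a top-level module that is not installed;
    # it does not import the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the modules can be located, not that their versions are compatible with the transform scripts.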