rdkit / mmpdb

A package to identify matched molecular pairs and use them to predict property changes.

Error when using "--out mmpa" #46

Closed (baoilleach closed this issue 2 years ago)

baoilleach commented 2 years ago

I'm not sure what "--out mmpa" does, but it gives the following error with the GitHub version:

$ mmpdb index myfile.fragments -o myfile.mmpa --out mmpa
...
  File "mmpdb/mmpdblib/index_writers.py", line 106, in add_environment_fingerprint_parent
    self._W("FINGERPRINT\t%d\t%s\n" % (fp_idx, environment_fingerprint, parent_idx))
TypeError: not all arguments converted during string formatting

adalke commented 2 years ago

It's a legacy text format that should probably be removed.

The bug looks like it dates from when the environment fingerprint table grew to include links from r+1 to r. That it went unnoticed tells me no one uses that format.
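
For reference, the TypeError in the traceback is what Python's % operator raises when the tuple holds more values than the format string has placeholders. A minimal sketch, independent of mmpdb (the sample values are made up, and the four-field format string is assumed to match the v3-dev writer):

```python
# Format string from the traceback: two placeholders ...
old_format = "FINGERPRINT\t%d\t%s\n"
# ... but three values, as in the failing call (sample values are made up).
values = (1, "fp", 2)

try:
    old_format % values
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# A format string with one placeholder per value formats cleanly.
new_format = "FINGERPRINT\t%d\t%s\t%s\t%s\n"
fixed_line = new_format % (1, "smarts", "pseudosmiles", "")
print(repr(fixed_line))
```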

Could you try using https://github.com/adalke/mmpdb/tree/v3-dev ? That's the latest dev version, mostly waiting for feedback before final release.

The current version of that section is:

    def add_environment_fingerprint(self, fp_idx, smarts, pseudosmiles, parent_smarts):
        if parent_smarts is None:
            parent_smarts = ""
        self._W("FINGERPRINT\t%d\t%s\t%s\t%s\n" % (fp_idx, smarts, pseudosmiles, parent_smarts))

baoilleach commented 2 years ago

That works, but with the new version "--out csv" (which I do need) fails:

  File "/home/export/noel.oboyle/Tools/conda/miniconda3/envs/mmpdb-test/lib/python3.9/site-packages/mmpdblib/index_writers.py", line 1174, in open
    return self.opener(destination, compression,
  File "/home/export/noel.oboyle/Tools/conda/miniconda3/envs/mmpdb-test/lib/python3.9/site-packages/mmpdblib/index_writers.py", line 849, in _open_csv
    outfile = _open_output(destination, compressionf)
NameError: name 'compressionf' is not defined

adalke commented 2 years ago

Okay, fixed the typo. Seems to work.
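
A typo like this can go unnoticed because Python resolves names only when a line actually runs, so a misspelled variable in a rarely used output path raises NameError only when that path is taken. A small sketch using a hypothetical open_output function (an illustration, not mmpdb's actual code):

```python
def open_output(destination, compression, output_format):
    # Hypothetical dispatcher; only the "csv" branch contains the typo.
    if output_format == "csv":
        return (destination, compressionf)  # misspelled name, as in the traceback
    return (destination, compression)

# Other formats never execute the broken line, so they keep working.
ok_result = open_output("out.mmpdb", None, "mmpdb")

try:
    open_output("out.csv", None, "csv")
    csv_error = None
except NameError as err:
    csv_error = str(err)

print(csv_error)
```

Running every output format once in a smoke test, or running a linter that reports undefined names, would likely catch this kind of error before release.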

You might also be interested in the "csvd" output format, which saves all of the tables to separate CSV files in a named directory, along with drivers to import the data into SQLite and Postgres. ("csvd" = "csv directory")
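
To give a rough idea of the "one CSV file per table, then import into SQLite" pattern, here is a sketch using only the Python standard library. The table names, columns, and file layout are hypothetical; this thread does not show mmpdb's actual csvd layout or its import drivers, and they may differ.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical tables; the real mmpdb schema is not shown in this thread.
TABLES = {
    "compound": (["id", "public_id"], [(1, "mol-a"), (2, "mol-b")]),
    "pair": (["compound1_id", "compound2_id"], [(1, 2)]),
}

def write_csv_directory(dirname, tables):
    """Write one <table>.csv file per table into dirname."""
    dirpath = Path(dirname)
    dirpath.mkdir(parents=True, exist_ok=True)
    for name, (columns, rows) in tables.items():
        with open(dirpath / f"{name}.csv", "w", newline="") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(columns)  # header row
            writer.writerows(rows)

def import_csv_directory(dirname, connection):
    """Load every *.csv file in dirname into a same-named SQLite table."""
    for path in sorted(Path(dirname).glob("*.csv")):
        with open(path, newline="") as infile:
            reader = csv.reader(infile)
            columns = next(reader)  # header row gives the column names
            column_sql = ", ".join(f'"{c}"' for c in columns)
            placeholders = ", ".join("?" for _ in columns)
            connection.execute(f'CREATE TABLE "{path.stem}" ({column_sql})')
            connection.executemany(
                f'INSERT INTO "{path.stem}" VALUES ({placeholders})', reader)
    connection.commit()

with tempfile.TemporaryDirectory() as tmpdir:
    write_csv_directory(tmpdir, TABLES)
    db = sqlite3.connect(":memory:")
    import_csv_directory(tmpdir, db)
    compound_count = db.execute('SELECT COUNT(*) FROM "compound"').fetchone()[0]
    pair_rows = db.execute('SELECT * FROM "pair"').fetchall()
```

Because the columns are created without types, the values come back as text. A real import driver would presumably declare a proper schema.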

baoilleach commented 2 years ago

Works for me now. I note that csvd is not (yet?) documented under "mmpdb index --help". Sounds interesting, but not for me right now.