PDBeurope / protein-cluster-conformers

Clusters protein chains based on CA distance difference
https://pdbeurope.github.io/protein-cluster-conformers/

Error when running benchmarking examples #8

Open willvfried opened 3 weeks ago

willvfried commented 3 weeks ago

Hi, I've downloaded the repo and tried running the benchmarking examples, but when attempting any of the commands in the README, or just ./examples/run_O34926.sh, I get the following error:

    08-19 16:21:01.874 cluster_conformers.cluster_monomers INFO Loading mmCIF files for O34926
    Traceback (most recent call last):
      File "/Users/will/Downloads/protein-cluster-conformers-main/find_conformers.py", line 289, in <module>
        main()
      File "/Users/will/Downloads/protein-cluster-conformers-main/find_conformers.py", line 238, in main
        unp_cluster.ca_distance(args.path_ca)
      File "/Users/will/Downloads/protein-cluster-conformers-main/cluster_conformers/cluster_monomers.py", line 281, in ca_distance
        self.path_save_unps.mkdir(exist_ok=True)
      File "/Users/will/miniconda3/envs/alphaflow/lib/python3.9/pathlib.py", line 1323, in mkdir
        self._accessor.mkdir(self, mode)
    FileNotFoundError: [Errno 2] No such file or directory: 'benchmark_data/examples/O34926/O34926_ca_distances/unp_residue_ids'

willvfried commented 3 weeks ago

In the script 'cluster_monomers.py' it looks like 'unp_residue_ids' and 'ca_matxs' aren't being defined properly (everything within the brackets is commented out):

    self.ca_matxs = {  # CA distance matrices. Ordered
        # "1atp_A" : path (as pathlib.PosixPath) to serilised np.ndarray(...) file,
        # "2adp_B" : path (as pathlib.PosixPath) to serilised np.ndarray(...) file,
        # ...
    }

    self.unp_res_ids = {
        # "1atp_A" : path (as pathlib.PosixPath) to serilised np.array(),
        # "2adp_B" : path (as pathlib.PosixPath) to serilised np.array(),
        # ...
    }

Joseph-Ellaway commented 6 days ago

Hi @willvfried, thanks for opening the issue. Which directory on your local machine are you running the scripts from?
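
The working directory matters here because the path in the traceback (benchmark_data/examples/O34926/O34926_ca_distances/unp_residue_ids) is relative, so it resolves against wherever the script is launched from. Separately, pathlib's Path.mkdir(exist_ok=True) raises FileNotFoundError whenever a parent directory is missing, because parents defaults to False. The snippet below is a minimal sketch of that behaviour, assuming a missing parent directory is the cause (this has not been confirmed in the thread). make_save_dir is a hypothetical helper for illustration, not part of cluster_conformers.

```python
import tempfile
from pathlib import Path


def make_save_dir(base: Path) -> Path:
    """Hypothetical helper: create <base>/unp_residue_ids, including any missing parents."""
    target = base / "unp_residue_ids"
    # parents=True creates intermediate directories; exist_ok=True tolerates reruns
    target.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the O34926_ca_distances directory, which does not exist yet
    missing_parent = Path(tmp) / "O34926_ca_distances"

    # Same call shape as in the traceback: fails because the parent is missing
    try:
        (missing_parent / "unp_residue_ids").mkdir(exist_ok=True)
        raised = False
    except FileNotFoundError:
        raised = True

    # With parents=True the whole chain of directories is created
    created = make_save_dir(missing_parent)
    created_ok = created.is_dir()
```

If the cause is instead the launch directory, running the example script from the repository root (so that the relative benchmark_data path resolves as expected) may be enough.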

Joseph-Ellaway commented 6 days ago

Hi @willvfried,

I've opened a new branch -- PDBE-7222 -- to address the problems above. cluster_benchmark.py still needs fixing but I'm just waiting on a response from our team regarding some data in one of the updated_mmCIF files. Many thanks!

Joseph-Ellaway commented 6 days ago

In the script 'cluster_monomers.py' it looks like 'unp_residue_ids' and 'ca_matxs' aren't being defined properly (everything within the brackets is commented out):

    self.ca_matxs = {  # CA distance matrices. Ordered
        # "1atp_A" : path (as pathlib.PosixPath) to serilised np.ndarray(...) file,
        # "2adp_B" : path (as pathlib.PosixPath) to serilised np.ndarray(...) file,
        # ...
    }

    self.unp_res_ids = {
        # "1atp_A" : path (as pathlib.PosixPath) to serilised np.array(),
        # "2adp_B" : path (as pathlib.PosixPath) to serilised np.array(),
        # ...
    }

These two dictionaries are populated later in the method. The comments describe the key-value pairs that are added at lines 298-299.
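
As a rough illustration of that pattern, the sketch below initialises both dictionaries empty, with comments documenting the intended entries, and fills them in a later step. The class name, method name, and file names are hypothetical and are not taken from cluster_monomers.py.

```python
from pathlib import Path


class ClusterSketch:
    """Hypothetical stand-in showing dictionaries that start empty and are filled later."""

    def __init__(self, save_dir: str) -> None:
        self.save_dir = Path(save_dir)
        self.ca_matxs = {
            # "1atp_A" : path to a serialised distance matrix file,
        }
        self.unp_res_ids = {
            # "1atp_A" : path to a serialised residue ID file,
        }

    def register_chain(self, chain_id: str) -> None:
        # The entries described in the comments above are only added at this point
        # (file names here are assumptions made for the example)
        self.ca_matxs[chain_id] = self.save_dir / f"{chain_id}_ca_matrix.npy"
        self.unp_res_ids[chain_id] = self.save_dir / f"{chain_id}_res_ids.npy"


sketch = ClusterSketch("O34926_ca_distances")
empty_at_start = (len(sketch.ca_matxs), len(sketch.unp_res_ids))

sketch.register_chain("1atp_A")
keys_after = (list(sketch.ca_matxs), list(sketch.unp_res_ids))
```

An empty dictionary literal containing only comments is valid Python, so the commented-out entries are documentation and do not indicate a bug on their own.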