Closed: salomonssonj closed this issue 2 months ago.
Hi Johannes,
Thank you for reporting. Indeed, it seems like an issue in AlphaPulldown and not the pipeline. I assigned @DimaMolod to your issue because he implemented convert_to_modelcif.py in AlphaPulldown and can best address this.
Regarding snakemake, the /dev/null after the input is an implementation detail to enable job clustering, but it has no effect here. You should also not see a difference in prediction scores between using the snakemake pipeline and AlphaPulldown, because the prediction procedure is the same. Have you compared the output of the pipeline to the most recent version of AlphaPulldown from the main branch or a previous version? If prediction scores from the main branch AP version deviate strongly from previous AP versions, it would be best to open an issue here: https://github.com/KosinskiLab/AlphaPulldown/issues.
Best, Valentin
Hi Valentin,
Thank you for your input.
For some input combinations, I also get errors related to template_confidence_scores (see attachment). However, I have managed to obtain predictions using the same features. For example, I get the error when my input is "O75112-6+Q15124:1-195" (as in the attached log file), but for "O75112-6:1-84+Q14315:2036-2406" it works.
Kindly, Johannes
Hi @salomonssonj
This KeyError: 'template_confidence_scores' indicates that you might have created one of the feature pickles with mmseqs2 under an older version of AlphaPulldown. Previously, this error occurred when mmseqs2 had problems finding the structural templates; it has been fixed in later AlphaPulldown versions. If so, could you remove the corresponding pickles and rerun the feature creation steps using the newer version?
Yours Dingquan
Hi Dingquan,
Thank you for your response.
I generated these features at the beginning of this week using the snakemake pipeline. I re-generated them yesterday, also using the snakemake pipeline, in case something went wrong the first time. I have attached the log file for one of the features I generated at the beginning of this week.
I cleared the AlphaPulldownSnakemake/.snakemake/singularity directory last Friday to make sure I had the latest singularity images, in case that is helpful.
Kindly, Johannes
Hi Johannes,
Thanks for the updates. It's really strange that your pickle still doesn't have template_confidence_scores, as the current version of AlphaPulldown makes sure that every pickle contains this value, as shown here: https://github.com/KosinskiLab/AlphaPulldown/blob/fe456e610f337838fff820d889293a2ead99ef14/alphapulldown/objects.py#L155-L163 I really cannot think of a solution other than manually reading in the pickle that caused the problem, adding template_confidence_scores set to [1]*num_of_residues within its feat_dict attribute, and saving the pickle again.
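In practice that would look roughly like this (just a sketch; it assumes the pickle holds a monomeric object whose features are stored in the feat_dict attribute mentioned above, and Q15124.pkl is only an example file name):

```python
# Sketch of the manual workaround. Run it in an environment where AlphaPulldown
# is installed, so that pickle can load the stored object class.
import pickle

with open("Q15124.pkl", "rb") as fh:  # example file name
    monomer = pickle.load(fh)

# "seq_length" is a standard per-residue AlphaFold feature filled with the length.
num_of_residues = int(monomer.feat_dict["seq_length"][0])

# Follow the suggestion above literally; downstream code may expect a numpy
# array instead of a plain list, so adjust the type if needed.
monomer.feat_dict["template_confidence_scores"] = [1] * num_of_residues

with open("Q15124.pkl", "wb") as fh:
    pickle.dump(monomer, fh)
```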
Yours Dingquan
Hi Johannes, could you send Dingquan your pkl files for an example job that crashed? @dingquanyu, just to double-check that the template_confidence_scores are not there.
Hi Dingquan and Jan,
I have attached one of my .pkl files. Q15124.pkl.zip
Thank you for your help.
Kindly, Johannes
Could you also send the second from the pair?
Yes, sorry. Here is the second pkl file.
I think both pkl files do have 'template_confidence_scores' and 'template_release_date', so the problem must be somewhere else. I will try to reproduce the error using the provided pkl files.
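For reference, a quick way to verify this (a sketch, assuming the same feat_dict attribute as above and using the attached file name as an example):

```python
# Check whether the template fields are present in a feature pickle.
import pickle

with open("Q15124.pkl", "rb") as fh:  # example file name
    obj = pickle.load(fh)

for key in ("template_confidence_scores", "template_release_date"):
    print(key, "present" if key in obj.feat_dict else "MISSING")
```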
I managed to reproduce the error. The problem occurs only for the ChoppedObjects because the new keys 'template_confidence_scores' and 'template_release_date' are lost after this function is called: https://github.com/KosinskiLab/AlphaPulldown/blob/fe456e610f337838fff820d889293a2ead99ef14/alphapulldown/objects.py#L323-L354
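Roughly speaking, the failure mode looks like this (a purely schematic sketch, not AlphaPulldown's actual code; the whitelist of keys is hypothetical): rebuilding the chopped feature dict from a fixed set of expected keys silently drops fields that were introduced later.

```python
# Schematic illustration only, NOT AlphaPulldown's real implementation.
EXPECTED_KEYS = ("aatype", "residue_index", "template_aatype")  # hypothetical whitelist

def rebuild_chopped_features_buggy(full_feat_dict):
    # Only whitelisted keys survive the chop; keys added in newer versions,
    # such as template_confidence_scores, are silently lost here.
    return {k: v for k, v in full_feat_dict.items() if k in EXPECTED_KEYS}

def rebuild_chopped_features_fixed(full_feat_dict):
    # Keep every key from the original dict so newly added fields survive.
    return dict(full_feat_dict)
```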
This issue should be fixed in the new version. @salomonssonj please update the images and let us know if it works for you now. Thanks again for reporting this; that was a big and well-hidden bug!
Thank you for looking into it! I'll let you know when I have tried with the updated images.
Hi,
I tried to run some predictions this morning with the updated images but it once again failed. 6618594.txt
Hi,
This error was caused by using a jax version greater than or equal to 0.4.24. jax 0.4.23 still has this module, but later versions deprecated it. However, AlphaPulldown's dockerfile specifies jax 0.4.23, so it should be fine. Could you check the jax version inside your container? @DimaMolod @maurerv, if you have time, could you check the jax version inside the container as well?
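If it helps, the version can be printed from inside the image, e.g. with `singularity exec <image>.simg python` running this small snippet (just a sketch):

```python
# Print the jax version available inside the container.
import jax
print(jax.__version__)  # the pulldown dockerfile is supposed to pin 0.4.23
```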
Yours Dingquan
Hi,
Thank you. Inside the singularity image b1a0408b77e6fc0b904c69cd981fb35c.simg I have jax 0.4.30, and in 3f20617ccba864758b2a437ef2fde35c.simg jax 0.4.16.
I see. Could you try again using the image with the 0.4.16 version?
I think these two images correspond to pulldown.docker and analysis.docker. At the same time, pulldown.docker has version 0.4.23 specified: https://github.com/KosinskiLab/AlphaPulldown/blob/main/docker/pulldown.dockerfile#L77-L78 and for the analysis image, is it 0.4.16? https://github.com/KosinskiLab/AlphaPulldown/blob/main/alphapulldown/analysis_pipeline/Dockerfile#L35 Something is wrong with the image definitions.
Yes, the image with jax 0.4.16 is from kosinskilab/fold_analysis:latest
Thank you, @salomonssonj, we are working on that issue and will let you know once it's fixed!
Great, thank you very much!
Hi @salomonssonj
I checked the current version of the docker image from the hub and the jax version is now correct. I think @DimaMolod has already tried the image to model the structures from the given pickles, and the KeyError was solved. Could you try again, please?
Yours Dingquan
For me, features and predictions are created, but the reports crash. I think the 'compute_stats' rule always fails due to this error:
rule compute_stats:
input: /scratch/dima/fold_temp/predictions/Q8I2G6_Q8I5K4/completed_fold.txt
output: /scratch/dima/fold_temp/predictions/Q8I2G6_Q8I5K4/statistics.csv
jobid: 0
reason: Forced execution
wildcards: fold=Q8I2G6_Q8I5K4
resources: mem_mb=8000, mem_mib=7630, disk_mb=1000, disk_mib=954, tmpdir=/scratch/jobs/6960313, walltime=1440, attempt=1
Activating singularity image /g/kosinski/dima/SnakeMake/AlphaPulldownSnakemake/.snakemake/singularity/3f20617ccba864758b2a437ef2fde35c.simg
WARNING: Could not find any nv files on this host!
I0713 09:26:31.047352 140737350492992 get_good_inter_pae.py:120] now processing Q8I2G6_Q8I5K4
E0713 09:26:31.510183 140737350492992 get_good_inter_pae.py:156] Error processing PAE and iPTM for job Q8I2G6_Q8I5K4: No module named 'alphafold'
I0713 09:26:31.510915 140737350492992 get_good_inter_pae.py:166] done for Q8I2G6_Q8I5K4 1 out of 1 finished.
I0713 09:26:31.510968 140737350492992 get_good_inter_pae.py:169] Unfortunately, none of your protein models had at least one PAE on the interface below your cutoff value : 100.0.
Please consider using a larger cutoff.
[Sat Jul 13 09:26:34 2024]
Finished job 0.
1 of 1 steps (100%) done
Traceback (most recent call last):
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/weakref.py", line 667, in _exitfunc
f()
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/weakref.py", line 591, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/tempfile.py", line 868, in _cleanup
cls._rmtree(name, ignore_errors=ignore_errors)
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/tempfile.py", line 864, in _rmtree
_shutil.rmtree(name, onerror=onerror)
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 725, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 658, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 658, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 658, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
[Previous line repeated 3 more times]
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 664, in _rmtree_safe_fd
onerror(os.rmdir, fullname, sys.exc_info())
File "/home/dmolodenskiy/.conda/envs/sm310/lib/python3.10/shutil.py", line 662, in _rmtree_safe_fd
os.rmdir(entry.name, dir_fd=topfd)
OSError: [Errno 39] Directory not empty: 'envs'
...which seems to be related to the create_notebook.py script from AlphaPulldown and might also be related to this issue: https://github.com/KosinskiLab/AlphaPulldown/issues/379 @dingquanyu, could you check if create_notebook.py actually creates anything?
Hi Dingquan and Dima,
I ran the pipeline using features I created previously and managed to get the predictions. However, I also get the same errors as above, where the report crashes.
KosinskiLab/AlphaPulldown#379 was caused by the user running the script to analyse local ColabFold results.
This line reports the real crash here: OSError: [Errno 39] Directory not empty: 'envs'. I don't know where it comes from; I think it's from snakemake itself, as there is no step that removes directories in the script. @DimaMolod
Hi @salomonssonj, and sorry for the long delay. I think we fixed the problem, and Snakemake now works for me for the test data sets, including the generation of reports. Please update the images and re-run your modeling. Many thanks!
Hi, I also apologize for my delayed reply.
I tried it last week with the new singularity images and it seems that I still get some errors when converting to modelcif format. I have attached log files for generate_report and structure_inference. compute_stats-7749922.txt structure_inference-7737278.txt
I also wanted to try it out this morning but got this error message when executing the pipeline: snakemake: error: ambiguous option: --cluster=/home/salomonssonj/.config/snakemake/slurm_noSidecar/slurm-submit.py could match --cluster-generic-submit-cmd, --cluster-generic-status-cmd, --cluster-generic-cancel-cmd, --cluster-generic-cancel-nargs, --cluster-generic-sidecar-cmd, --cluster-sync-submit-cmd
I tried adding --cluster-generic-submit-cmd home/salomonssonj/.config/snakemake/slurm_noSidecar/slurm-submit.py but I still got the same error.
Hi @salomonssonj, and thanks again for your feedback and patience :-)
Please try again with the fresh containers; the modelcif error should go away (in any case, this issue is not critical and shouldn't prevent the execution of the main pipeline).
The error you encounter with the --cluster flag probably indicates that you are using an outdated version of Snakemake. Could you try to update your Snakemake, e.g. to version 7.32.4?
I also updated the instructions: for me it works after this single command:
https://github.com/KosinskiLab/AlphaPulldownSnakemake/blob/main/README.md?plain=1#L14
Please try it out and let me know if the error disappeared.
Hi @salomonssonj! I am closing this issue as I believe these bugs were solved, but meanwhile feel free to open a new issue here or in the AP repo if you encounter any other problems with AlphaPulldown or Snakemake.
Hi @DimaMolod,
I'm sorry, this slipped my mind after coming back from vacation. I ran some tests and it works very well! Thank you for helping me out!
Kindly, Johannes
Hi,
I am encountering some errors when running this snakemake pipeline. I have attached a log file from one of my runs.
6416819.txt
I am able to generate the predictions and get the "ranked_x.pdb" files, but then I get an error when trying to create the "completed_fold.txt" file. From the log file it seems to be related to the "convert_to_modelcif.py" script.
However, the predictions have rather bad scores, even for ones we previously obtained high-scoring predictions for.
It also says “/dev/null” after the input line, which I have not seen before. Could this be related?
Thank you in advance!
Kindly, Johannes