samsledje / D-SCRIPT

A structure-aware interpretable deep learning model for sequence-based prediction of protein-protein interactions
http://dscript.csail.mit.edu
MIT License

extract-3di command fail #54

Closed twaksman001 closed 1 year ago

twaksman001 commented 1 year ago

extract-3di_textprinted.txt

samsledje commented 1 year ago

Can you please run the following commands and let me know at what point it fails and what output you get?

mkdir /tmp/foldseek_tmpdir
foldseek createdb '../data/Mp PDB/Mp41_AF2.pdb' '../data/Mp PDB/Mp23_AF2.pdb' /tmp/foldseek_tmpdir/test_run
ls /tmp/foldseek_tmpdir
head /tmp/foldseek_tmpdir/test_run_ss

twaksman001 commented 1 year ago

(dscript) [twaksman001@login1 D-SCRIPT]$ foldseek createdb '../data/Mp PDB/Mp41_AF2.pdb' '../data/Mp PDB/Mp23_AF2.pdb' '/tmp/foldseek_tmpdir/test_run'
createdb ../data/Mp PDB/Mp41_AF2.pdb ../data/Mp PDB/Mp23_AF2.pdb /tmp/foldseek_tmpdir/test_run

MMseqs Version: 7.04e0ec8
Chain name mode 0
Write mapping file 0
Mask b-factor threshold 0
Coord store mode 2
Write lookup file 1
Tar Inclusion Regex .*
Tar Exclusion Regex ^$
Threads 8
Verbosity 3

Output file: /tmp/foldseek_tmpdir/test_run
[=================================================================] 100.00% 2 0s 0ms
Time for merging to test_run_ss: 0h 0m 0s 0ms
Time for merging to test_run_h: 0h 0m 0s 0ms
Time for merging to test_run_ca: 0h 0m 0s 0ms
Time for merging to test_run: 0h 0m 0s 0ms
Ignore 0 out of 2.
Too short: 0, incorrect: 0, not proteins: 0.
Time for processing: 0h 0m 0s 37ms
(dscript) [twaksman001@login1 D-SCRIPT]$ ls /tmp/foldseek_tmpdir
test_run test_run.dbtype test_run.index test_run_ss.dbtype
test_run_ca test_run_h test_run.lookup test_run_ss.index
test_run_ca.dbtype test_run_h.dbtype test_run.source
test_run_ca.index test_run_h.index test_run_ss
(dscript) [twaksman001@login1 D-SCRIPT]$ head /tmp/foldseek_tmpdir/test_run_ss
DDDPPPPDPDPPDDDDWAKPDKDWDDDPQAKIKIWTATNQGKIKIKIKGWDPPPDPQIKIKMWMKIWHQDPVRDIWIKTWIAIPP
PGIDIDTPPDDDDDDDDVVVVVVVVVVVVVVVVPPADADPVRDRPPPPDDD
DDCPPDDQWDADPDQDPVLVVQLVVLVVVLVVVVVVVVVVVVVVPCPPPPPPPPPDPPDDDPPPVVVVVVLLSQLLSLLSSLVVQ
VQADPQSQGDLVSQLCNRCPPPDTPLLSVQLSVQQVVLSVVVQVVLVPDPVSGDDSSVSSNSSSSSVVSSVVSSCVRNPDPPPPD
DPVDDDD

twaksman001 commented 1 year ago

Any idea what the problem is?

Could it be due to "_ss" not being part of the temporary directory name?

samsledje commented 1 year ago

Sorry for the late reply! The _ss file contains the 3Di sequences and should be generated by Foldseek in the temporary directory. Foldseek appears to be running properly given the test commands you sent above, which means I'm unsure why your original run failed. Could you re-run the original command that gave you the error and share the contents of the temporary directory? Its path is the last part of the foldseek command that is printed ("/tmp/1300970.1.all.q/tmp3aozx6xb/1446002179529316489" in your original error log).

twaksman001 commented 1 year ago

The same problem is occurring now, and it appears that the /tmp directory does not contain a directory called "[job ID].1.all.q". I seem to remember that this was not the case when I tried a couple of weeks ago; I think I found the temporary directory in /tmp then. Also, the foldseek command you suggested in your previous comment returned an error this time: Could not open /tmp/foldseek_tmpdir/test_run_ss.0 for writing!

samsledje commented 1 year ago

Can you provide more details about your operating system/environment? We use the Python tempfile API to create and write to temporary directories -- if you're unable to create these temporary directories, there might be a permissions error.
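
One way to check the permissions hypothesis independently of D-SCRIPT is to ask Python's tempfile module directly. The sketch below is not taken from the D-SCRIPT code base; `probe_tempdir` is a hypothetical helper that only uses standard-library calls. It reports where Python would place temporary files and whether a directory can actually be created and written there.

```python
import os
import tempfile


def probe_tempdir(base=None):
    """Try to create a temporary directory under `base` and write a file in it.

    `base=None` means "use Python's default temporary location".
    Returns (True, path) on success or (False, error message) on failure.
    Hypothetical helper for diagnosis, not part of D-SCRIPT.
    """
    try:
        with tempfile.TemporaryDirectory(dir=base) as tmpdir:
            probe = os.path.join(tmpdir, "probe.txt")
            with open(probe, "w") as fh:
                fh.write("ok")
            return True, tmpdir
    except OSError as err:
        # Covers a missing parent directory as well as permission errors.
        return False, str(err)


print("default temp location:", tempfile.gettempdir())
print("probe result:", probe_tempdir())
```

Running this inside the same batch job that fails should show whether the scheduler-provided temporary location exists and is writable on the compute node.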

twaksman001 commented 1 year ago

information from the HPC managers:

Could I change the D-SCRIPT .py files so that the Python tempfile package writes to my own directory?
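
On the question of redirecting tempfile: Python's tempfile module picks its default directory from the TMPDIR, TEMP, and TMP environment variables, in that order, before falling back to locations such as /tmp. The following is a minimal sketch, assuming D-SCRIPT relies on that default and does not pass an explicit `dir=` argument (this thread does not confirm it). `use_custom_tempdir` is a hypothetical helper, not a D-SCRIPT function.

```python
import os
import tempfile


def use_custom_tempdir(path):
    """Point Python's tempfile module at `path` for the current process.

    Hypothetical helper; assumes the calling code uses tempfile's default
    directory rather than an explicit `dir=` argument.
    """
    os.makedirs(path, exist_ok=True)
    os.environ["TMPDIR"] = path  # also inherited by child processes
    tempfile.tempdir = None      # drop the cached default so TMPDIR is re-read
    return tempfile.gettempdir()
```

Under the same assumption, exporting TMPDIR in the shell or job script before launching the command would have the same effect without editing any .py files.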

twaksman001 commented 1 year ago

Everything worked using a different HPC.

samsledje commented 1 year ago

Great, let me know if you have any other questions!