ELELAB / mutatex

scripts and facilities for in-silico mutagenesis with FoldX
GNU General Public License v3.0

PDB Parsing Error while using ddg2excel #182

Closed PaulSchrank closed 7 months ago

PaulSchrank commented 7 months ago

Dear wonderful people of ELELAB,

while trying to run ddg2excel after a finished MutateX run, I get the following error:

Matplotlib created a temporary config/cache directory at /tmp/matplotlib-gmn3dyem because the default path (/home/pschrank/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
ERROR: Position B340 was not identified in the input PDB files. Exiting...
Traceback (most recent call last):
  File "/home/pschrank/Programms/mutatex/mutatex-env/bin/ddg2excel", line 4, in <module>
    __import__('pkg_resources').run_script('MutateX==0.8', 'ddg2excel')
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/MutateX-0.8-py3.6.egg/EGG-INFO/scripts/ddg2excel", line 80, in <module>
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/MutateX-0.8-py3.6.egg/mutatex/utils.py", line 252, in filter_reslist
TypeError

For running ddg2excel I use the following command:

ddg2excel -p RgTAL_Holo_model0_checked.pdb -d results/mutation_ddgs/final_averages/ -l mut_list_ssm.txt -q pos_list.txt -o RgTAL_active_SSM -F csv -M

(I provide all files in the Gigamove link: https://gigamove.rwth-aachen.de/de/download/a024c4cbf052ae3c9e6c110d48efb7c0)

It seems like there is a compatibility issue between my input PDB structure and the position list file, but I don't know where it stems from; it worked totally fine when carrying out the SSM run. In this case the protein is a homo-tetramer. Maybe this causes the issue, but shouldn't that be handled correctly by providing the -M option?

Do you have any idea how to troubleshoot this?

Best regards and a pleasant weekend,

Paul

mtiberti commented 7 months ago

hi @PaulSchrank,

I would indeed need to look at your calculation to check. I am trying to download the zip file from the website you linked, but nothing happens when I click the Download button, and the connection to the website just stalls. I have tried with Safari, Chrome and Firefox, with the same result. Any suggestions?

PaulSchrank commented 7 months ago

Hey @mtiberti,

I apologize, I don't really know why you are not able to download it. Maybe because it is a file-sharing service from a German university? But until now it hasn't caused any issues :(

Anyhow, here is a new link from my Google Drive; hopefully you can access it from there:

PaulSchrank commented 7 months ago

Oops, sent it prematurely :D

https://drive.google.com/file/d/15TbUGl07HbWP7fw8DMj62nPynQJl0PU9/view?usp=drive_link

Thanks for your help! Best regards Paul

mtiberti commented 7 months ago

Thanks @PaulSchrank that worked! I'll ping you once I know something - probably not before Thursday

mtiberti commented 7 months ago

hi @PaulSchrank I just had a cursory look. A few notes:

Using the latest version of MutateX in your RgTAL folder I get:

ddg2excel -p RgTAL_Holo_model0_checked.pdb -d results/mutation_ddgs/final_averages/ -l mut_list_ssm.txt -q position_list.txt -o RgTAL_active_SSM -F csv -M
WARNING: Residue <Residue NME het=H_NME resseq=42 icode= > couldn't be recognized; it will be skipped
WARNING: Residue <Residue ACE het=H_ACE resseq=42 icode= > couldn't be recognized; it will be skipped
WARNING: Residue <Residue NME het=H_NME resseq=87 icode= > couldn't be recognized; it will be skipped
...
ERROR: HB114 residue is not written in the right format or it is not contained in pdbfile

so no crash for me - but still unexpected output, since HB114 is in the PDB and the format looks correct. I'll investigate further when I have more time.

A similar command actually works and generates the desired output file for the RgTAL_active folder.

PaulSchrank commented 7 months ago

Hey @mtiberti,

sorry for the late response! I apologize for the confusion: the error was meant for the RgTAL_active folder, and the RgTAL folder seems to have slipped in when I zipped the files. In the RgTAL folder I had some issues with the PDB, as you mentioned. I don't really know where they stem from; I prepared the structures with MOE and YASARA, and I guess the files got messed up when the structures were parsed during loading. As a result I only got messed-up results for chain A when running MutateX. So I re-prepared the structure and only carried out the run in RgTAL_active, using the position list for my presumed active-site residues. I tried your suggested command there again, but still got an error:

~/Programms/mutatex/RgTAL_active> ddg2excel -p RgTAL_Holo_model0_checked.pdb -d results/mutation_ddgs/final_averages/ -l mut_list_ssm.txt -q pos_list.txt -o RgTAL_active_SSM -F csv -M
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-2hbk3y because the default path (/home/pschrank/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
ERROR: Position A247 was not identified in the input PDB files. Exiting...
Traceback (most recent call last):
  File "/home/pschrank/Programms/mutatex/mutatex-env/bin/ddg2excel", line 4, in <module>
    __import__('pkg_resources').run_script('MutateX==0.8', 'ddg2excel')
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/pkg_resources/__init__.py", line 1469, in run_script
    exec(script_code, namespace, namespace)
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/MutateX-0.8-py3.6.egg/EGG-INFO/scripts/ddg2excel", line 80, in <module>
  File "/home/pschrank/Programms/mutatex/mutatex-env/lib64/python3.6/site-packages/MutateX-0.8-py3.6.egg/mutatex/utils.py", line 252, in filter_reslist
TypeError

It still seems not to recognize the position list correctly and tries to find an "A247", as in alanine, instead of "NA247", as in the asparagine of chain A :(

Could it be that a problem with my Biopython version produces these issues? I run MutateX on Python 3.6.15, and these are the packages I have installed in the environment:
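To make the label convention above concrete, here is a minimal sketch of how a label such as "NA247" (wild-type residue N, chain A, residue number 247) could be split and checked against the ATOM records of a PDB file. It is not MutateX's own parser: the function names parse_label, residues_in_pdb and check_label are hypothetical, and the label layout is assumed from the example in this thread. Note that "A247" alone does not fit this layout, because one of the two leading letters is missing.

```python
import re

# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

# Assumed label layout: residue letter, chain letter, residue number.
LABEL_RE = re.compile(r"^([A-Z])([A-Za-z])(-?\d+)$")


def parse_label(label):
    """Split a label such as 'NA247' into ('N', 'A', 247)."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized position label: {label!r}")
    residue, chain, number = match.groups()
    return residue, chain, int(number)


def residues_in_pdb(pdb_lines):
    """Map (chain, residue number) to residue name using ATOM records."""
    found = {}
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Fixed PDB columns: resName 18-20, chainID 22, resSeq 23-26.
            found[(line[21], int(line[22:26]))] = line[17:20].strip()
    return found


def check_label(label, pdb_residues):
    """Return True if the labelled residue is present in the PDB data."""
    residue, chain, number = parse_label(label)
    expected = ONE_TO_THREE.get(residue)
    return expected is not None and pdb_residues.get((chain, number)) == expected


# Illustrative ATOM record (made-up coordinates), not taken from the real file.
sample = ["ATOM   1900  CA  ASN A 247      11.000  22.000  33.000  1.00 20.00           C"]
print(check_label("NA247", residues_in_pdb(sample)))  # True
```

Running such a check over every line of the position list before calling ddg2excel would show whether a mismatch comes from the input files or from the tool itself.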

Package Version


Thank you kindly for your help and have a pleasant day

Paul

mtiberti commented 7 months ago

thanks for the clarification @PaulSchrank, are you using the latest version of MutateX from the master branch of the repository? I suspect not, since raise TypeError is on line 262 nowadays and there's no way line 252 could trigger a TypeError. If not, can you try installing the latest version and see if the error persists?

PS: you should probably install Biopython 1.78 because of #176 (that we are currently addressing in #183, but it's not merged as of yet)
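A quick way to see which versions are actually installed in the active environment is sketched below. The helper installed_version is hypothetical, and the distribution names "MutateX" (as it appears in the traceback) and "biopython" are assumptions about how the packages are registered. The pkg_resources fallback is there because importlib.metadata is not available on Python 3.6.

```python
def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    # Fallback for older interpreters such as Python 3.6.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


# Assumed distribution names; adjust if the packages are registered differently.
for name in ("MutateX", "biopython"):
    print(name, installed_version(name))
```

Note that a version number alone may not distinguish a released package from an install of the current master branch, so a fresh installation remains the more reliable test.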

PaulSchrank commented 7 months ago

Hey @mtiberti, I tried it with a fresh installation and it worked wonderfully! Thank you very much for the help and have a nice weekend! :D

Best regards

Paul

(P.S. the issue could be closed now!)

mtiberti commented 7 months ago

great to hear! Always happy when things work out ;)

and a great weekend to you