twopin / CAMP

predicting peptide-protein interactions

Questions about creating "query_peptide.fasta" and "target_peptide.fasta" #27

Closed AnthonyYao7 closed 1 year ago

AnthonyYao7 commented 1 year ago

Hello!

I am trying to run step2_pepbdb_pep_bindingsites.py but I am having a few problems with the inputs. I have successfully run crawl.py and have crawl_results.csv. However, I am confused about how to generate query_peptide.fasta and target_peptide.fasta. I understand that odd-numbered lines should have information about the peptide and even-numbered lines should have the sequence. What information should be included for each peptide?

Also, am I correct in saying that crawl.py should be run first, followed by query_mapping.py, then pyssw.py, then target_mapping.py?

Finally, is this the correct link to download the PepBDB database: http://huanglab.phys.hust.edu.cn/pepbdb/db/download/

Thank you for your help!

twopin commented 1 year ago

You can use a few Python commands to convert the sequence files into FASTA format: each record has a header line that starts with '>' followed by the sequence name, and the next line is the amino acid sequence.
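A minimal sketch of that conversion, assuming the peptide names and sequences are already available as (name, sequence) pairs. The helper name to_fasta and the way the pairs are obtained are illustrative assumptions, not part of the CAMP scripts; the column layout of crawl_results.csv is not shown in this thread, so loading the pairs from it is left out.

```python
def to_fasta(records):
    """Format (name, sequence) pairs as two-line FASTA records.

    Hypothetical helper for illustration: each record becomes a header
    line starting with '>' followed by one line holding the sequence.
    """
    lines = []
    for name, sequence in records:
        lines.append(">" + name.strip())
        lines.append(sequence.strip().upper())
    # End with a newline so the last record is a complete line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example pair taken from this thread; the output file name follows the question.
    peptides = [("1qkz_P", "ANGGASGQVK")]
    with open("query_peptide.fasta", "w") as handle:
        handle.write(to_fasta(peptides))
```

With this layout, odd-numbered lines hold the '>' header and even-numbered lines hold the sequence.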

AnthonyYao7 commented 1 year ago

Hi, thanks for your response! I did that. An example would be:

1qkz_P ANGGASGQVK

I ran this through pyssw.py using the command listed in the README, and the output file has this structure:

target_name: 1qkz_P
query_name: 5wrl_P
optimal_alignment_score: 5
strand: +
target_end: 6
query_end: 6

However, when I run this through target_mapping.py, it fails to run. After modifying the code to run, the function get_result_dict returns an empty dictionary. Is the output of pyssw.py in the wrong format?
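One way to inspect the file independently of target_mapping.py is to parse the "key: value" fields into one dictionary per alignment. The sketch below is not the repository's get_result_dict; parse_alignment_report is a hypothetical helper that assumes every record starts with a target_name field and that values contain no whitespace, as in the excerpt above.

```python
import re

# Matches "key: value" pairs such as "target_name: 1qkz_P" or "strand: +".
FIELD = re.compile(r"(\S+):\s*(\S+)")


def parse_alignment_report(text):
    """Group "key: value" fields into one dict per alignment record.

    Hypothetical helper: a new record is assumed to begin at each
    "target_name" field. All values are kept as strings.
    """
    records = []
    current = None
    for key, value in FIELD.findall(text):
        if key == "target_name":
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


if __name__ == "__main__":
    sample = (
        "target_name: 1qkz_P query_name: 5wrl_P optimal_alignment_score: 5\n"
        "strand: +\n"
        "target_end: 6\n"
        "query_end: 6\n"
    )
    for record in parse_alignment_report(sample):
        print(record)
```

If records come back from a sketch like this while get_result_dict stays empty, comparing the field names and separators seen here with what that function looks for may help narrow down the mismatch.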

a30221274 commented 5 months ago

Hello

I've encountered a similar issue after running pyssw.py. May I ask how you managed to resolve it?