Problems in the STEP 2: Generate labels of peptide binding residues

twopin / CAMP

predicting peptide-protein interactions

Problems in the STEP 2: Generate labels of peptide binding residues #8

Alyssa-vv closed this issue 1 year ago

Alyssa-vv commented 2 years ago

Can anyone help with step 2?

How do I generate the file './pdb/peptide-mapping.txt'? I cannot find "query_mapping.py" or "target_mapping.py".

#python query_mapping.py #to get peptide sequence vectors (the output is "peptide-mapping.txt")
#python target_mapping.py #to get target sequence vector

Thank you!

twopin commented 2 years ago

Hi, thank you for your interest in our work. I have now uploaded these two scripts.

liuzhelz commented 2 years ago

Hi @twopin @Alyssa-vv, I am having trouble with data_prepare/step_2: I cannot find crawl_results.csv, which the script data_prepare/query_mapping.py reads. Could you give me some help?

Thank you!

twopin commented 2 years ago

Sorry, I did not save the intermediate file, but you can use your own data. The important part of the script begins at line 49. You can adjust the script to match your own data format.

liuzhelz commented 2 years ago

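import csv

querys = []        # query IDs in file order (presumably defined earlier in the script)
residue_dict = {}  # query ID -> binding-residue string
seq_dict = {}      # query ID -> sequence string

with open('crawl_results.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for item in reader:
        if reader.line_num == 1:  # skip the header row
            continue
        qid = item[0]
        querys.append(qid)
        # Each field is split on ': ' and the part after it is kept.
        pep_index = item[1].split(': ')  # prot_index = item[3].split(': ')
        residue_dict[qid] = pep_index[1]
        seq_dict[qid] = item[2].split(': ')[1]  # seq_dict[qid] = item[4].split(': ')[1]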
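
Judging only from how the snippet above indexes each row, crawl_results.csv appears to need a header line, an ID in column 0, and fields of the form "label: value" in columns 1 and 2. The sketch below parses a small in-memory sample laid out that way. The sample rows, the column labels, and the function name parse_crawl_results are all invented for illustration; they are not taken from the CAMP repository or from the real output of crawl.py.

```python
import csv
import io

# Hypothetical file contents: IDs, labels, and values are made up for illustration.
SAMPLE = (
    "id,peptide_residues,peptide_sequence\n"
    '1abc_B,"Peptide Binding Residues: 2, 3, 5",Peptide Sequence: ACDEFGH\n'
    '2xyz_C,"Peptide Binding Residues: 1, 4",Peptide Sequence: KLMNP\n'
)


def parse_crawl_results(handle):
    """Mirror the quoted loop: keep the text after ': ' in columns 1 and 2."""
    querys = []
    residue_dict = {}
    seq_dict = {}
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    for item in reader:
        qid = item[0]
        querys.append(qid)
        residue_dict[qid] = item[1].split(': ')[1]
        seq_dict[qid] = item[2].split(': ')[1]
    return querys, residue_dict, seq_dict


querys, residue_dict, seq_dict = parse_crawl_results(io.StringIO(SAMPLE))
print(querys)
print(residue_dict)
print(seq_dict)
```

If your own data uses a different layout, only the column indices and the ': ' split would need to change.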

Hi @twopin, thank you for your reply. I now have the 'interacted peptide protein pairs from PDB', 'PLIPresults', 'pdbid all fasta' and 'pdb pep_ Chain' files generated in step_1, plus the PepBDB data set mentioned in the paper, but I don't know which data should be used to generate the 'crawl_results.csv' file. What is the purpose of this file?

Thank you!

NingNing-C commented 2 years ago

Hi @liuzhelz,

I have the same problem. Did you solve it?

Many thanks!

twopin commented 2 years ago

Hi @liuzhelz @NingNing-C Sorry that I didn't reply earlier. I've uploaded a script called crawl.py to generate such results.

NingNing-C commented 2 years ago

Hi @twopin, thanks so much for your quick reply, but I can't find the script crawl.py. Could you please check it?