lemaslab / CAMP

predicting peptide-protein interactions


step2- Debugging #10

Open dlemas opened 1 year ago

dlemas commented 1 year ago

Our goal is to debug step2 (https://github.com/lemaslab/CAMP/blob/master/data_prepare/step2_pepBDB_pep_bindingsites.py).

Step 1: Using the "PDB ID-Peptide Chain-Protein Chain" pairs obtained in "step1_pdb_process.py", retrieve the interaction information with the following fields:

Input File: pdb_pairs

Do we need to subset or modify this file, or can we use the step1 output as-is?
@AnthonyYao7

("Peptide ID", "Interacting peptide residues", "Peptide sequence", "Interacting receptor residues", "Receptor sequence(s)")

and download the corresponding "peptide.pdb" files (please put them under the directory below).

Directory: data_prepare/step2/pepbdb-2020/pepbdb/$pdb_id$/peptide.pdb

Please create the directory structure above, modify .gitignore, and send us the zip file via UF file transfer.
@evanhadam
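
A minimal sketch of how the directory layout above could be created and checked. This is not code from the repo: the helper names `peptide_pdb_path` and `make_pepbdb_dirs` are hypothetical, and the identifiers passed in stand for whatever replaces `$pdb_id$`.

```python
from pathlib import Path

# Root taken from the layout in this issue:
# data_prepare/step2/pepbdb-2020/pepbdb/$pdb_id$/peptide.pdb
PEPBDB_ROOT = Path("data_prepare/step2/pepbdb-2020/pepbdb")


def peptide_pdb_path(pdb_id, root=PEPBDB_ROOT):
    """Return the expected location of peptide.pdb for one entry (hypothetical helper)."""
    return Path(root) / pdb_id / "peptide.pdb"


def make_pepbdb_dirs(pdb_ids, root=PEPBDB_ROOT):
    """Create one sub-directory per entry; return the IDs whose peptide.pdb is still missing."""
    missing = []
    for pdb_id in pdb_ids:
        target = peptide_pdb_path(pdb_id, root)
        # Create <root>/<pdb_id>/ if it does not exist yet.
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.is_file():
            missing.append(pdb_id)
    return missing
```

Calling `make_pepbdb_dirs(ids)` after the downloads would return the entries that still need a "peptide.pdb" file, which may help before zipping the directory.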

Step 2: Map the peptide sequences from PepBDB to the peptide sequences from the RCSB PDB (generated in "step1_pdb_process.py").

Code is located at: https://github.com/lemaslab/CAMP/tree/PLIP/cluster/smith-waterman-src
What is the input?
How do we compile and run it?

Generate the query (PepBDB version) sequence file called "query_peptide.fasta" and the target (RCSB PDB version) sequence file called "target_peptide.fasta" for the peptides.
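
A rough sketch of writing the two FASTA files, assuming the sequences are already available as (identifier, sequence) pairs. The function name `write_fasta` is hypothetical, and the header format that the smith-waterman-src scripts expect has not been confirmed, so the plain `>identifier` header here is an assumption.

```python
def write_fasta(records, path):
    """Write (identifier, sequence) pairs to `path`, one FASTA record per pair."""
    with open(path, "w") as handle:
        for seq_id, sequence in records:
            # Header line starts with '>', the sequence follows on the next line.
            handle.write(f">{seq_id}\n{sequence}\n")
```

The same function would be called twice, once with the PepBDB sequences for "query_peptide.fasta" and once with the RCSB PDB sequences for "target_peptide.fasta".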

We use the scripts under ./smith-waterman-src/ to align the two versions of the peptide sequences. The output is "alignment_result.txt".
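
For anyone unfamiliar with the alignment step, here is a small pure-Python sketch of Smith-Waterman local alignment. It is not the implementation under ./smith-waterman-src/. The scoring values (match 2, mismatch -1, linear gap -1) are assumptions chosen for illustration; the repo scripts may use a substitution matrix and different gap penalties.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_query, aligned_target) for the best local alignment."""
    rows, cols = len(query) + 1, len(target) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in target
                score[i][j - 1] + gap,      # gap in query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    aligned_q, aligned_t = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if query[i - 1] == target[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            aligned_q.append(query[i - 1])
            aligned_t.append(target[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_t.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_t.append(target[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_t))
```

With these assumed scores, a PepBDB peptide that is an exact substring of the RCSB PDB peptide aligns without gaps, which is the case the mapping step relies on most often.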

python query_mapping.py  # to get the peptide sequence vectors (the output is "peptide-mapping.txt")

python target_mapping.py  # to get the target sequence vectors