Closed: pierrepo closed this 3 months ago
Sorry @KarinDuong, but why are you writing a FASTA file? Shouldn't you just read the FASTA file?
@pierrepo Yes, I read a FASTA file to collect all the sequences, then rewrite a FASTA file that puts the PDB ID on the line beginning with ">" for each sequence.
OK @KarinDuong, no need to rewrite the FASTA file with the PDB ID.
@pierrepo So how should we store these PDB IDs? And how will we know which sequence each one relates to?
As discussed last time, we might have multiple PDB structures that match the target sequence. So it's better to just print something like this in the terminal:
FASTA sequence 1: <20-first-amino-acids> Putative structure:
Putative PDB IDs: <5 or 10 best PDB IDs with best match>
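A minimal sketch of how that terminal output could be produced, assuming the matching step already returns the PDB IDs sorted from best to worst match. The function name `format_matches`, its parameters, and the example sequences and PDB IDs are hypothetical placeholders, not code from the repository. The "Putative structure:" field is left out because the thread does not say what it should contain.

```python
def format_matches(index, sequence, pdb_ids, prefix_len=20, max_ids=5):
    """Return the terminal summary for one FASTA sequence.

    index      -- 1-based position of the sequence in the FASTA file
    sequence   -- amino-acid sequence as a plain string
    pdb_ids    -- candidate PDB IDs, assumed sorted best match first
    prefix_len -- how many leading residues to show
    max_ids    -- how many PDB IDs to report (the thread suggests 5 or 10)
    """
    prefix = sequence[:prefix_len]
    best = ", ".join(pdb_ids[:max_ids]) if pdb_ids else "none found"
    return (
        f"FASTA sequence {index}: {prefix}\n"
        f"Putative PDB IDs: {best}"
    )


if __name__ == "__main__":
    # Placeholder data for illustration only; not real search results.
    matches = [
        ("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", ["1ABC", "2XYZ"]),
        ("GSHMAAAA", []),
    ]
    for i, (seq, ids) in enumerate(matches, start=1):
        print(format_matches(i, seq, ids))
```

Keeping the IDs in memory next to their sequence avoids rewriting the FASTA file just to record the association.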
https://github.com/pierrepo/grodecoder/blob/e4e072758297926ea5a6a04b1ae9050a66e31b92/fasta_seq_into_PDBID.py#L6
Could you use Biopython to read the FASTA file?
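For reference, a minimal sketch of reading a FASTA file with Biopython's `SeqIO.parse`, assuming Biopython is installed. The helper name `read_fasta` and the inline example records are hypothetical, and this is not the code at the linked line.

```python
from io import StringIO

from Bio import SeqIO  # Biopython; install with `pip install biopython`


def read_fasta(handle):
    """Return a list of (record id, sequence string) pairs.

    `handle` may be an open text handle or a path to a FASTA file.
    """
    return [
        (record.id, str(record.seq))
        for record in SeqIO.parse(handle, "fasta")
    ]


if __name__ == "__main__":
    # Inline placeholder FASTA content so the sketch runs without a file on disk.
    example = StringIO(">seq1 first example\nMKTAYIAK\nQRQISFVK\n>seq2\nGSHM\n")
    for name, seq in read_fasta(example):
        print(name, len(seq), seq[:20])
```

`SeqIO.parse` joins multi-line sequences and takes the record ID from the text after ">" up to the first whitespace, so no manual handling of header lines is needed.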