widdowquinn / Teaching-IBioIC-Intro-to-Bioinformatics

Course materials for the IBioIC Introduction to Bioinformatics training course
https://widdowquinn.github.io/Teaching-IBioIC-Intro-to-Bioinformatics/

Calling QBLAST with sequence vs FASTA entry? #55

Open peterjc opened 7 years ago

peterjc commented 7 years ago

Currently 02-sequence_databases/04-programming_for_web_blast.ipynb and 03-lipases/XX-runthrough.ipynb use this pattern for calling QBLAST:

result_handle = NCBIWWW.qblast("blastx", "pdbaa", record.format("fasta"), format_type="Text")

Rather than a full FASTA record with an identifier etc., you can just pass the bare sequence (BLAST will assign it a default query name):

result_handle = NCBIWWW.qblast("blastx", "pdbaa", record.seq, format_type="Text")

Is this slight simplification of the code worthwhile? Since all of these calls seem to use text output, it looks harmless.

widdowquinn commented 7 years ago

Yes. I think the .format syntax and str() syntax are equally complicated for the newcomer, but the .format() implies other formats are possible. I think it's kind of eeksy-peeksy about which we use. The suggested replacement is simpler, which has advantages.

peterjc commented 7 years ago

You don't actually need to do str(record.seq) here, record.seq alone works fine.
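
For anyone comparing the two options, here is a minimal sketch of what each query payload looks like. It runs without Biopython or network access. `fasta_text`, the identifier, the description and the sequence are all hypothetical: the helper is only a stand-in for the kind of text `record.format("fasta")` produces, and the bare string stands in for `record.seq`.

```python
import textwrap


def fasta_text(identifier, description, sequence, width=60):
    """Hypothetical stand-in for record.format("fasta"): header line plus wrapped sequence."""
    header = f">{identifier} {description}".rstrip()
    body = "\n".join(textwrap.wrap(sequence, width))
    return f"{header}\n{body}\n"


# Made-up nucleotide sequence, for illustration only.
sequence = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Option 1: full FASTA entry, so the query carries its own identifier.
fasta_query = fasta_text("example_id", "hypothetical record", sequence)

# Option 2: bare sequence, so BLAST assigns a default query name.
raw_query = sequence

print(fasta_query)
print(raw_query)
```

The only difference is the header line: the FASTA form keeps the record's identifier in the BLAST report, while the bare form leaves the query naming to BLAST.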