lehtiolab / proteogenomics-analysis-workflow

IPAW: a Nextflow workflow for proteogenomics

removed max_target_seqs option in blastp process #12

Open husensofteng opened 4 years ago

husensofteng commented 4 years ago

Remove the -max_target_seqs 1 parameter from the blastp process. With it set, the algorithm can stop early and fail to find the best matches.

See the references here and here

https://github.com/lehtiolab/proteogenomics-analysis-workflow/blob/3472dcbaf8b3d20d41a7113519dff691c0a4b0b7/main.nf#L698
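Without -max_target_seqs 1, blastp can report several hits per query, so the top hit may have to be picked downstream. A minimal sketch of that step, assuming the results are written in the default tabular layout (-outfmt 6); the helper name best_hits and the sample rows are made up for illustration and are not part of the workflow:

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text):
    """Return {query_id: row} keeping the highest-bitscore hit per query.

    Hypothetical helper: assumes tab-separated -outfmt 6 text as input.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for fields in reader:
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Made-up example rows: two hits for the same query peptide.
sample = (
    "pep1\tprotA\t90.0\t10\t1\t0\t1\t10\t1\t10\t0.5\t20.1\n"
    "pep1\tprotB\t100.0\t10\t0\t0\t1\t10\t5\t14\t0.001\t25.4\n"
)
top = best_hits(sample)
print(top["pep1"]["sseqid"])
```

Ranking by bit score is one possible choice here; ranking by e-value would be an equally reasonable variant.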

husensofteng commented 4 years ago

Actually, it turns out that blastp can still report canonical sequences as not mapping. I have tested the following parameters, which seem to find the matches properly: -evalue 10000 -comp_based_stats 0 -ungapped
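For reference, a rough sketch of how a blastp call with these parameters could be assembled. The function name build_blastp_args, the file names, and the choice of -outfmt 6 are assumptions for illustration; the actual command in main.nf may differ:

```python
def build_blastp_args(query_fasta, database, out_path):
    """Assemble a blastp argument list using the parameters tested above.

    Hypothetical helper; -max_target_seqs is deliberately left out.
    """
    return [
        "blastp",
        "-query", query_fasta,      # peptide sequences to search (assumed FASTA)
        "-db", database,            # protein database built with makeblastdb
        "-out", out_path,
        "-outfmt", "6",             # tabular output, an assumption here
        "-evalue", "10000",         # very permissive e-value threshold
        "-comp_based_stats", "0",   # no composition-based statistics
        "-ungapped",                # ungapped alignments only
    ]


# Placeholder file names, not taken from the workflow.
args = build_blastp_args("peptides.fa", "canonical_proteins", "blast_hits.tsv")
print(" ".join(args))
```

The list can be handed to subprocess.run(args, check=True) on a machine where BLAST+ is installed.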