czbiohub-sf / nf-predictorthologs

*de novo* orthologous gene predictions from bam + bed or fasta/fastq data
MIT License

Potentially use kaamer instead of diamond #32

Open olgabot opened 4 years ago

olgabot commented 4 years ago

Identification of proteins is one of the most computationally intensive steps in genomics studies. It usually relies on aligners that don’t accommodate rich information on proteins and require additional pipelining steps for protein identification. We introduce kAAmer, a protein database engine based on amino-acid k-mers, that supports fast identification of proteins with complementary annotations. Moreover, the databases can be hosted and queried remotely.
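For anyone weighing this against diamond, here is a rough sketch of the general idea behind amino-acid k-mer lookup: index each database protein by its k-mers, then rank candidates by how many k-mers they share with the query. This is not kAAmer's actual implementation or API. The function names, the toy sequences, and the choice of `k=7` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(proteins, k):
    """Map each k-mer to the set of protein IDs that contain it."""
    index = defaultdict(set)
    for protein_id, seq in proteins.items():
        for kmer in kmers(seq, k):
            index[kmer].add(protein_id)
    return index


def query(index, seq, k):
    """Count, per protein, how many distinct query k-mers it shares."""
    hits = Counter()
    for kmer in set(kmers(seq, k)):
        # .get avoids inserting empty entries into the defaultdict
        for protein_id in index.get(kmer, ()):
            hits[protein_id] += 1
    return hits.most_common()


# Toy sequences, made up for this example (hypothetical data)
proteins = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protB": "MSDNELLKKAIEEAGYDVST",
}
index = build_index(proteins, k=7)
best = query(index, "AYIAKQRQISF", k=7)
print(best)
```

Lookup here is exact-match only, so no alignment is computed. Any comparison with diamond in this pipeline would still need benchmarking on real data.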

olgabot commented 4 years ago

Thanks @bluegenes !