Open IvanUgrin-Genalytics opened 1 week ago
this should work
The problem is that these are nucleotide gene annotations from the SILVA database. I suppose BioSAK expects annotated protein sequences, but I have annotated nucleotide sequences. The main question is: is it possible to translate the nucleotide sequences to protein sequences and annotate them for use with BioSAK, and do you know of a pipeline that does that? @songweizhi
Hello. I am trying to analyze a DNA sequence FASTA file for COGs. Is it possible to use a FASTA file for that purpose? Each sequence is labeled with its NCBI ID.
Example command:
BioSAK COG2020 -m N -t 6 -db_dir ./COG_db_dir -i input.ffn
Example input format of the FASTA file:
>AB679109.1
GTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGATCAGTCAGTCTGTCTTAAAAGTTCGGGGCTTAACCCCGTGATGGGATGGAAACTGCTGATCTAGAGTATCGGAGAGGAAAGTGGAATTCCTAGTGTAGCGGTGAAATGCGTAGATATTAGGAAGAACACCAGTGGCGAAGGCGACTTTCTGGACGAAAACTGACGCTGAGGCGCGAAAGCCAGGGGAGCGAACGGGATTAGAAACCCCAGTAGTCC
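To make the translation question concrete, here is a minimal sketch of what a nucleotide-to-protein step could look like: a six-frame translation using the standard genetic code (NCBI translation table 1). This is not part of BioSAK; the function names (`parse_fasta`, `reverse_complement`, `translate`, `six_frame`) are hypothetical and written only for illustration. One caveat: SILVA is a ribosomal RNA database, and rRNA genes are not protein-coding, so translating such sequences may not yield meaningful proteins for COG annotation.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Translate all three frames on both strands, keyed '+1'..'+3', '-1'..'-3'."""
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

A file such as `input.ffn` could be processed with `for header, seq in parse_fasta(open("input.ffn")): frames = six_frame(seq)`. For real data, a dedicated gene caller or Biopython's `Seq.translate()` is probably a better fit than this hand-rolled sketch, since picking the correct reading frame is the hard part.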