bdaisley / isolateR

Automated processing of Sanger sequencing data, taxonomic profiling, and generation of microbial strain libraries

using a custom db #8

Closed fwhelan closed 2 months ago

fwhelan commented 2 months ago

Hello. I'm curious to try this with the SILVA 138.1 SSU Ref database. Can you please give me a bit more information about what is required in the header of each sequence? At the moment, I have gotten it to work, but with an offset: the Class information is showing in the Phylum column. Thank you!

bdaisley commented 2 months ago

Hi @fwhelan - the FASTA header for custom databases should be formatted as: "Accession_no;d__Domain;p__Phylum;c__Class;o__Order;f__Family;g__Genus;s__Species"
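For illustration, here is a minimal base-R sketch of assembling one header in that layout. It assumes every rank takes a one-letter prefix followed by a double underscore, as in s__Species. The helper name `make_custom_header`, the accession `ACC0001`, and the underscore inside the species name are placeholders chosen for this example, not part of isolateR.

```r
# Assumed rank prefixes: one letter plus a double underscore, as in s__Species.
rank_prefixes <- c("d__", "p__", "c__", "o__", "f__", "g__", "s__")

# Hypothetical helper (not an isolateR function): join an accession and seven
# rank names into a single semicolon-delimited header string.
make_custom_header <- function(accession, ranks) {
  stopifnot(length(ranks) == length(rank_prefixes))
  paste(c(accession, paste0(rank_prefixes, ranks)), collapse = ";")
}

# "ACC0001" is a placeholder accession; the species spelling is an assumption.
header <- make_custom_header(
  "ACC0001",
  c("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
    "Lactobacillaceae", "Lactobacillus", "Lactobacillus_acidophilus")
)
print(header)
```

In a FASTA file this string would follow the leading ">" on the header line.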

To get an example file for reference, you can run the following, which builds a custom-style database from the archaeal reference sequences on NCBI:

get_db(db="16S_arc", add_taxonomy=TRUE)

We're working on more detailed instructions and tutorials for all the functions, including the use of custom databases. These should be available within the next few weeks.

However, let me know if this helps solve your question in the meantime.
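A column offset like the one described in the question may come from headers with a missing or extra field, although that cause is a guess rather than something confirmed in this thread. The base-R sketch below flags headers that do not match the accession-plus-seven-ranks layout. The function name `check_headers`, the prefix list, and the example records are all hypothetical.

```r
# Assumed layout: accession followed by seven prefixed ranks, in this order.
expected_prefixes <- c("d__", "p__", "c__", "o__", "f__", "g__", "s__")

# Hypothetical checker (not an isolateR function): return the headers whose
# fields deviate from the assumed layout.
check_headers <- function(fasta_lines) {
  # Keep only header lines and drop the leading ">".
  headers <- sub("^>", "", fasta_lines[startsWith(fasta_lines, ">")])
  ok <- vapply(strsplit(headers, ";", fixed = TRUE), function(fields) {
    # Eight fields in total, and each rank field carries its expected prefix.
    length(fields) == 8 && all(startsWith(fields[-1], expected_prefixes))
  }, logical(1))
  headers[!ok]
}

# Made-up records: the second header lacks a phylum field.
example <- c(
  ">ACC0001;d__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus;s__Lactobacillus_acidophilus",
  "ACGT",
  ">ACC0002;d__Bacteria;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus;s__Lactobacillus_acidophilus",
  "ACGT"
)
bad <- check_headers(example)
print(bad)
```

For a real file, the same check could be run on the output of `readLines()` for the custom database FASTA.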

fwhelan commented 2 months ago

Thanks, bdaisley! That should be enough for me to go on! I'll try it out in the next few days and let you know if I run into any issues.

fwhelan commented 1 month ago

Worked a treat, thanks again!