Open johanneswerner opened 3 years ago
Hi @johanneswerner,
the option to generate an ARB file on the fly was meant to allow people unfamiliar with ARB to quickly generate a file SINA can use as a reference. The FASTA file is parsed as `>$ID $DESCRIPTION`, with `$ID` mapped to `acc` and `$DESCRIPTION` mapped to `full_name`. That the SILVA FASTA files have `$DESCRIPTION == tax_slv` is just happenstance, and nothing SINA would know. Allowing people to customise this is a bit beyond what SINA is meant to do.
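To make the mapping concrete, here is a minimal Python sketch of the header split described above. It is an illustration only, not SINA's actual code: the function name `parse_fasta_header` and the example header are made up, and splitting at the first run of whitespace is an assumption about how `$ID` and `$DESCRIPTION` are separated.

```python
def parse_fasta_header(line):
    """Split a FASTA header of the form '>$ID $DESCRIPTION'.

    Illustrative only: mirrors the mapping described in this thread
    ($ID -> acc, $DESCRIPTION -> full_name), not SINA's real parser.
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Assumption: the ID ends at the first run of whitespace;
    # everything after it is the description.
    parts = line[1:].strip().split(None, 1)
    acc = parts[0] if parts else ""
    full_name = parts[1] if len(parts) > 1 else ""
    return {"acc": acc, "full_name": full_name}


# Hypothetical SILVA-style header: the description happens to be a taxonomy path.
example = ">SEQ0001.1.1500 Bacteria;Proteobacteria;Gammaproteobacteria;Escherichia coli"
fields = parse_fasta_header(example)
print(fields["acc"])
print(fields["full_name"])
```

Under this reading, the taxonomy path of a SILVA-style header ends up in `full_name`, and no `tax_slv` field is ever created.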
So in answer to 1: To create a custom ARB database, use ARB. You can start from a FASTA and import any fields you might like, split/copy parts of the FASTA header as needed, even add your own "import filter" to parse your type of FASTA header correctly.
In answer to 2: I don't know. Try with `--copy-fields full_name` to see what the original path was. Since it works with the SILVA database, but does not work with your custom database, it must be the format of the field. Feel free to post a (small) example ARB database here, and I'll have a look at whether there is something that could be improved on SINA's side without impacting other use cases.
This is probably not a bug, and it may well be documented somewhere, but I could not find any information about it.
I built a custom ARB database of a subset of the sequences from SILVA release 132 with the following command (after uncompressing):
As a result, the created ARB database has no taxonomy fields.
custom ARB database
official ARB database
I wanted to classify my sequences afterwards with the custom database, but since the field `tax_slv` does not exist, this results in an empty file. However, if I choose `full_name` as LCA field, I get results but I do not get the entire taxonomic path.

This is the result for one entry with `tax_slv` (and other `tax_*` fields) with the official ARB database, and here the same entry with the `full_name` field of the custom database.

I have no idea why the taxonomy looks so different, but what surprises me more is that there is no taxonomic path here.
So, long introduction, my question is:
`full_name` as field in my custom ARB database?

Thank you very much for your help!