Closed maruiqi0710 closed 9 months ago
Hello,
The --allow-same-species parameter determines whether GX reports alignments to sequences in the GX database from organisms with the same tax-id as the one supplied by the user. It is turned on by default, and for most use cases it should be left alone.
If you set --allow-same-species=F, one potential effect is that more contaminants get reported, particularly when the database has poor taxonomic representation close to the source genome. The setting can still be useful when you suspect that the database sequences themselves might be contaminated.
You can also achieve the same effect by setting the environment variable GX_ALIGN_EXCLUDE_TAXA=tax-id, where tax-id is the taxonomic identifier corresponding to the source genome (see: https://github.com/ncbi/fcs/wiki/FCS-GX#environment-variables).
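As a rough illustration of the two approaches, here is a minimal Python sketch that only assembles the command line and environment, without running anything. The --allow-same-species=F flag and the GX_ALIGN_EXCLUDE_TAXA variable come from this thread. Everything else is an assumption for illustration: the `python3 fcs.py screen genome` invocation, the --fasta/--out-dir/--gx-db/--tax-id arguments, the helper names, and the placeholder paths and tax-id. How the environment variable actually reaches GX depends on how you launch it (for example through a container), so check the wiki page linked above for your setup.

```python
import os


def build_screen_command(fasta, out_dir, gx_db, tax_id, allow_same_species=True):
    """Assemble an argv list for a GX genome screen (hypothetical helper).

    The base invocation and the --fasta/--out-dir/--gx-db/--tax-id arguments
    are assumptions; only --allow-same-species=F is taken from the thread.
    """
    cmd = [
        "python3", "fcs.py", "screen", "genome",
        "--fasta", fasta,
        "--out-dir", out_dir,
        "--gx-db", gx_db,
        "--tax-id", str(tax_id),
    ]
    if not allow_same_species:
        # Default is on, so the flag is only added when turning it off.
        cmd.append("--allow-same-species=F")
    return cmd


def exclude_taxa_env(tax_id, base_env=None):
    """Return a copy of the environment with GX_ALIGN_EXCLUDE_TAXA set.

    Hypothetical helper: it only prepares the mapping; passing it on to GX
    is left to the caller.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GX_ALIGN_EXCLUDE_TAXA"] = str(tax_id)
    return env


if __name__ == "__main__":
    # Placeholder inputs; substitute your own genome, database, and tax-id.
    print(" ".join(build_screen_command(
        "genome.fa", "gx_out", "gxdb", 6973, allow_same_species=False)))
    print(exclude_taxa_env(6973, base_env={}))
```

Either route should exclude same-taxon database hits as described above. Use one or the other, not both.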
Closing. Please re-open if additional assistance is needed.
I noticed that there is a --allow-same-species option in fcs gx screen genome. What is the effect of this option? Will it lead to stricter screening conditions (more sequences are assumed to be contaminants) or looser screening conditions (fewer sequences are assumed to be contaminants)?