An example SQL query confirming the number of sequences:
-- One row per sequence accession: GROUP BY collapses multiple regions on the same sequence
SELECT CONCAT(t1.rfamseq_acc, '/', t1.seq_start, '-', t1.seq_end)
FROM full_region t1, rfamseq t2, taxonomy t3
WHERE t1.rfam_acc = 'RF01315'
AND t1.rfamseq_acc = t2.rfamseq_acc
AND t2.ncbi_id = t3.ncbi_id
AND t3.tax_string LIKE '%Clostridia;%'
AND t1.is_significant = 1
GROUP BY t1.rfamseq_acc;
This returns 64 rows, matching the sunburst UI. Note the GROUP BY clause, which leaves one row per sequence accession.
However, there are many more annotated regions:
-- One row per annotated region: no GROUP BY
SELECT CONCAT(t1.rfamseq_acc, '/', t1.seq_start, '-', t1.seq_end)
FROM full_region t1, rfamseq t2, taxonomy t3
WHERE t1.rfam_acc = 'RF01315'
AND t1.rfamseq_acc = t2.rfamseq_acc
AND t2.ncbi_id = t3.ncbi_id
AND t3.tax_string LIKE '%Clostridia;%'
AND t1.is_significant = 1;
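This returns 6222 rows, because without the GROUP BY clause every annotated region is listed separately.
So the number of entries in the resulting FASTA file is inconsistent with the number of sequences shown in the sunburst UI.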
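The gap between the two counts can be reproduced on a toy table. The following is only a sketch under stated assumptions: it uses an in-memory SQLite database instead of the Rfam MySQL server, it keeps only the `full_region` columns used above, it skips the joins to `rfamseq` and `taxonomy`, and the accessions and coordinates are made up for illustration.

```python
import sqlite3

# In-memory stand-in for the Rfam database (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE full_region (
    rfam_acc TEXT, rfamseq_acc TEXT,
    seq_start INTEGER, seq_end INTEGER, is_significant INTEGER
);
-- Hypothetical rows: two sequences, several regions on each.
INSERT INTO full_region VALUES
    ('RF01315', 'ACC0001.1',  10,  90, 1),
    ('RF01315', 'ACC0001.1', 500, 580, 1),
    ('RF01315', 'ACC0001.1', 900, 980, 1),
    ('RF01315', 'ACC0002.1',  40, 120, 1),
    ('RF01315', 'ACC0002.1', 300, 380, 0);
""")

# Every significant region is one row (what the query without GROUP BY lists).
regions = conn.execute(
    "SELECT COUNT(*) FROM full_region "
    "WHERE rfam_acc = ? AND is_significant = 1",
    ("RF01315",),
).fetchone()[0]

# Distinct sequence accessions (what grouping by rfamseq_acc counts).
sequences = conn.execute(
    "SELECT COUNT(DISTINCT rfamseq_acc) FROM full_region "
    "WHERE rfam_acc = ? AND is_significant = 1",
    ("RF01315",),
).fetchone()[0]

print(f"regions={regions} sequences={sequences}")
conn.close()
```

In this toy data four significant regions fall on two sequences, so a region-level export and a sequence-level count disagree in the same way as 6222 and 64 do above.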
Example
The species sunburst for Clostridia in RF01315 shows that there are 64 sequences.