Closed by apascualgarcia 4 years ago
When you say each OTU has at least 100 reads across all samples, do you mean the ~4500 OTUs or the ~400 OTUs? Is that 100 reads per observation/sample, or 100 reads total?
In general, the ability to reconstruct a trajectory depends on how sparse the data are. The larger the proportion of zero entries, the less information is available to reconstruct a trajectory. In the worst case, an OTU is never observed (all counts are zero), and there is no information at all. I would expect the method to work at intermediate levels of sparsity (e.g. fewer than 30-40% zeros), but to have difficulty at high levels of sparsity (e.g. only 5-10% non-zero entries).
Thanks for the fast answer,
Each of the ~4500 OTUs (actually exact sequence variants) has at least 100 reads in total across all samples. The table has 70% zeros, so sparsity may well be the cause:
Num samples: 36
Num observations: 4.438
Total count: 9.891.726
Table density (fraction of non-zero values): 0.296
Using OTUs (97% SI) instead substantially increases the density, and then it works:
Num samples: 36
Num observations: 452
Total count: 7.933.920
Table density (fraction of non-zero values): 0.586
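For reference, the "Table density" figure in these summaries is the fraction of non-zero cells in the count table, so the fraction of zeros is one minus that value. A minimal sketch of the calculation in plain Python follows; the function name `table_density` and the toy table are made up for illustration and are not part of any tool mentioned in this thread.

```python
def table_density(table):
    """Return the fraction of non-zero cells in a count table.

    `table` is a list of rows (one per OTU), each a list of per-sample counts.
    """
    total_cells = sum(len(row) for row in table)
    if total_cells == 0:
        raise ValueError("empty table")
    nonzero_cells = sum(1 for row in table for count in row if count != 0)
    return nonzero_cells / total_cells


# Toy table: 3 OTUs (rows) x 4 samples (columns); the values are invented.
toy_table = [
    [120, 0, 0, 35],
    [0, 0, 0, 100],
    [10, 20, 30, 40],
]

density = table_density(toy_table)
zero_fraction = 1.0 - density
print(f"density={density:.3f}, zeros={zero_fraction:.3f}")
```

By the same arithmetic, the reported densities of 0.296 and 0.586 correspond to roughly 70% and 41% zeros.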
Perhaps it would be worth documenting this limitation? I am happy to send you both tables if it is of any help.
Thanks for sending the tables.
I added a warning message about data sparsity. While it doesn't resolve the error, it should make it easier to diagnose.
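A check of this kind could look roughly like the sketch below. It is not the code that was added to the package: the function name `warn_if_sparse` and the 0.5 threshold are assumptions chosen purely for illustration.

```python
import warnings


def warn_if_sparse(table, max_zero_fraction=0.5):
    """Warn when the share of zero counts exceeds `max_zero_fraction`.

    `table` is a list of rows of counts. The 0.5 default is an arbitrary
    illustrative threshold, not a value taken from the package.
    Returns the observed zero fraction.
    """
    cells = [count for row in table for count in row]
    if not cells:
        raise ValueError("empty table")
    zero_fraction = sum(1 for count in cells if count == 0) / len(cells)
    if zero_fraction > max_zero_fraction:
        warnings.warn(
            f"{zero_fraction:.0%} of counts are zero; "
            "estimates may be unreliable or fail for sparse data.",
            UserWarning,
        )
    return zero_fraction
```

Returning the zero fraction lets the caller log it or apply a stricter cutoff of their own.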
Thanks for taking the time to identify the problem!
Dear @tyjo,
I am trying to estimate the relative abundances for a large dataset (~4500 OTUs) and I get the following error:
Aggregating these OTUs into higher taxa (~400) runs properly. Is there a limitation on the number of OTUs or on the table's sparseness? In the example, all OTUs have at least 100 reads across all samples (36).
Cheers