Open nick-youngblut opened 3 years ago
Thanks for reporting, @nick-youngblut. I think this error might originate in the biom package. Would you mind running this bit of Python code (using the offending data) to see if you can recreate it in pure biom?
import qiime2
import biom
artifact = qiime2.Artifact.load('table.qza')
table = artifact.view(biom.Table)
table.subsample(500000, axis='sample', by_id=False, with_replacement=True)
I can't seem to reproduce the error, so it appears to occur rarely.
Thanks @nick-youngblut. I don't think this issue can be resolved in this QIIME 2 plugin, since the rarefy method just wraps biom. I'll keep this open for now in case you find a more reliable test case. Thanks!
I ran into the issue again, and I was able to confirm that the issue is caused by biom:
Python 3.6.10 | packaged by conda-forge | (default, Apr 24 2020, 16:42:08)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import qiime2
>>> import biom
>>> artifact = qiime2.Artifact.load('otu.qza')
>>> table = artifact.view(biom.Table)
>>> table.subsample(500000, axis='sample', by_id=False, with_replacement=True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/ebio/abt3_projects/test_project/bin/llmgp/.snakemake/conda/5b653c1a/lib/python3.6/site-packages/biom/table.py", line 2824, in subsample
_subsample(data, n, with_replacement)
File "biom/_subsample.pyx", line 53, in biom._subsample._subsample
File "mtrand.pyx", line 4214, in numpy.random.mtrand.RandomState.multinomial
ValueError: sum(pvals[:-1]) > 1.0
I guess that I should post the issue on https://github.com/biocore/biom-format
See the link above. @nick-youngblut, is it possible that you had fractional values in the biom table? Rounding to ints seems to have resolved this issue.
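A minimal sketch of the check suggested here, assuming the table's values are available as a NumPy array (for example, a dense copy of the table data). The helper names `has_fractional_values` and `round_to_counts` and the example matrix are hypothetical; they are not part of biom or QIIME 2.

```python
import numpy as np


def has_fractional_values(counts):
    """Return True if any entry is not a whole number."""
    counts = np.asarray(counts, dtype=float)
    return bool(np.any(counts != np.round(counts)))


def round_to_counts(counts):
    """Round every entry to the nearest integer and return an integer array."""
    return np.rint(np.asarray(counts, dtype=float)).astype(np.int64)


# Hypothetical data: rows are features, columns are samples.
example = np.array([[10.0, 3.5],
                    [0.0, 7.25],
                    [4.0, 1.0]])

print(has_fractional_values(example))
print(has_fractional_values(round_to_counts(example)))
```

Running a check like this before rarefying would show whether the table holds non-integer values, which is the condition suspected of triggering the error.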
Thank you, @mortonjt, for opening the issue on the biom-format tracker. I was unaware of this edge case; we'll look at getting it addressed in the next release.
This issue was addressed in https://github.com/biocore/biom-format/pull/961 and it may make sense to close this issue.
As a general comment, please do consider opening issues when appropriate with affected projects so problems can be resolved in a timely manner.
Bug Description
Running
qiime feature-table rarefy --p-with-replacement
sometimes generated the error: sum(pvals[:-1]) > 1.0
This is likely a float rounding issue.
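For context, the message comes from a precondition in NumPy's multinomial sampler: it raises a ValueError when the probabilities, excluding the last one, sum to more than 1. A minimal sketch with made-up probabilities (not the values from this table) that triggers the same check:

```python
import numpy as np

rng = np.random.RandomState(0)

# Valid input: the probabilities sum to 1, so sampling succeeds.
draws = rng.multinomial(100, [0.2, 0.3, 0.5])
print(draws.sum())

# Invalid input: the first two probabilities already sum to 1.2,
# so the precondition on pvals[:-1] fails.
try:
    rng.multinomial(100, [0.6, 0.6, 0.1])
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

Any upstream step that produces probabilities summing above 1, whether through rounding or otherwise, would hit this check.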
Steps to reproduce the behavior
qiime feature-table rarefy --p-with-replacement --p-sampling-depth 500000
The counts per sample for my feature table are:
...so it's not a problem that just occurs with a very small sample size (e.g., n = 1 or n = 10)
Computation Environment
qiime2 2020.8.0