Open peterjc opened 1 year ago
Hm, this is going to be a delicate one. If type=None raises, it at a minimum triggers a minor release, as this is a breaking change to the current API and it will create headaches for many users of the library. I'm inclined, if the type is None, to default to "OTU table" and raise a warning with a deprecation notice that this behavior will be unsupported in the future.
That is correct: the validation should be consistent. Good catch.
I agree that's a good plan - add a deprecation warning in a revision release of the tool, and upgrade to an exception in a minor release.
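For illustration, the first step of that plan could look roughly like the sketch below. The helper name `resolve_table_type` and the constant `DEFAULT_TABLE_TYPE` are hypothetical and are not part of biom-format; the only things taken from this thread are the "OTU table" fallback and the deprecation warning.

```python
import warnings

# Fallback proposed in this thread for tables created without a type.
DEFAULT_TABLE_TYPE = "OTU table"


def resolve_table_type(table_type=None):
    """Hypothetical helper, not biom-format API.

    Returns the table type to use. If none was given, warn that this
    will become an error later and fall back to the default.
    """
    if table_type is None:
        warnings.warn(
            "type=None is deprecated and will raise an error in a future "
            f"release; defaulting to {DEFAULT_TABLE_TYPE!r}.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return DEFAULT_TABLE_TYPE
    return table_type
```

In the later minor release, the `warnings.warn(...)` call would be swapped for a `raise`, which keeps the breaking change confined to one place.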
At least fixing the HDF5 validation to be stricter shouldn't be so complicated 😃
I found this bug because (following the examples), I'd not set the table type myself. In my case "OUT table" is the best match from the options defined.
Ah yes the classic "OUT table", which coincidentally is a common interpretation by Word :)
In some offline discussion with @wasade, he suggested that maybe we could grandfather type=None in as allowable by the format. I'm in favor of that as it hasn't caused any problems to-date (that I'm aware of). Changing the behavior would cause other tools to need updating (e.g., QIIME 2, which doesn't set this value - see below), and would make old biom-formatted tables not work with new versions of the software.
In [1]: import qiime2
In [2]: import biom
In [3]: t = qiime2.Artifact.load('table.qza').view(biom.Table)
In [5]: print(t.type)
None
Thank you, @gregcaporaso!
Also, biom.Table.from_tsv(...) does not have a type= argument to set the table type.
Reproducible example
Example code based on https://biom-format.org/documentation/table_objects.html#examples
Follow this by command line validation of the output:
Actual behaviour:
Python script runs without error (bad).
HDF5 file passes validation (bad):
JSON file fails validation (good):
Expected behaviour:
Runtime error during Table.__init__, since the defaults include type=None and validate=True, and for all BIOM formats to date, type is a required top-level attribute:
https://biom-format.org/documentation/format_versions/biom-1.0.html
https://biom-format.org/documentation/format_versions/biom-2.0.html
https://biom-format.org/documentation/format_versions/biom-2.1.html
Furthermore, using the example BIOM files as generated without the table type, both the JSON and the HDF5 ought to fail consistently.
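One way to get consistent behaviour would be a single check that both the JSON and the HDF5 validators call on the top-level attributes. The sketch below is only an illustration of that idea: `check_table_type` and `KNOWN_TABLE_TYPES` are hypothetical names, the list of types was copied by hand from the format documentation linked above, and exact, case-sensitive matching is an assumption.

```python
# Table types as listed in the BIOM format documentation (copied by hand;
# check against the spec before relying on this list).
KNOWN_TABLE_TYPES = frozenset({
    "OTU table",
    "Pathway table",
    "Function table",
    "Ortholog table",
    "Gene table",
    "Metabolite table",
    "Taxon table",
})


def check_table_type(attrs):
    """Hypothetical format-agnostic check, not biom-format API.

    `attrs` is any mapping of top-level attributes, e.g. a parsed JSON
    document or the attributes read from an HDF5 file.
    """
    table_type = attrs.get("type")
    if table_type is None:
        raise ValueError("Missing required top-level attribute 'type'.")
    if table_type not in KNOWN_TABLE_TYPES:
        raise ValueError(f"Unknown table type: {table_type!r}")
    return table_type
```

With a shared check like this, a file written without a type would be rejected by both validators for the same reason, instead of passing in one format and failing in the other.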