I have a mudata/anndata dataset exported with anndata=0.7.8.
When trying to read it, I get the following error while reading var:
Error in factor(as.integer(values), labels = labels_items): invalid 'labels'; length 4732 should be 1 or 4733
Traceback:
1. readH5AD(file)
2. read_modality(h5, backed)
3. read_with_index(h5autoclose(view & "var"))
4. read_dataframe(dataset)
5. lapply(columnorder, function(name) {
. col <- group & name
. values <- read_attribute(col)
. if (H5Aexists(col, "categories")) {
. attr <- H5Aopen(col, "categories")
. labels <- H5Aread(attr)
. if (!is(labels, "H5Ref")) {
. warning("found categories attribute for column ",
. name, ", but it is not a reference")
. }
. else {
. labels <- H5Rdereference(labels, h5loc = col)
. labels_items <- H5Dread(labels)
. n_labels <- length(unique(values))
. if (length(labels_items) > n_labels) {
. labels_items <- labels_items[seq_len(n_labels)]
. }
. values <- factor(as.integer(values), labels = labels_items)
. H5Dclose(labels)
. }
. H5Aclose(attr)
. }
. H5Dclose(col)
. values
. })
6. FUN(X[[i]], ...)
7. factor(as.integer(values), labels = labels_items)
8. stop(gettextf("invalid 'labels'; length %d should be 1 or %d",
. nlab, length(levels)), domain = NA)
I found the reason: the column contains NA values, which are represented as -1 in the categorical codes but have no matching label in the categories.
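To illustrate, here is a minimal sketch in base R that reproduces the same kind of failure on toy data and shows one possible way to handle it. The toy vectors and the helper name `decode_categorical` are hypothetical and not part of the package; the sketch assumes the stored codes are zero-based and that any negative code means a missing value.

```r
# Hypothetical toy data standing in for one categorical column of var.
codes <- c(0L, -1L, 1L, 0L)   # zero-based codes as stored on disk; -1 marks a missing value
labels_items <- c("a", "b")   # the categories; there is no entry for -1

# Reproduce the failure: factor() sees three distinct values but only two labels.
err <- tryCatch(
  factor(codes, labels = labels_items),
  error = function(e) conditionMessage(e)
)
print(err)

# Possible fix (an assumption, not the package's current code):
# turn negative codes into NA, then index the categories (one-based in R).
decode_categorical <- function(codes, categories) {
  codes <- as.integer(codes)
  codes[codes < 0L] <- NA_integer_
  factor(categories[codes + 1L], levels = categories)
}

values <- decode_categorical(codes, labels_items)
print(values)
```

With this approach the -1 entries become NA in the resulting factor, and the levels are taken directly from the categories instead of being inferred from the distinct codes.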
Would you be interested in a PR with a fix?