mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0

Convert H5Seurat to AnnDisk failure when metadata col has all NA #126

Open bbimber opened 2 years ago

bbimber commented 2 years ago

Here's a simple repro example:

library(Seurat)      # provides DietSeurat()
library(SeuratData)
library(SeuratDisk)

InstallData("pbmc3k")
data("pbmc3k.final")
pbmc3k.final

# This is not strictly needed but makes it faster:
pbmc3k.final <- DietSeurat(pbmc3k.final)

# Now add factor with all NAs
pbmc3k.final$NACol <- NA
pbmc3k.final$NACol <- as.factor(pbmc3k.final$NACol)

# This works
SaveH5Seurat(pbmc3k.final, filename = "pbmc3k.h5Seurat")

# This fails:
Convert("pbmc3k.h5Seurat", dest = "h5ad")

The error is:

> Convert("pbmc3k.h5Seurat", dest = "h5ad")
Validating h5Seurat file
Adding data from RNA as X
Transfering meta.features to var
Adding counts from RNA as raw
Transfering meta.features to raw/var
Transfering meta.data to obs
Error in self$read_low_level(file_space = self_space_id, mem_space = mem_space_id,  : 
  HDF5-API Errors:
    error #000: D:/a/rtools-packages/rtools-packages/mingw-w64-hdf5/src/hdf5-1.8.16/src/H5D.c in H5Dvlen_reclaim(): line 832: invalid argument
        class: HDF5
        major: Invalid arguments to routine
        minor: Bad value

One can avoid this by dropping the column(s) prior to serializing, but I thought I'd report this.
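
For anyone hitting the same error, here is a minimal sketch of that workaround: find the metadata columns where every value is NA and drop them before saving. To keep it runnable without Seurat installed, it uses a small made-up data frame (`meta`) as a stand-in for the object's metadata; the column names other than `NACol` are hypothetical.

```r
# Hypothetical stand-in for the cell-level metadata of a Seurat object
meta <- data.frame(
  nCount_RNA = c(100, 250, 80),
  NACol      = factor(c(NA, NA, NA)),   # the problematic all-NA factor
  cluster    = factor(c("a", "b", NA))  # partly NA, should be kept
)

# Names of the columns in which every value is NA
is_all_na   <- vapply(meta, function(x) all(is.na(x)), logical(1))
all_na_cols <- names(meta)[is_all_na]

# Drop those columns, keeping the result a data frame
meta_clean <- meta[, setdiff(names(meta), all_na_cols), drop = FALSE]

print(all_na_cols)
print(names(meta_clean))
```

On the real object, I would expect the same check to work on `pbmc3k.final[[]]`, followed by `pbmc3k.final[[col]] <- NULL` for each column in `all_na_cols` before calling `SaveH5Seurat()`, though I have not verified that against every Seurat version.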