Bioconductor / HDF5Array

HDF5 backend for DelayedArray objects
https://bioconductor.org/packages/HDF5Array

Long colnames get truncated #24

Closed FelixTheStudent closed 4 years ago

FelixTheStudent commented 4 years ago

Writing out a sparse matrix seems to work fine (no warnings). However, once I read it back in, the colnames are truncated to at most 11 characters (in the minimal example below at least; in another case they were cut to at most 10).

I use HDF5Array 1.12.3.

library(Matrix)
library(HDF5Array)

# create example matrix with increasingly long colnames:
mat <- as(matrix(rep(0, 900), nrow=30), "dgCMatrix")
mat[sample(1:30, 3), sample(1:30, 3)] <- 1
colnames(mat) <- sapply(1:30, function(i) paste(rep("x", i), collapse = ""))
# write out:
HDF5Array::writeTENxMatrix(mat,
                filepath=file.path("~", "tmp_file.hdf5"),
                group="tmp", level=NULL, verbose=FALSE)
# read back in:
x <- HDF5Array::TENxMatrix("~/tmp_file.hdf5", group="tmp")
max(stringr::str_length(colnames(mat)))
# 30
max(stringr::str_length(colnames(x)))
# 11

Any ideas?

hpages commented 4 years ago

Thanks Felix. Should be fixed in HDF5Array 1.15.7 (see commit 0e487076). Will port the fix to BioC 3.10 later today.

H.

FelixTheStudent commented 4 years ago

Excellent, thanks. I will ask my admin to update Bioconductor.

Cheers,

Felix

Hervé Pagès notifications@github.com wrote on Fri, 28 Feb 2020, 12:06:

Thanks Felix. Should be fixed in HDF5Array 1.15.7 (see commit 0e48707 https://github.com/Bioconductor/HDF5Array/commit/0e487076bb7faf028a5fd54bd6c7579ce12b268c). Will port the fix to BioC 3.10 later today.

H.


hpages commented 4 years ago

Sounds good. I've ported the fix to BioC 3.10. It's in HDF5Array 1.14.3 but please allow 24h for the package to pass the builds and become available via BiocManager::install().

Note that admins should update Bioconductor on a regular basis regardless (dozens of packages get fixed every month in release).
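
For anyone checking whether an installation already includes the fix, here is a minimal sketch in R. The helper name `has_colnames_fix` is hypothetical and not part of HDF5Array. The thresholds (1.14.3 for the BioC 3.10 release branch, 1.15.7 for devel) are the ones quoted in this thread, and the split at 1.15.0 assumes the usual Bioconductor convention that odd minor versions are devel.

```r
# Hypothetical helper: TRUE if a given HDF5Array version should contain
# the colnames fix, based on the version numbers quoted in this thread.
has_colnames_fix <- function(v) {
  v <- package_version(v)
  if (v >= "1.15.0") {
    v >= "1.15.7"   # devel branch, and any later release
  } else {
    v >= "1.14.3"   # BioC 3.10 release branch
  }
}

# Only inspect the installed version if HDF5Array is actually available.
if (requireNamespace("HDF5Array", quietly = TRUE)) {
  installed <- as.character(packageVersion("HDF5Array"))
  if (!has_colnames_fix(installed)) {
    message("HDF5Array ", installed, " predates the fix; ",
            "update with BiocManager::install(\"HDF5Array\")")
  }
}
```

The sketch only reports an outdated version rather than installing anything, so it is safe to run on a shared system where updates go through an admin.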

Cheers, H.