Closed by lazappi 5 years ago
Hi,
I am having exactly the same problem. The cause appears to be that export() writes row data as a GRangesList, which relocates the HDF5 attributes and loses information:
> l1_file <- system.file("extdata", "L1_DRG_20_example.loom", package = "LoomExperiment")
> scle <- import(l1_file, type="SingleCellLoomExperiment")
> export(scle, 'test1.loom')
$ h5ls -r R/library/LoomExperiment/extdata/L1_DRG_20_example.loom | tail
/matrix Dataset {20, 20}
/row_attrs Group
/row_attrs/Accession Dataset {20}
/row_attrs/Gene Dataset {20}
/row_attrs/X_LogCV Dataset {20}
/row_attrs/X_LogMean Dataset {20}
/row_attrs/X_Selected Dataset {20}
/row_attrs/X_Total Dataset {20}
/row_attrs/X_Valid Dataset {20}
/row_attrs/rownames Dataset {20}
$ h5ls -r test1.loom | tail -n 12
/matrix Dataset {20, 20}
/row_attrs Group
/row_attrs/GRangesList Group
/row_attrs/GRangesList/end Dataset {20, 1}
/row_attrs/GRangesList/group_name Dataset {20, 1}
/row_attrs/GRangesList/lengths Dataset {20, 1}
/row_attrs/GRangesList/names Dataset {20, 1}
/row_attrs/GRangesList/rownames Dataset {20, 1}
/row_attrs/GRangesList/seqnames Dataset {20, 1}
/row_attrs/GRangesList/start Dataset {20, 1}
/row_attrs/GRangesList/strand Dataset {20, 1}
/row_attrs/GRangesList/width Dataset {20, 1}
I think I might have found the cause here.
The condition in line 226 will always evaluate to TRUE when the object is a SingleCellLoomExperiment, so only rowRanges is exported to /row_attrs.
As SingleCellExperiment is a RangedSummarizedExperiment, it makes sense for SingleCellLoomExperiment to be an RSE too, but I guess we might want to export rowData to /row_attrs as well? Though this would have implications for importing. Alternatively, if packages dealing with SCE don't write much in rowRanges, perhaps we can export only rowData to /row_attrs, whether it's an RSE or not.
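For reference, the class relationship described above can be checked in an R session. This is a minimal sketch using the example file from earlier in the thread. It does not reproduce the actual condition at line 226, only the inheritance that would make an "is this a RangedSummarizedExperiment" style check succeed.

```r
library(LoomExperiment)
library(SummarizedExperiment)  # provides rowData() and rowRanges()

l1_file <- system.file("extdata", "L1_DRG_20_example.loom", package = "LoomExperiment")
scle <- import(l1_file, type = "SingleCellLoomExperiment")

# SingleCellExperiment extends RangedSummarizedExperiment, so an
# inheritance test like this cannot tell "has real ranges" apart
# from "is a SingleCellLoomExperiment"
is(scle, "RangedSummarizedExperiment")

# The row annotations (Accession, Gene, ...) live in rowData
colnames(rowData(scle))

# rowRanges is a separate slot holding genomic ranges only
rowRanges(scle)
```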
Thanks for the issue.
I've pushed changes that address this issue. What they do is prioritize gleaning rownames and rowData names from rowData information as opposed to rowRanges information when rowRanges are not present.
The changes have been pushed to devel and should propagate within a day or so.
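A round trip along the following lines could be used to check whether row annotations survive once the updated version is installed. This is only a sketch: it reuses the example file from earlier in the thread, and the temporary output path is an assumption made for illustration.

```r
library(LoomExperiment)
library(SummarizedExperiment)  # provides rowData()

l1_file <- system.file("extdata", "L1_DRG_20_example.loom", package = "LoomExperiment")
scle <- import(l1_file, type = "SingleCellLoomExperiment")

# Write to a temporary .loom file (path chosen for illustration only)
out_file <- tempfile(fileext = ".loom")
export(scle, out_file)

# Read the exported file back in
scle2 <- import(out_file, type = "SingleCellLoomExperiment")

# rowData columns present in the original but missing after the round trip
setdiff(colnames(rowData(scle)), colnames(rowData(scle2)))

# Whether the row names are unchanged
identical(rownames(scle), rownames(scle2))
```

An empty result from setdiff() together with TRUE from identical() would suggest that the column names and row names made it through. The values themselves would still need a separate comparison.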
Thanks for the fix @dvantwisk! Is it possible to revert the dependency to R >=3.5.0? I think the changes are not dependent on the new R version, right? Doing so would free us from having to update all packages dependent on R version. I'll be happy to update the bioconda recipe for LoomExperiment when that's done. Many thanks!
I've changed the version back to R >= 3.5 on devel. I'll close this issue for now, but message me back if any others pop up.
Hi @dvantwisk, thank you again for resolving all these issues! Would you be able to push changes related to #2 and #3 to release as well? This will make it possible to deploy them through bioconda, which only accepts "released" versions. I've tested all those changes (including those related to #4) and they all worked well.
Hi
Thanks for developing this package! I've just discovered it and it seems like a great way to allow interoperability with some of the Python scRNA-seq packages.
I have been playing around to see if I can export some data and read it back in, but I can't get the rowData (and rownames) to survive the trip. Is this expected behaviour, a bug, or am I just doing something wrong? Here is an example of what I have been trying:
Any thoughts?