mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0

Merging two .h5seurat files containing ATAC assays renders them unsaveable using SaveH5Seurat #140

Open volkansevim opened 1 year ago

volkansevim commented 1 year ago

I want to integrate multiple 10x multiome runs (RNA + ATAC), following a procedure similar to the Signac integration vignette.

Each of my runs is saved as a separate .h5seurat file. Merging the ATAC objects from these files and then trying to save the merged object causes SaveH5Seurat to fail:

SaveH5Seurat(dummy_merge, "DUMMY_multi.h5seurat", overwrite = T, verbose = T)

  1. SaveH5Seurat.Seurat(dummy_merge, "DUMMY_multi.h5seurat", overwrite = T, verbose = T) # at line 31 of file /home/sandbox/seurat-disk/R/SaveH5Seurat.R
  2. as.h5Seurat(x = object, filename = filename, overwrite = overwrite, verbose = verbose, ...) # at line 97-103 of file /home/sandbox/seurat-disk/R/SaveH5Seurat.R
  3. as.h5Seurat.Seurat(x = object, filename = filename, overwrite = overwrite, verbose = verbose, ...) # at line 41 of file /home/sandbox/seurat-disk/R/SaveH5Seurat.R
  4. WriteH5Group(x = x[[assay]], name = assay, hgroup = hfile[["assays"]], verbose = verbose) # at line 210-215 of file /home/sandbox/seurat-disk/R/SaveH5Seurat.R
  5. WriteH5Group(x = x[[assay]], name = assay, hgroup = hfile[["assays"]], verbose = verbose) # at line 177 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
  6. WriteH5Group(x = slot(object = x, name = slot), name = slot, hgroup = xgroup, verbose = verbose) # at line 330-335 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
  7. WriteH5Group(x = slot(object = x, name = slot), name = slot, hgroup = xgroup, verbose = verbose) # at line 177 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
  8. WriteH5Group(x = x[[i]], name = names(x = x)[i], hgroup = xgroup, verbose = verbose) # at line 29-34 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
  9. WriteH5Group(x = x[[i]], name = names(x = x)[i], hgroup = xgroup, verbose = verbose) # at line 177 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
  10. hgroup$create_group(name = name) # at line 198 of file /home/sandbox/seurat-disk/R/WriteH5Group.R
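
The last frame of the traceback is a call to `create_group()` on an HDF5 group. As a minimal sketch of why that call could fail, the snippet below uses the hdf5r package directly (assuming it is installed) and tries to create two child groups with the same name. The file path and the group name `fragments` are placeholders for illustration, not the layout seurat-disk actually writes.

```r
library(hdf5r)

# Scratch HDF5 file; the path and the "fragments" group name are illustrative only
path <- tempfile(fileext = ".h5")
h5 <- H5File$new(path, mode = "w")
frags <- h5$create_group("fragments")

# The first "index1" group is created normally
first <- frags$create_group("index1")

# A second group with the same name under the same parent is rejected by HDF5;
# the error is captured here instead of aborting the script
second <- tryCatch(
  frags$create_group("index1"),
  error = function(e) e
)
print(inherits(second, "error"))

h5$close_all()
unlink(path)
```

If the second call errors as expected, any list written element by element into one HDF5 group would need unique element names.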

You can reproduce it following these steps:

  1. Save the two Seurat objects from Signac integration vignette using SaveH5Seurat.
  2. Load and merge them.
  3. Save the merged object using SaveH5Seurat.

SaveH5Seurat(pbmc.combined, "DUMMY_combined.h5seurat", overwrite=T)
SaveH5Seurat(pbmc.multi, "DUMMY_multi.h5seurat", overwrite=T, verbose = F)
dummy_obj = LoadH5Seurat("DUMMY_combined.h5seurat", verbose = F)
dummy_multi_obj = LoadH5Seurat("DUMMY_multi.h5seurat", verbose = F)
dummy_merge = merge(dummy_multi_obj, dummy_obj)
SaveH5Seurat(dummy_merge, "DUMMY_multi.h5seurat", overwrite=T, verbose=T)

I think the problem is that there are two fragment objects with the name 'index1'. SaveH5Seurat fails when writing those objects:

str(dummy_merge[['peaks']]@fragments)

List of 3
 $ index1:Formal class 'Fragment' [package "Signac"] with 3 slots
  .. ..@ path : chr "/pbmc_multiome/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz"
  .. ..@ hash : chr [1:2] "a959ef83dfb9cae6ff73ab0147d547d1" "df967acbe28da89aed9cfdd89370b7af"
  .. ..@ cells: Named chr(0)
  .. .. ..- attr(*, "names")= chr(0)
 $ index1:Formal class 'Fragment' [package "Signac"] with 3 slots
  .. ..@ path : chr "/pbmc_atac/atac_pbmc_10k_nextgem_fragments.tsv.gz"
  .. ..@ hash : chr [1:2] "3345d40e136a430469569e167173bfc6" "e969074b96034c5c8fe0505cf4233a0d"
  .. ..@ cells: Named chr [1:9788] "TACCTATGTGATTCCA-1" "GCGGAAAGTCAGCAAG-1" "CGTACAAAGAATACTG-1" "TGCATTTGTCAGAAGC-1" ...
  .. .. ..- attr(*, "names")= chr [1:9788] "TACCTATGTGATTCCA-1_2" "GCGGAAAGTCAGCAAG-1_2" "CGTACAAAGAATACTG-1_2" "TGCATTTGTCAGAAGC-1_2" ...
 $ index2:Formal class 'Fragment' [package "Signac"] with 3 slots
  .. ..@ path : chr "/pbmc_multiome/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz"
  .. ..@ hash : chr [1:2] "a959ef83dfb9cae6ff73ab0147d547d1" "df967acbe28da89aed9cfdd89370b7af"
  .. ..@ cells: Named chr [1:11331] "AAACAGCCAAGGAATC-1" "AAACAGCCAATCCCTT-1" "AAACAGCCAATGCGCT-1" "AAACAGCCACACTAAT-1" ...
  .. .. ..- attr(*, "names")= chr [1:11331] "AAACAGCCAAGGAATC-1_2" "AAACAGCCAATCCCTT-1_2" "AAACAGCCAATGCGCT-1_2" "AAACAGCCACACTAAT-1_2" ...
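
The duplicate-name hypothesis can be checked, and a possible workaround sketched, with base R alone. The list below is a hypothetical stand-in for `dummy_merge[['peaks']]@fragments`: it only mirrors the element names from the `str()` output above and holds placeholder strings instead of real Fragment objects.

```r
# Hypothetical stand-in for the fragments list: same names as in the str()
# output (two entries called "index1"), placeholder strings as values
fragments <- list(
  index1 = "pbmc_multiome fragments (no cells)",
  index1 = "pbmc_atac fragments",
  index2 = "pbmc_multiome fragments"
)

# Names that occur more than once in the list
dup_names <- unique(names(fragments)[duplicated(names(fragments))])
print(dup_names)

# make.unique() appends a numeric suffix to repeated names only
names(fragments) <- make.unique(names(fragments))
print(names(fragments))

# No duplicated names remain after renaming
stopifnot(!anyDuplicated(names(fragments)))
```

Renaming the fragments list of the real merged object in the same way before calling SaveH5Seurat might avoid the failure, but that is an untested assumption, not a confirmed fix.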