GabrielHoffman closed 2 years ago
This turned out to be a fairly simple fix, so it should be working now in release (v1.6.2) and devel (v1.7.2).
The new version doesn't crash, but it also doesn't import the data. The data appears to have 0 rows:
> sce = readH5AD(file, use_hdf5=TRUE)
Warning message:
The names of these selected obs columns have been modified to match R conventions: '3'_or_5'' -> 'X3._or_5.'
> sce
class: SingleCellExperiment
dim: 0 2232536
metadata(1): ann_level_5_colors
assays(1): X
rownames(0):
rowData names(0):
colnames(2232536): TTAGGACTCTGCCCTA-WSSS8062679_meyer
CCTATTAGTATAATGG_IPF1_tsukui ... ACGGCCATCCACTGGG_9_liao
P3_6_TGAGCCGAGTGTCCCG
colData names(82): sample study_long ...
ext_transf_ann_level_5_thresholded ext_transf_uncert_level_5
reducedDimNames(2): X_scanvi_emb X_umap
mainExpName: NULL
altExpNames(0):
> assay(sce,1)
<0 x 2232536> sparse matrix of class DelayedMatrix and type "double"
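As an aside on the warning above: the renaming of '3'_or_5'' to 'X3._or_5.' looks like what base R's make.names() does, i.e. invalid characters become "." and an "X" is prepended when the name does not start with a letter (or a dot not followed by a digit). Below is a minimal Python sketch of that rule, just to show where the new column name comes from. The function name r_style_name is hypothetical, the sketch ignores R reserved words and non-ASCII letters, and it is an approximation rather than zellkonverter's actual code.

```python
import re


def r_style_name(name: str) -> str:
    """Approximate base R's make.names() for one ASCII name (hypothetical helper).

    Assumptions: reserved words and locale-specific letters are not handled.
    """
    # Replace every character that is not a letter, digit, dot or underscore with "."
    cleaned = re.sub(r"[^A-Za-z0-9._]", ".", name)
    # A valid R name starts with a letter, or with a dot not followed by a digit;
    # otherwise "X" is prepended (this also covers the empty string).
    if not re.match(r"[A-Za-z]|\.(?![0-9])", cleaned):
        cleaned = "X" + cleaned
    return cleaned


if __name__ == "__main__":
    for column in ["sample", "3'_or_5'", "_score", ".1a"]:
        print(repr(column), "->", repr(r_style_name(column)))
```

So the warning only concerns a column name in obs/colData and is unrelated to the 0-row dimension reported above.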
Ah, that's definitely not what I intended. I clearly didn't look closely enough at the output. I think I know what the issue is so should be a quick fix. Thanks for letting me know!
Actually I think this was correct. The file you are using doesn't contain any expression data so the dimensions are 2232536 cells by 0 features. This is what I get if I just load it in Python:
AnnData object with n_obs × n_vars = 2232536 × 0
obs: 'sample', 'study_long', 'study', 'last_author_PI', 'subject_ID', 'sex', 'ethnicity', 'smoking_status', 'BMI', 'condition', 'subject_type', 'sample_type', 'single_cell_platform', "3'_or_5'", 'sequencing_platform', 'cell_ranger_version', 'fresh_or_frozen', 'dataset', 'anatomical_region_level_1', 'anatomical_region_level_2', 'anatomical_region_level_3', 'anatomical_region_highest_res', 'age', 'ann_highest_res', 'n_genes', 'size_factors', 'log10_total_counts', 'mito_frac', 'ribo_frac', 'original_ann_level_1', 'original_ann_level_2', 'original_ann_level_3', 'original_ann_level_4', 'original_ann_level_5', 'original_ann_nonharmonized', 'scanvi_label', 'leiden_1', 'leiden_2', 'leiden_3', 'anatomical_region_ccf_score', 'entropy_study_leiden_3', 'entropy_dataset_leiden_3', 'entropy_subject_ID_leiden_3', 'entropy_original_ann_level_1_leiden_3', 'entropy_original_ann_level_2_clean_leiden_3', 'entropy_original_ann_level_3_clean_leiden_3', 'entropy_original_ann_level_4_clean_leiden_3', 'entropy_original_ann_level_5_clean_leiden_3', 'leiden_4', 'reannotation_type', 'leiden_5', 'ann_finest_level', 'ann_level_1', 'ann_level_2', 'ann_level_3', 'ann_level_4', 'ann_level_5', 'ann_coarse_for_GWAS_and_modeling', 'core_ann_level_1', 'core_ann_level_2', 'core_ann_level_3', 'core_ann_level_4', 'core_ann_level_5', 'cells_or_nuclei', 'HLCA_core_or_extension', 'anatomical_region_coarse_unharmonized', 'anatomical_region_detailed_unharmonized', 'ext_transf_ann_level_1', 'ext_transf_ann_level_1_thresholded', 'ext_transf_uncert_level_1', 'ext_transf_ann_level_2', 'ext_transf_ann_level_2_thresholded', 'ext_transf_uncert_level_2', 'ext_transf_ann_level_3', 'ext_transf_ann_level_3_thresholded', 'ext_transf_uncert_level_3', 'ext_transf_ann_level_4', 'ext_transf_ann_level_4_thresholded', 'ext_transf_uncert_level_4', 'ext_transf_ann_level_5', 'ext_transf_ann_level_5_thresholded', 'ext_transf_uncert_level_5'
uns: 'ann_level_5_colors'
obsm: 'X_scanvi_emb', 'X_umap'
There is another version of the dataset that contains expression data if you need that.
Here is an error report for a special case, but I figure it is part of a more general issue...
I took a quick look at the Human Lung Cell Atlas from the Theis group, but I had an issue loading it with zellkonverter:
URL: https://beta.fastgenomics.org/datasets/detail-dataset-427f1eee6dd44f50bae1ab13f0f3c6a9#Files
File: HLCA_v1_extended_no_counts.h5ad
Error:
The error is the same for zellkonverter 1.7.0 + R 4.2.0 and zellkonverter 1.6.0 + R 4.1.0.
Traceback: