Closed le-ander closed 1 year ago
Could you please try the fix in the linked pull request? I could not run it, but I am pretty sure it addresses this issue.
Thanks David!
This fixes the error; however, a new one appeared (this is still from finalize-dataloader).
Can you explain what this warning means?
Also, I am not sure we should really raise this warning, since the code aborts after the raise.
/lustre/groups/ml01/code/katelyn.li/sfaira/sfaira/data/dataloaders/export_adaptors/cellxgene.py:323 in cellxgene_export_adaptor_3_0_0

  320     # Suspension is a custom ontology and does not require an ID. The column itself must be categorical.
  321     # Add this annotation here if it was not set before.
  322     if np.all(adata.obs[adata_ids.suspension_type].values == adata_ids.unknown_metadata_identifier):
❱ 323         adata = match_supsension_and_efo(adata=adata, adata_ids=adata_ids, efo_ontology=get_ontology(k="assay_sc"),
  324                                          valid_combinations=VALID_EFO_SUS["3_0_0"])
  325     adata.obs[adata_ids.suspension_type] = pd.Categorical(adata.obs[adata_ids.suspension_type].values.tolist())
  326     del adata.obs[adata_ids.suspension_type + adata_ids.onto_id_suffix]

/lustre/groups/ml01/code/katelyn.li/sfaira/sfaira/data/dataloaders/export_adaptors/cellxgene.py:239 in match_supsension_and_efo

  236         for k, v in valid_combinations.items():
  237             if efo_ontology.is_a(is_=efo, a_=k):
  238                 if efo in efo_map.keys():
❱ 239                     raise Warning(f"found multiple suspension matches for EFO {efo}.")
  240                 efo_map[efo] = v[0]
  241     # Picks first match, ie prioritises cell if both cell and nucleus are possible.
  242     adata.obs[adata_ids.suspension_type] = [
Warning: found multiple suspension matches for EFO EFO:0009922.
I did not realize this warning would abort the CLI; yes, in that case we can move this to a print statement. In brief, suspension type is a new required metadata field that can be added automatically, but the assignment is ambiguous in a number of cases, and that is what was triggered here. In that case we default to one of the choices, somewhat arbitrarily, unless the user supplied it separately. Could you change the warning to a print statement and then continue? If you are planning on using suspension type further down, it would make sense to set it manually here, too.
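For context on why the CLI aborted: in Python, `Warning` is a subclass of `Exception`, so `raise Warning(...)` stops execution like any other exception, whereas `warnings.warn(...)` reports the message and lets the code continue. The sketch below illustrates the difference on a simplified version of the matching loop. It is not the sfaira implementation: `is_a`, `build_efo_map`, the `parents` dictionary and the `EFO:child` / `EFO:parent*` identifiers are hypothetical stand-ins, and keeping the first match is an assumption based on the "Picks first match" comment in the traceback.

```python
import warnings


def is_a(efo, parent, parents):
    """Hypothetical stand-in for the ontology check (not sfaira's API)."""
    return efo == parent or parent in parents.get(efo, set())


def build_efo_map(efos, valid_combinations, parents):
    """Map each EFO term to a suspension type, warning on ambiguous matches."""
    efo_map = {}
    for efo in efos:
        for parent, suspensions in valid_combinations.items():
            if is_a(efo, parent, parents):
                if efo in efo_map:
                    # warnings.warn reports the ambiguity but does not abort,
                    # unlike `raise Warning(...)`.
                    warnings.warn(f"found multiple suspension matches for EFO {efo}.")
                    continue  # assumption: keep the first match
                efo_map[efo] = suspensions[0]
    return efo_map


# Illustrative data only: one term that matches two parent assays.
parents = {"EFO:child": {"EFO:parent1", "EFO:parent2"}}
valid_combinations = {"EFO:parent1": ["cell"], "EFO:parent2": ["nucleus"]}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = build_efo_map(["EFO:child"], valid_combinations, parents)
```

A plain `print` would also avoid the abort; `warnings.warn` has the advantage that callers can filter the message or turn it into an error if they prefer.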
Will try, thank you! What are the options for suspension type then? (This is 10x 3' v3.)
"cell", "na", "nucleus"; see also https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md#suspension_type
Fixed :+1:
The CLI command finalize-dataloader seems to be broken on dev. This seems to be related to the recent introduction of the new cellxgene schema. It looks simple to fix, but I don't fully grasp the cellxgene schema handling yet, so @davidsebfischer can you take a look?