scverse / spatialdata-io

BSD 3-Clause "New" or "Revised" License

Visium_HD errors #218

Open jwweii opened 1 month ago

jwweii commented 1 month ago

I've encountered two issues when using the package:

  1. Missing Sample Prefix in Output Files: Our output files do not include a sample prefix, which causes an error when reading feature_slice.h5. For example, the output directory looks like this:
├── binned_outputs
│   ├── square_002um
│   ├── square_008um
│   └── square_016um
├── cloupe_008um.cloupe -> binned_outputs/square_008um/cloupe.cloupe
├── feature_slice.h5
├── metrics_summary.csv
├── molecule_info.h5
├── possorted_genome_bam.bam
├── possorted_genome_bam.bam.bai
├── probe_set.csv
├── spatial
│   ├── aligned_fiducials.jpg
│   ├── aligned_tissue_image.jpg
│   ├── cytassist_image.tiff
│   ├── detected_tissue_image.jpg
│   ├── tissue_hires_image.png
│   └── tissue_lowres_image.png
└── web_summary.html

To resolve this, I had to manually modify line 96 of visium_hd.py so that the sample prefix is added only when a dataset ID is given: filename_prefix = f"{dataset_id}_" if dataset_id else "" (I passed an empty string "" as the dataset_id argument).
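As a rough illustration of the workaround described above, the conditional prefix can be sketched as a standalone helper. The function names `filename_prefix` and `feature_slice_path` are hypothetical and are not part of spatialdata-io; only the one-line expression and the file name `feature_slice.h5` come from this report.

```python
from pathlib import Path

# File name taken from the directory listing in this report.
FEATURE_SLICE_FILE = "feature_slice.h5"


def filename_prefix(dataset_id):
    """Return "<dataset_id>_" when an ID is given, otherwise an empty prefix.

    Hypothetical helper wrapping the one-line change quoted above.
    """
    return f"{dataset_id}_" if dataset_id else ""


def feature_slice_path(root, dataset_id):
    """Build the expected path of the feature slice file (hypothetical helper)."""
    return Path(root) / f"{filename_prefix(dataset_id)}{FEATURE_SLICE_FILE}"


# With an empty dataset_id the unprefixed file name is used.
unprefixed = feature_slice_path("outs", "")
# With a dataset_id (here the made-up value "sampleA") the prefix is added.
prefixed = feature_slice_path("outs", "sampleA")
```

With this logic an empty `dataset_id` resolves to the unprefixed `feature_slice.h5` shown in the listing, while a non-empty ID yields a name such as `sampleA_feature_slice.h5`.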

  2. Non-Unique Variable Names: The variable names in the AnnData object were not unique, which caused conflicts in the downstream analysis. To address this, I added the following line after loading the data with sc.read_10x_h5: adata.var_names_make_unique()
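To make the effect of that call concrete, here is a simplified, pure-Python stand-in for the renaming it performs: the first occurrence of a name is kept and later duplicates get a numeric suffix. The helper `make_names_unique` and the example gene names are illustrative assumptions, and the sketch does not handle the case where a suffixed name collides with an existing one; in the actual workflow the only call needed is adata.var_names_make_unique().

```python
def make_names_unique(names, join="-"):
    """Keep the first occurrence of each name; suffix later duplicates.

    Simplified illustration only (hypothetical helper, not the AnnData code):
    it does not check whether a suffixed name already exists in the input.
    """
    seen = {}    # name -> number of times it has appeared so far
    unique = []
    for name in names:
        count = seen.get(name, 0)
        unique.append(name if count == 0 else f"{name}{join}{count}")
        seen[name] = count + 1
    return unique


# Made-up variable names containing a duplicated entry.
var_names = ["TBCE", "ACTB", "TBCE", "TBCE"]
deduplicated = make_names_unique(var_names)
# deduplicated == ["TBCE", "ACTB", "TBCE-1", "TBCE-2"]
```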
LucaMarconato commented 1 month ago

Hi @jwweii thanks for reporting.

Regarding the empty prefix problem, this has already been reported here: https://github.com/scverse/spatialdata-io/issues/212. One of the users has developed a solution, which should soon be available in a PR.

Regarding var_names_make_unique(), we are aware of this, and calling the function is indeed the way to go. However, I understand that this could cause confusion, so I will: