haniffalab / webatlas-pipeline

A data pipeline built in Nextflow to process spatial and single-cell experiment data for visualisation in WebAtlas
MIT License

Not able to see celltypes after processing the multimodal data with the multimodal run #139

Closed ashishjain1988 closed 4 days ago

ashishjain1988 commented 3 weeks ago

Hi,

I am currently trying to combine my processed Visium and scRNAseq data using the multimodal run. Following the tutorial, I was able to process the data and can see the combined features (gene list), but I cannot see the combined cell types on webatlas-portal. Here is the multimodal config file that I used to process the data.

url: http://localhost:3000/p262_Multimodal_1/0.5.2/
project: p262_Multimodal
title: "Gut data"
description: ""
outdir: ./output/P262_Multimodal_1

data:

I also tried renaming the cell annotations and using extend_feature_name, but then I could not see even the overlapping gene list. Could you please help me fix this?

Regards, Ashish Jain

ashishjain1988 commented 2 weeks ago

Hi @BioinfoTongLI and @dannda

I would really appreciate your input on this issue, which I am facing while working on multi-omics processing with the webatlas pipeline.

Zhuang-Bio commented 1 week ago

I also ran into this problem when running the multimodal pipeline to combine multiple sections of 10x Visium data. Have you tried setting sort to False?

`extend_feature:
  path: ./input/STseq_scanpy_cleanSpots_allData.h5ad
  args:
    sample: ["library_id", "Donor2_Wound1"]
    sort: False # The deconvoluted cell types are not loaded if the default (True) is kept. A bit weird.`
ashishjain1988 commented 1 week ago

Hi @Zhuang-Bio

Thank you for the suggestion. I am trying it, but now my params file fails to parse. I don't have a cell2location file for spot annotation, but I did add a new metadata column with the spot annotation, named "RCTD_Prediction". Below is the new multimodal config file that failed to parse. Thank you for your help!

url: http://localhost:3000/p262_Multimodal_1/0.5.2/
project: p262_Multimodal
title: "Gut data"
description: ""
outdir: ./output/P262_Multimodal_1

data:
  - dataset: scrnaseq_demo
    obs_type: "cell"
    anndata: ./p262_output/0.5.2/p262-scRNAseq-anndata.zarr/
    extend_feature: obs/cell.idents.L1
      args:
        sort: False
    offset: 0
    is_spatial: false
    vitessce_options:
      spatial:
        xy: "obsm/spatial"
      mappings:
        obsm/X_umap: [0, 1]
      matrix: "X"
  - dataset: visium_demo
    obs_type: "spot"
    anndata: ./output/T604_WebAtlas_h5ad/0.5.2/visium-heart-disease-anndata.zarr/
    extend_feature: obs/RCTD_Prediction
      args:
        sort: False
    offset: 1000000
    is_spatial: true
    raw_image: ./output/T604_WebAtlas_h5ad/0.5.2/visium-heart-disease-raw.zarr/
    label_image: ./output/T604_WebAtlas_h5ad/0.5.2/visium-heart-disease-label.zarr/
    vitessce_options:
      spatial:
        xy: "obsm/spatial"
      matrix: "X"
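
A likely cause of the parse failure, offered as a sketch rather than a confirmed diagnosis: in both dataset entries, `extend_feature` is given an inline value (`obs/...`) and then an indented `args:` block underneath it. YAML does not let a key hold a scalar and a nested mapping at the same time, so parsers typically stop at the `args:` line. The thread shows two well-formed shapes: the plain `extend_feature: obs/<column>` form, and the nested form from Zhuang-Bio's comment, with `path:` and `args:` under `extend_feature:`. Whether the pipeline accepts `args` together with an `obs/` column is not confirmed here. The Python helper below (`find_scalar_with_children` is a hypothetical name) is a rough, standard-library-only heuristic for spotting this pattern. It is not a YAML parser and can misfire on multi-line scalars.

```python
import re

# "key: value" on one line; group 1 is the indentation (including any "- " list marker).
KEY_WITH_VALUE = re.compile(r"^(\s*(?:-\s+)?)([^\s#:][^:]*):\s+(\S.*)$")
# Any line that starts a key, with or without a value; group 1 is its leading whitespace.
KEY_LINE = re.compile(r"^(\s*)(?:-\s+)?[^\s#:][^:]*:(?:\s|$)")


def find_scalar_with_children(text):
    """Return (line_number, key) for keys that have an inline value AND nested keys."""
    lines = text.splitlines()
    problems = []
    for number, (line, following) in enumerate(zip(lines, lines[1:]), start=1):
        current = KEY_WITH_VALUE.match(line)
        if current is None or current.group(3)[0] in "|>#":
            continue  # no inline value, a block scalar, or only a trailing comment
        nested = KEY_LINE.match(following)
        if nested and len(nested.group(1)) > len(current.group(1)):
            problems.append((number, current.group(2).strip()))
    return problems


# Excerpt of the config above, reduced to the suspicious part.
sample = """\
data:
  - dataset: scrnaseq_demo
    extend_feature: obs/cell.idents.L1
      args:
        sort: False
    offset: 0
"""
print(find_scalar_with_children(sample))
```

On this excerpt the helper reports the `extend_feature` key on line 3. Removing the indented `args:` block, or moving the value under a nested key, makes the report empty.
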
dannda commented 1 week ago

Hi @ashishjain1988, could you please also share the config JSON file the pipeline generated? I wonder whether, because the features are named differently in each modality, the output JSON file is pointing at the wrong locations. Could you also confirm how you tried using the extend_feature_name param? If the different names are causing the issue, it should ideally fix it, but it may not be working as expected.

As a sanity check, could you also confirm that in the generated zarr directories there is a cell.idents.L1 directory within obs for the single-cell dataset, and an RCTD_Prediction directory within obs for the Visium dataset?
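
One way to run this sanity check is a few lines of Python that only look at the directory layout on disk. This is a minimal sketch: `has_obs_column` is a hypothetical helper, and it assumes the zarr stores are plain directories in which each `obs` column appears as a sub-directory, as described above. The paths and column names are the ones from the config posted earlier in this thread.

```python
from pathlib import Path


def has_obs_column(zarr_dir, column):
    """Return True if <zarr_dir>/obs/<column> exists as a directory."""
    return (Path(zarr_dir) / "obs" / column).is_dir()


# Paths and column names copied from the config in this thread.
checks = [
    ("./p262_output/0.5.2/p262-scRNAseq-anndata.zarr/", "cell.idents.L1"),
    ("./output/T604_WebAtlas_h5ad/0.5.2/visium-heart-disease-anndata.zarr/", "RCTD_Prediction"),
]
for zarr_dir, column in checks:
    status = "found" if has_obs_column(zarr_dir, column) else "MISSING"
    print(f"{zarr_dir} obs/{column}: {status}")
```
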

ashishjain1988 commented 1 week ago

Hi @dannda

Please find attached the output JSON file from the webatlas-pipeline multi-omics run. I also checked the generated zarr directories, and both contain the respective metadata (as does the final zarr directory after the multi-omics run). p262_Multimodal-multimodal-config.json

For the extend_feature_name param, I created a new metadata column in the Visium dataset named cell.idents.L1 and used that as the extend_feature_name, since it is also present in the scRNAseq data.

dannda commented 5 days ago

Thanks @ashishjain1988. I'm trying to figure out exactly where the issue occurs; I think params are not being passed properly through the different functions, given that you're getting null as the feature name. In the meantime, could you try loading with this JSON instead? I only manually changed the places where it stated null as the feature, e.g. from "featureFilterPath": "var/is_null" to the actual names from your data (e.g. "featureFilterPath": "var/is_cell.idents.L1"), in case the data got written correctly and only the config generation is faulty.
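
The manual change described here can also be scripted. The sketch below is one possible way to do it with the Python standard library, not part of the pipeline: `patch_null_feature` is a hypothetical helper, and the `options` wrapper in the example is made up for illustration, since the full layout of the generated JSON is not shown in this thread. Only the `"featureFilterPath": "var/is_null"` entry is taken from the comment above.

```python
import json


def patch_null_feature(node, feature_name, placeholder="null"):
    """Return a copy of node with 'var/is_<placeholder>' replaced by 'var/is_<feature_name>'."""
    old = f"var/is_{placeholder}"
    new = f"var/is_{feature_name}"
    if isinstance(node, dict):
        return {key: patch_null_feature(value, feature_name, placeholder)
                for key, value in node.items()}
    if isinstance(node, list):
        return [patch_null_feature(value, feature_name, placeholder) for value in node]
    return new if node == old else node


# Illustrative fragment; the real config has more keys and a different nesting.
fragment = {"options": {"featureFilterPath": "var/is_null", "matrix": "X"}}
patched = patch_null_feature(fragment, "cell.idents.L1")
print(json.dumps(patched, indent=2))
```

If the two datasets use different feature names, the helper would need to be applied separately to each dataset's part of the config, with the matching name.
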

ashishjain1988 commented 5 days ago

Hi @dannda ,

Thank you for the updated file! It worked! I am now able to see the intersected cell types. However, can we also show all the cell types in the list on WebAtlas (e.g. cell types that are specific to one omics type)?

dannda commented 4 days ago

Glad to hear that! On showing all cell types of one omics type: unfortunately, that can't be done. The multimodal pipeline computes the intersection so that data missing in one modality doesn't come across misleadingly as not expressed/present. You could manually force this by adding the data to the modality that lacks it before putting it through the pipeline. Note that for the selection to work correctly across modalities, they need to have the same features, even if those are filled with 0s in one of them. If one of them has more or fewer features, selection will display the wrong data on them. Alternatively, you could have a separate non-multimodal visualisation for that specific modality only, to display its full data.
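
As a toy illustration of the "same features, filled with 0s" idea, the sketch below aligns two small feature matrices to the union of their feature names. It uses plain Python lists rather than AnnData objects, `pad_to_union` is a hypothetical helper, and the cell type names and values are invented for the example. It is not how the pipeline itself processes data.

```python
def pad_to_union(features_a, rows_a, features_b, rows_b):
    """Align two row-major matrices to the ordered union of their feature names.

    Features missing from a modality are filled with 0 for every row.
    """
    union = list(dict.fromkeys(list(features_a) + list(features_b)))

    def realign(features, rows):
        index = {name: i for i, name in enumerate(features)}
        return [[row[index[name]] if name in index else 0 for name in union]
                for row in rows]

    return union, realign(features_a, rows_a), realign(features_b, rows_b)


# Invented example: one cell type is specific to each modality.
cell_features = ["T cell", "B cell", "Enterocyte"]
cell_rows = [[1, 0, 0], [0, 0, 1]]
spot_features = ["T cell", "B cell", "Fibroblast"]
spot_rows = [[0.7, 0.3, 0.0]]

union, cells, spots = pad_to_union(cell_features, cell_rows, spot_features, spot_rows)
print(union)
print(cells)
print(spots)
```

After padding, both matrices have the same four columns in the same order. The zeros added this way are exactly the values that can read as "not present", which is the caveat raised above.
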

ashishjain1988 commented 4 days ago

Thank you for the help!