drieslab / Giotto

Spatial omics analysis toolbox
https://drieslab.github.io/Giotto_website/

Nanostring CosMx and AtoMx exported data #785

Open jmodlis opened 11 months ago

jmodlis commented 11 months ago

Discussed in https://github.com/drieslab/Giotto/discussions/783

Originally posted by **jmodlis** October 25, 2023

Hello,

Thank you for all your great work on this package! We have been using it extensively on some older Nanostring CosMx datasets. We are now working with Nanostring CosMx data that has been minimally processed and exported from the Nanostring AtoMx analysis suite. Unfortunately, the exported data structure from AtoMx is quite different from what the Giotto function `createGiottoCosMxObject` expects. Are there any plans to accommodate this new structure in Giotto functions? I'm working on work-arounds myself but thought it would be good to see if others were facing this problem as well.

feinan-technion commented 6 months ago

Hi @jmodlis, (I think the Giotto developers can also benefit from my answer, @jiajic.)

I've been working with Giotto for quite a while with AtoMx exported data and I think I have a list of "work-arounds" that need to be performed to import the CosMx data as it is exported from AtoMx. I'll list them here:

  1. The first step is to organize a folder with all the files expected by createGiottoCosMxObject.
     a. To that end, you have to re-create the "CellComposite_F###.jpg" files after downloading the data using the AtoMx Export module with the "exportFOVImages" parameter checked. Currently, the Export module creates a "CellComposite" directory with empty JPG files within the exported directory, and in future updates the "CellComposite" sub-folder will be removed entirely. To create these files, you can use the ".../CellStatsDir/Morphology2D" TIFF files that should also exist in the export from AtoMx. I've written an ImageJ macro to generate them without additional modifications ( create_composite.ijm.gz ). If you want to edit the images, though, you might want to change the script or create the CellComposite images manually.
     b. After creating a "CellComposite" directory in the same folder where the Morphology2D sub-folder was found, you might want to organize the directories in the same structure as the CosMx demo data. I also created a Python script for that ( create_giotto_folder.py.gz ), though this can be done manually as well.
  2. Next, there's a change in the "_fov_positions_file.csv" that needs to be fixed. In the CosMx lung demo data the FOV shifts are reported in "_global_px" units, whereas in the AtoMx export the shifts are reported in "mm". There are many ways to derive the conversion factor, but I was instructed to use the "ImPixelnm" parameter found in the ".../RunSummary/Run[GUID]_ExptConfig.txt" file in the AtoMx export (with the "exportFOVImages" parameter checked). For convenience, I wrote an R script ( convert_fov_position_file.R.gz ) that creates a new FOV file with converted values. You should replace the FOV file in the Giotto data directory with the new one, keeping the same name.
  3. Afterwards, you should be set to load the data into a Giotto object. Based on the Giotto Nanostring vignette, I created a function similar to createGiottoCosMxObject ( read_atomx_into_giotto.R.gz ). It accommodates negative probe features, false code features and gene features in the transcript table, and creates an object with multiple FOVs joined and every feature set in a separate "feat_info" field.
  4. Another issue with reading AtoMx data into a Giotto object is that the cell ID numbers get reset. In the "CellLabels_F###.tif" files, the intensity values of the cell segmentations represent the cell IDs. These cell ID numbers correspond to the identifiers used in the metadata file, as well as in other files that may be exported from AtoMx. Therefore, it's important to keep the identifiers as they are in the TIFF files. After realizing that the identifiers are reset and do not match those in the metadata CSV file, I traced the issue to the function createGiottoObjectSubcellular, which calls the internal function extract_polygon_list, which in turn calls createGiottoPolygonsFromMask. The resetting of cell IDs happens in createGiottoPolygonsFromMask. The problematic line is: `else { terra_polygon$poly_ID = paste0(name, "_", 1:nrow(terra::values(terra_polygon))) }` Because the function itself calls several internal Giotto functions, I couldn't replace it within the namespace without also changing the internal function calls (e.g., from identify_background_range_polygons to Giotto:::identify_background_range_polygons). I'm attaching my suggested changes to createGiottoPolygonsFromMask that keep the original cell segment intensities as cell IDs ( createGiottoPolygonsFromMask_edited.R.gz ). However, I don't know whether these changes interfere with other uses of this function within Giotto.
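
For readers who want to see the unit conversion from step 2 spelled out, here is a minimal Python sketch. It is not the attached R script. It assumes that "ImPixelnm" means nanometres per pixel (inferred from the name only), and the column names `x_mm` / `y_mm` and the value `100.0` are hypothetical placeholders rather than values taken from a real export. It only rescales units; any axis flips or offsets that a given export may need are not handled.

```python
NM_PER_MM = 1_000_000  # 1 mm = 1e6 nm


def mm_to_px(value_mm: float, nm_per_px: float) -> float:
    """Convert a length in millimetres to pixels, given nanometres per pixel."""
    if nm_per_px <= 0:
        raise ValueError("nm_per_px must be positive")
    return value_mm * NM_PER_MM / nm_per_px


def convert_fov_rows(rows, nm_per_px, mm_suffix="_mm", px_suffix="_global_px"):
    """Return new rows in which every column ending in mm_suffix is
    converted to pixels and renamed to end in px_suffix.

    `rows` is a list of dicts, e.g. as produced by csv.DictReader.
    Column names are assumptions; check them against your own file.
    """
    converted = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key.endswith(mm_suffix):
                new_key = key[: -len(mm_suffix)] + px_suffix
                new_row[new_key] = mm_to_px(float(value), nm_per_px)
            else:
                new_row[key] = value  # leave non-positional columns untouched
        converted.append(new_row)
    return converted


# Hypothetical example: one FOV row and a made-up pixel size of 100 nm/px.
example_rows = [{"fov": "1", "x_mm": "0.5", "y_mm": "2"}]
converted_rows = convert_fov_rows(example_rows, nm_per_px=100.0)
```

In practice the rows would come from `csv.DictReader` on the exported FOV positions file, the pixel size would be read from the ExptConfig file, and the result would be written back with `csv.DictWriter` under the original file name.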

I hope this helps, Einan

RubD commented 6 months ago

Thanks @feinan-technion. I'm also tagging @iqraAmin here, who could potentially help streamline this process.

feinan-technion commented 6 months ago

Hi again,

After carefully comparing the cell IDs and cell annotations between the Giotto object and the AtoMx platform, I realized that even with my suggested changes to createGiottoPolygonsFromMask, the cell IDs still differ in some of the FOVs. Going over it, I think I found the problem. I'm attaching a newer version of the files I shared for reading CosMx data into Giotto objects.

However, my changes in the code are quite rough. There's probably a more elegant solution.

jiajic commented 6 months ago

@feinan-technion this is great, thank you! @iqraAmin and I are reworking createGiottoCosMxObject(), and these are really helpful for figuring out what needs to be done.

We are still a bit foggy on the overall structure of the AtoMx outputs. For issue 1, is the main feature needed a way to set a specific directory to import images from with createGiottoCosMxObject()?

To add to issue 4, we have recently made some changes to createGiottoPolygonsFromMask() in GiottoClass >= 0.2.2 that should ensure that the IDs are now pulled from the encoded mask values. There is also an ID_fmt param that allows either paste0 or sprintf formatting of the resulting ID values. We will check to make sure that we have parity with your edits.
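
To illustrate why the source of the IDs matters, here is a small Python analogy. It is not GiottoClass code: the function names, the `id_fmt` argument and the toy mask are all hypothetical, and it assumes that a mask value of 0 means background. It contrasts sequential renumbering (the pattern in the line quoted under issue 4) with IDs built from the values encoded in the mask.

```python
def mask_values(mask):
    """Sorted unique non-zero labels in a 2D label mask (0 assumed background)."""
    return sorted({v for row in mask for v in row if v != 0})


def sequential_ids(mask, name="cell"):
    """Renumber cells 1..n, discarding the values stored in the mask."""
    n = len(mask_values(mask))
    return [f"{name}_{i}" for i in range(1, n + 1)]


def ids_from_mask(mask, id_fmt="cell_%d"):
    """Build IDs from the mask values themselves, using printf-style formatting."""
    return [id_fmt % v for v in mask_values(mask)]


# Toy mask with three cells labelled 3, 7 and 12.
toy_mask = [
    [0, 3, 3],
    [7, 7, 0],
    [0, 0, 12],
]

renumbered = sequential_ids(toy_mask)          # cell_1, cell_2, cell_3
preserved = ids_from_mask(toy_mask)            # cell_3, cell_7, cell_12
padded = ids_from_mask(toy_mask, "c_%03d")     # c_003, c_007, c_012
```

Only the preserved IDs can be joined back to a metadata table keyed on the original mask labels; the renumbered ones match only if the labels happen to be 1..n with no gaps.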