Add tmi2022 GNN model - Githubissues

nadeemlab / SPT

Spatial profiling toolbox for spatial characterization of tumor immune microenvironment in multiplex images

https://oncopathtk.org

Other

21 stars, 2 forks

Add tmi2022 GNN model #268

Closed CarlinLiao closed 9 months ago

CarlinLiao commented 11 months ago

A recently released paper includes a new GNN-Transformer architecture that may work better than the CG-GNN currently included in our dataset.

CarlinLiao commented 11 months ago

This GNN-Transformer model demands different data types than the current CG-GNN model does. Should handling that difference be the domain of the transformer code or CG-GNN?

Also, I recently learned of the git submodule feature, which allows nesting different repositories inside a single repository. Since incorporating new GNN model types into SPT will usually mean depending on research repositories not published to PyPI, I think this could be a good way to manage our use of external codebases.

We could try it out first by converting cg-gnn from a manual inclusion to a git submodule to see how well it works.

sanadeem commented 11 months ago

It should be the domain of CG-GNN. Variants such as transformers or the more recent cooperative message passing modules (https://arxiv.org/abs/2310.01267) that sit on top of the vanilla GNN should be agnostic to the data type.

CarlinLiao commented 11 months ago

Sorry, I meant SPT (although you could say that since SPT has subsumed CG-GNN it's all the same). I feel like it could set a bad precedent if it's the responsibility of SPT to form its outputs to match any odd cell graph model that wants to read from it, as opposed to having the cell graph model write its own DataLoader to parse SPT output.
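To make the second option concrete, here is a minimal sketch of a model-side adapter, assuming SPT exports one fixed graph structure and each model plugin converts it itself. Every name here (`CellGraph`, `GraphAdapter`, `AdjacencyListAdapter`, `load_for_model`) is hypothetical and is not part of SPT, cg-gnn, or tmi2022.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class CellGraph:
    """Hypothetical stand-in for one cell graph as SPT might export it."""
    node_features: List[List[float]]  # one feature row per cell
    edges: List[Tuple[int, int]]      # undirected pairs of cell indices
    label: Optional[int] = None       # graph-level class label, if any


class GraphAdapter(Protocol):
    """Interface each model plugin would implement; SPT's export stays fixed."""

    def convert(self, graph: CellGraph) -> Dict: ...


class AdjacencyListAdapter:
    """Example adapter for a model that expects per-cell neighbor lists."""

    def convert(self, graph: CellGraph) -> Dict:
        neighbors: Dict[int, List[int]] = {
            i: [] for i in range(len(graph.node_features))
        }
        for a, b in graph.edges:
            # Edges are undirected, so record both directions.
            neighbors[a].append(b)
            neighbors[b].append(a)
        return {"x": graph.node_features, "neighbors": neighbors, "y": graph.label}


def load_for_model(graphs: List[CellGraph], adapter: GraphAdapter) -> List[Dict]:
    """Convert SPT-style graphs into whatever layout the model's adapter produces."""
    return [adapter.convert(g) for g in graphs]
```

With this split, adding a new model means writing a new adapter on the model side; the export format itself does not change.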

CarlinLiao commented 10 months ago

I've developed a plugin that trains a tmi2022 model to create class activation maps / importance scores, so now it's left to integrate it into SPT. That will involve:

- creating a workflow that runs tmi2022, and another test that runs out of the typical Docker scaffolding to go along with it
  - #289 is optional but very helpful for this
- updating the database to distinguish between different methods by which importance scores were created in addition to the "cohort stratifier"
- creating a new apiserver endpoint that returns importance scores created by tmi2022 (to complement the one from cg-gnn)

There are also some elements tracked outside of this repo, like:

- new UI elements to go along with the new server endpoint
- a heat map of important cells as they relate to different phenotypes

CarlinLiao commented 10 months ago

I just remembered that tmi2022 requires a GPU to run. For this reason, I won't be adding a test for it.

CarlinLiao commented 10 months ago

A refinement of the goals listed above:

- Generalize the cg-gnn workflow to support any canonical SPT graph plugin (e.g., spt-graph-transformer)
- Add richer identification information when uploading importance scores, including the plugin used, its version, and the date and time of the run
- Allow for fetching importance scores based on permutations of the above identifiers in the API

And, of course, modifying current tests to support the above refactors and adding new tests where possible.
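The identification and fetching goals above could be sketched roughly as follows. This is an in-memory illustration only, not the SPT database schema or API: `ImportanceScoreRun`, its field names, and `fetch_runs` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ImportanceScoreRun:
    """Hypothetical record for one uploaded set of importance scores."""
    plugin: str                # e.g. "cg-gnn" or "spt-graph-transformer"
    plugin_version: str
    run_at: datetime           # date and time of the run
    cohort_stratifier: str
    scores: Dict[int, float]   # cell identifier -> importance score


def fetch_runs(
    runs: List[ImportanceScoreRun],
    plugin: Optional[str] = None,
    plugin_version: Optional[str] = None,
    run_at: Optional[datetime] = None,
    cohort_stratifier: Optional[str] = None,
) -> List[ImportanceScoreRun]:
    """Return runs matching every identifier that was supplied.

    Identifiers left as None are ignored, so any combination of them
    can be used to narrow the result.
    """
    filters = {
        "plugin": plugin,
        "plugin_version": plugin_version,
        "run_at": run_at,
        "cohort_stratifier": cohort_stratifier,
    }
    active = {name: value for name, value in filters.items() if value is not None}
    return [
        run
        for run in runs
        if all(getattr(run, name) == value for name, value in active.items())
    ]
```

Calling `fetch_runs(runs, plugin="cg-gnn")` would return only the cg-gnn runs, and adding `plugin_version=...` or `run_at=...` narrows the result further. An API endpoint could accept the same optional identifiers as query parameters.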
