nadeemlab / SPT

Spatial profiling toolbox for spatial characterization of tumor immune microenvironment in multiplex images
https://oncopathtk.org

Add tmi2022 GNN model #268

Closed CarlinLiao closed 9 months ago

CarlinLiao commented 11 months ago

A recently released paper includes a new GNN-Transformer architecture that may work better than the CG-GNN currently included in our dataset.

CarlinLiao commented 11 months ago

This GNN-Transformer model demands different data types than the current CG-GNN model does. Should handling that difference be the domain of the transformer code or CG-GNN?

Also, I recently learned about git submodules, which allow other repositories to be nested inside a single repository. Since incorporating new GNN model types into SPT will usually mean depending on research repositories that aren't published to PyPI, I think this could be a good way to manage external codebases.

We could try it out first by converting cg-gnn from a manual inclusion to a git submodule to see how well it works.

sanadeem commented 11 months ago

It should be the domain of CG-GNN. Variants such as transformers or the more recent cooperative message passing modules (https://arxiv.org/abs/2310.01267) that sit on top of the vanilla GNN should be agnostic to the data type.

CarlinLiao commented 11 months ago

Sorry, I meant SPT (although you could say that, since SPT has subsumed CG-GNN, it's all the same). I think it would set a bad precedent if SPT were responsible for shaping its outputs to match whatever cell graph model wants to read from it, as opposed to having each cell graph model write its own DataLoader to parse SPT output.
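
A rough sketch of the second option, where the model side owns the parsing. Everything named here is hypothetical: `SPTCell`, its fields, and `to_model_inputs` are illustrative stand-ins, not the actual SPT output format or the tmi2022 input format.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SPTCell:
    """Hypothetical stand-in for one cell record as SPT might emit it."""
    cell_id: int
    x: float
    y: float
    phenotypes: Tuple[str, ...]


def to_model_inputs(
    cells: Sequence[SPTCell], channels: Sequence[str]
) -> Tuple[List[Tuple[float, float]], List[List[float]]]:
    """Model-side adapter (hypothetical): convert SPT-style records into
    cell positions and a binary feature row per cell.

    features[i][j] is 1.0 if cell i expresses channels[j], else 0.0.
    """
    positions = [(cell.x, cell.y) for cell in cells]
    features = [
        [1.0 if channel in cell.phenotypes else 0.0 for channel in channels]
        for cell in cells
    ]
    return positions, features


# Made-up example records, purely for illustration.
cells = [
    SPTCell(0, 10.0, 12.5, ("CD3", "CD8")),
    SPTCell(1, 40.0, 7.0, ("CD20",)),
]
positions, features = to_model_inputs(cells, channels=["CD3", "CD8", "CD20"])
```

With this split, SPT keeps a single output shape and each new model ships its own adapter, which in a PyTorch setting would typically live inside that model's Dataset or DataLoader code.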

CarlinLiao commented 10 months ago

I've developed a plugin that trains a tmi2022 model to create class activation maps / importance scores, so what's left is to integrate it into SPT. That will involve:

  1. creating a workflow that runs tmi2022, and another test that runs out of the typical Docker scaffolding to go along with it
    • 289 is optional but very helpful for this

  2. updating the database to distinguish between different methods by which importance scores were created in addition to the "cohort stratifier"
  3. creating a new apiserver endpoint that returns importance scores created by tmi2022 (to complement the one from cg-gnn)
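
Items 2 and 3 could look roughly like the sketch below, which uses an in-memory SQLite table as a stand-in. The table name, the column names, and `fetch_importance_scores` are assumptions for illustration only; they do not reflect the actual SPT schema or apiserver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the method column sits alongside the cohort stratifier,
# so scores from different models for the same cell can coexist.
conn.execute(
    """
    CREATE TABLE importance_score (
        cell_id INTEGER NOT NULL,
        cohort_stratifier TEXT NOT NULL,
        method TEXT NOT NULL,
        score REAL NOT NULL,
        PRIMARY KEY (cell_id, cohort_stratifier, method)
    )
    """
)

# Made-up rows, purely for illustration.
conn.executemany(
    "INSERT INTO importance_score VALUES (?, ?, ?, ?)",
    [
        (1, "default", "cg-gnn", 0.7),
        (1, "default", "tmi2022", 0.2),
        (2, "default", "tmi2022", 0.9),
    ],
)


def fetch_importance_scores(conn, method, cohort_stratifier):
    """Return {cell_id: score} for one method and one cohort stratifier.

    A per-method endpoint handler could delegate to a query like this.
    """
    cursor = conn.execute(
        "SELECT cell_id, score FROM importance_score "
        "WHERE method = ? AND cohort_stratifier = ? ORDER BY cell_id",
        (method, cohort_stratifier),
    )
    return dict(cursor.fetchall())


tmi2022_scores = fetch_importance_scores(conn, "tmi2022", "default")
cg_gnn_scores = fetch_importance_scores(conn, "cg-gnn", "default")
```

Putting the method in the key is one design choice among several; it keeps the existing cg-gnn rows valid while letting a new endpoint filter on a different method value.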

There are also some elements tracked outside of this repo, like

CarlinLiao commented 10 months ago

I just remembered that tmi2022 requires a GPU to run. For this reason, I won't be adding a test for it.
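
Should a test be wanted later on machines that do have a GPU, one common pattern is to skip it when none is detected. This is only a sketch and not something in the repo: `gpu_probably_available` is a hypothetical helper, and checking for the `nvidia-smi` binary is a heuristic that assumes an NVIDIA setup.

```python
import shutil
import unittest


def gpu_probably_available() -> bool:
    """Heuristic only: treat a visible nvidia-smi binary as a sign of a GPU."""
    return shutil.which("nvidia-smi") is not None


@unittest.skipUnless(gpu_probably_available(), "tmi2022 requires a GPU")
class TestTmi2022Workflow(unittest.TestCase):
    def test_placeholder(self):
        # Placeholder body; a real test would run the tmi2022 workflow here.
        self.assertTrue(gpu_probably_available())
```

On a machine without a GPU the whole test class is reported as skipped rather than failed.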

CarlinLiao commented 10 months ago

A refinement of the goals listed above: