SpatialHackathon / SpaceHack2023

MIT No Attribution
16 stars 3 forks source link

Optargs Spaceflow #209

Closed Jieran-S closed 9 months ago

Jieran-S commented 9 months ago

Bugs line:

https://github.com/SpatialHackathon/SpaceHack2023/blob/2e81727c99ddd5171d403915834d1474820384d2/method/SpaceFlow/method_spaceflow.py#L143-L145

Background:

In Spaceflow workflow, they include both scaled data and PCA in their preprocessing pipeline. But later stage it only uses scaled data representation as far as I can tell.

See workflow (and specifically the preprocessing and following steps) below: https://github.com/hongleir/SpaceFlow/blob/58d1ab03f1538791f710db840d36a08f396491fb/SpaceFlow/SpaceFlow.py#L110-L135

My question:

  1. Is the PCA used in any following workflow (check their source code above)? I can't find any and submitted an issue.

  2. Is the optargs file correct? (see below) https://github.com/SpatialHackathon/SpaceHack2023/blob/2e81727c99ddd5171d403915834d1474820384d2/method/SpaceFlow/spaceflow_optargs.json#L1-L7

shdam commented 9 months ago

Another question is whether to remove the resolution from the config and implement a resolution optimizer in the code instead?

Regarding using PCA, I also cannot find any signs of it being used. Try removing it and do a test run. I would say the optargs should be "matrix": "transform",, as the preprocessed counts are definitely used, which, as far as I understood, are not passed when dimred is passed.

Jieran-S commented 9 months ago

The resolution optimizer is a good point! I forgot to mention it but def agree. I put matrix as transform and tested it ytd (update_args branch). It can produce results in proper format…maybe we can compare ARI in the future to see if a PCA is necessary. :) Thanks for checking!!