Originally posted by **gnodar01** November 30, 2023
Over the last week or so [we've been discussing](https://github.com/broadinstitute/superurop-log/discussions/7#discussioncomment-7579909) going forward with composable models. The rough idea is to build an instance segmentation model out of individual components, each pretrained on some dataset such as ImageNet or COCO. One component might be an object detector (probably single-shot, such as YOLO or SSD), which outputs bounding box coordinates (and classes, but we may not even need those). Another component might be a semantic segmenter, which takes in crops defined by the object detector's bounding boxes and semantically segments them into bg/fg.
The hypothesis is that, given pretrained components of the sort described above, we can add a relatively slim set of layers on top of each one and end up with a fine-tunable model.
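A minimal Keras sketch of what "frozen pretrained component plus slim trainable layers" could look like. Everything specific here is an assumption for illustration, not something we've settled on: MobileNetV2 just stands in for a pretrained component, the head layers and sizes are made up, and `weights=None` is used only so the snippet runs without a download (the real thing would load pretrained weights).

```python
import tensorflow as tf

# Stand-in for a pretrained component. weights=None avoids a download here;
# in practice this would be loaded with pretrained weights (e.g. "imagenet").
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None
)
backbone.trainable = False  # frozen: inference only

# Hypothetical slim head; layer choices and sizes are illustrative only.
slim_head = tf.keras.Sequential(
    [
        tf.keras.layers.Conv2D(32, kernel_size=1, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(4),  # e.g. one value per bbox coordinate
    ],
    name="slim_head",
)

inputs = tf.keras.Input(shape=(96, 96, 3))
# training=False keeps the frozen BatchNorm layers in inference mode.
features = backbone(inputs, training=False)
outputs = slim_head(features)
model = tf.keras.Model(inputs, outputs)

# Only the slim head's weights are updated by fit().
model.compile(optimizer="adam", loss="mse")
```

With this setup `model.trainable_weights` contains only the head's kernels and biases, so fine-tuning touches a small fraction of the total parameters.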
If so, we can have a composable pipeline that goes `images -> object detector (pretrained, inference only) -> slim layers (trainable) -> output bbox coords -> image crops -> semantic segmenter (pretrained, inference only) -> slim layers (trainable) -> instances`
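To make the data flow concrete, here is a rough pure-Python sketch of that composition. All names (`ComposedPipeline`, `toy_detector`, and so on) are hypothetical, and the toy components are trivial stand-ins rather than real models; the point is only how the four stages plug together.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open pixel coords
Mask = List[List[int]]           # 1 = foreground, 0 = background


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


@dataclass
class ComposedPipeline:
    detector: Callable[[Image], List[Box]]      # pretrained, inference only
    box_head: Callable[[List[Box]], List[Box]]  # slim layers, trainable
    segmenter: Callable[[Image], Image]         # pretrained, inference only
    mask_head: Callable[[Image], Mask]          # slim layers, trainable

    def __call__(self, image: Image) -> List[Tuple[Box, Mask]]:
        boxes = self.box_head(self.detector(image))
        instances = []
        for box in boxes:
            scores = self.segmenter(crop(image, box))  # per-pixel fg scores
            instances.append((box, self.mask_head(scores)))
        return instances


# --- Toy stand-ins so the sketch runs end to end (not real models) ---

def toy_detector(image: Image) -> List[Box]:
    """Return one box around all non-zero pixels, or no boxes if none."""
    ys = [y for y, row in enumerate(image) if any(v > 0 for v in row)]
    xs = [x for row in image for x, v in enumerate(row) if v > 0]
    if not ys:
        return []
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def identity_box_head(boxes: List[Box]) -> List[Box]:
    return boxes  # a trained head would refine the boxes here


def toy_segmenter(patch: Image) -> Image:
    return patch  # treat pixel intensity as the foreground score


def threshold_mask_head(scores: Image) -> Mask:
    return [[1 if v > 0.5 else 0 for v in row] for row in scores]


pipeline = ComposedPipeline(
    toy_detector, identity_box_head, toy_segmenter, threshold_mask_head
)
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.2, 0.0],
    [0.0, 0.8, 0.7, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
instances = pipeline(image)
print(instances)  # [((1, 1, 3, 3), [[1, 0], [1, 1]])]
```

Because each stage is just a callable, any of the four can be swapped out independently, which is the property we'd want to preserve when the pretrained parts become tfjs graph models and the heads become tfjs layers models.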
In the end this pipeline would be deployed to the browser.
The object detector and semantic segmenter would be pretrained in Python on a laptop or DGX, and then converted to a tfjs graph model. We have determined that this is possible with more or less arbitrary TensorFlow models using the [tfjs converter](https://github.com/tensorflow/tfjs/tree/master/tfjs-converter), or with PyTorch models using the [ultralytics Exporter class](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/engine/exporter.py#L127) (which does `pytorch -> onnx -> tensorflow saved model -> tfjs converter -> tfjs graph model`).
The slim layers would be tfjs layers models, and therefore trainable in-browser.
To validate the hypothesis first, however, we'll do it all in Python as a proof of concept. We can worry about converting to tfjs and verifying its performance in the browser afterwards.
Discussed in https://github.com/broadinstitute/superurop-log/discussions/8