tangy5 commented 1 year ago

Contrastive Language-Image Pre-training (CLIP) Driven Models and Partially Supervised Learning for Medical Image Segmentation

This issue is to discuss adding the CLIP-Driven Universal Model Features to MONAI.

Potential assignee: @tangy5

CLIP-Driven Universal Model

The implementation will bring several new feature as follows:

Universal Model: one model to detect and segment all abdominal organs and all types of tumors (Liver tumor, kidney tumor, Lung nodule, Pancreas tumor, hepatic vessel tumor, colon tumor).
Language model (CLIP) and text-driven embeddings boost medical image analysis.
Training Partial labelled datasets.
Incremental learning: Users can continue to train new segmentation classes using the current trained model without catastrophic forgetting.

Screenshot from 2023-01-03 13-16-57

[ ] Transformations (pre-processing) for partial labelled datasets: “PartialLabelTransfer”, etc
[ ] Segmentation backbone with CLIP embedding, text-driven segmentor: plug-and-play CLIP embedding and text encoder.
[ ] Tutorial for training and inference of Universal Model.
[ ] Tutorial for demonstrating partial supervised learning and incremental learning.
[ ] Model release: Bundle for Model Zoo for publishing the trained universal model to segment all types of tumours and abdominal organs.