Slideflow 2.2 further extends multiple-instance learning (MIL) capabilities, with the introduction of multi-magnification MIL, new models, experimental uncertainty quantification, and various other enhancements. This release also includes two new pretrained feature extractors, HistoSSL and PLIP, as well as support for the self-supervised learning framework DINOv2. Slideflow Studio has been updated with several new features and quality-of-life improvements. Finally, the documentation has been enriched with Developer Notes and new tutorials, providing deeper insights into selected topics.
Table of Contents
Multi-Magnification MIL
New Feature Extractors
a. Pretrained
b. DINOv2
New MIL Features
Slideflow Studio Updates
Documentation Expansion
Other New Features
Version Requirements
Multi-Magnification MIL
Slideflow now supports multi-modal MIL, with feature bags generated from multiple feature extractors at different magnifications. Multi-magnification MIL offers potential advantages if there are valuable histologic features at both low and high magnification.
Working with multi-magnification MIL is easy: you can use the same training API as standard MIL models. Simply provide multiple bag paths (one for each magnification) and use the new "mm_attention_mil" model.
# Configure a multimodal MIL model.
from slideflow.mil import mil_config

config = mil_config('mm_attention_mil', lr=1e-4)

# Set the bag paths for each modality.
bags_10x = '/path/to/bags_10x'
bags_40x = '/path/to/bags_40x'

# P is a slideflow.Project; train and val are slideflow Datasets.
P.train_mil(
    config=config,
    outcomes='HPV_status',
    train_dataset=train,
    val_dataset=val,
    bags=[bags_10x, bags_40x]
)
Slideflow Studio also supports multi-magnification MIL models, allowing you to visualize attention and tile-level predictions from each mode separately.
New Feature Extractors
We've introduced support for two new pretrained feature extractors, as well as the self-supervised learning framework DINOv2.
Pretrained
The new pretrained feature extractors include:
HistoSSL: a pan-cancer, pretrained ViT-based iBOT model (iBOT[ViT-B]PanCancer). Paper
PLIP: the feature encoder from a CLIP model fine-tuned on pathology images and text descriptions. Paper
Licenses and citations are available for all feature extractors through the new .license and .citation attributes.
>>> ctranspath = sf.model.build_feature_extractor('ctranspath', tile_px=299)
>>> ctranspath.license
'GNU General Public License v3.0'
>>> print(ctranspath.citation)
@article{wang2022,
title={Transformer-based Unsupervised Contrastive Learning for Histopathological Image Classification},
author={Wang, Xiyue and Yang, Sen and Zhang, Jun and Wang, Minghui and Zhang, Jing and Yang, Wei and Huang, Junzhou and Han, Xiao},
journal={Medical Image Analysis},
year={2022},
publisher={Elsevier}
}
DINOv2
As with SimCLR, Slideflow now supports generating features from a trained DINOv2 model. Use the 'dinov2' feature extractor, passing the *.pth teacher weights to the weights argument and the YAML configuration file to the cfg argument.
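A call might look like the following sketch; both paths are placeholders for your own DINOv2 training outputs:

```python
import slideflow as sf

# Build a DINOv2 feature extractor from trained teacher weights.
# The weights and cfg paths below are placeholders.
dinov2 = sf.model.build_feature_extractor(
    'dinov2',
    weights='/path/to/teacher_checkpoint.pth',
    cfg='/path/to/config.yaml'
)
```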
We've also provided a modified version of DINOv2 that allows you to train the network using Slideflow projects and datasets. See our documentation for instructions on how to train and use DINOv2.
New MIL Features
Slideflow 2.2 includes a number of updates in MIL functionality, including:
A new fully transformer-based model, "bistro.transformer" (from this paper)
New aggregation_level MIL config option. If any patients have multiple slides, use aggregation_level="patient". (#316)
MIL models can be further customized with a new argument mil_config(model_kwargs={...}). Use model_kwargs in the MIL configuration to pass through keyword arguments to the model initializer. This can be used, for example, to specify the softmax temperature and attention gating for the Attention_MIL model through the model keyword arguments temperature and attention_gate.
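For example, the following sketch (with illustrative hyperparameter values) passes both options through to the Attention_MIL initializer:

```python
from slideflow.mil import mil_config

# temperature and attention_gate are forwarded to Attention_MIL
# via model_kwargs; the values here are illustrative only.
config = mil_config(
    'attention_mil',
    lr=1e-4,
    model_kwargs={'temperature': 0.3, 'attention_gate': True}
)
```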
New experimental uncertainty quantification support for MIL models. Enable uncertainty estimation with the argument uq=True to either P.train_mil or P.evaluate_mil. The model must support a uq argument in its forward() function. At present, MIL UQ is only available for the Attention_MIL model.
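A sketch of enabling UQ at evaluation time; the model, dataset, and bag paths are placeholders:

```python
# P is a slideflow.Project; val is a validation Dataset.
# The trained model must support a `uq` argument in its forward() function.
P.evaluate_mil(
    '/path/to/trained_mil_model',
    outcomes='HPV_status',
    dataset=val,
    bags='/path/to/bags',
    uq=True
)
```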
New class initializer DatasetFeatures.from_bags(), for loading a DatasetFeatures object from previously generated feature bags. This makes it easier to perform latent space exploration and visualization using DatasetFeatures.map_activations() (see docs).
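A minimal sketch, assuming feature bags were previously generated to a single directory:

```python
import slideflow as sf

# Load a DatasetFeatures object from existing bags, then project the
# latent space with map_activations() for visualization.
dts_ftrs = sf.DatasetFeatures.from_bags('/path/to/bags')
slide_map = dts_ftrs.map_activations()
```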
New sf.mil.MILFeatures class to assist with calculating and visualizing last-layer activations from MIL models, prior to final logits. This class is analogous to the DatasetFeatures interface, but for MIL model layer activations.
Slideflow Studio Updates
The latest version of Slideflow Studio includes a number of usability improvements and new features.
ROI labels: You can now assign labels to Regions of Interest (ROIs) in Studio. These labels can be used for downstream strongly-supervised training, where labels are determined from ROIs rather than inherited from the slide label.
ALT hover: Press Left ALT while hovering over a heatmap to show the raw prediction/attention values beneath your cursor.
Progress bars: A progress bar is now displayed when generating predictions for a slide.
Tile predictions with MIL: Show tile-level predictions and attention for MIL models by right-clicking anywhere on a slide.
GAN seed tracking: Keep track of GAN seeds with easy saving and loading. Quickly scroll through seeds by pressing left/right on your keyboard.
Scale sidebar icons with font size
Improved low-memory mode for MIL models (supports devices with < 8 GB RAM)
Preserve ROIs and slide settings when changing models
Default to Otsu's thresholding instead of grayspace filtering, for improved efficiency
Documentation Expansion
Documentation at https://slideflow.dev has been further expanded, and a new Developer Notes section has been added. Developer Notes are intended to provide a deeper dive into selected topics of interest for developers or advanced users. Our first developer notes include:
TFRecords: Reading and Writing: a detailed description of our TFRecord data format, with examples of how to create, inspect, and read from these files.
Dataloaders: Sampling and Augmentation: descriptions of how to create PyTorch DataLoaders or Tensorflow tf.data.Datasets and apply custom image transformations, labeling, and sampling. Includes a detailed examination of our oversampling and undersampling methods.
Custom Feature Extractors: A look at how to construct custom feature extractors for MIL models.
Strong Supervision with Tile Labels: An example of how Region of Interest (ROI) labels can be leveraged for training strongly-supervised models.
In addition to these new Dev Notes, we've also added two tutorials (Tutorial 7: Training with Custom Augmentations and Tutorial 8: Multiple-Instance Learning), as well as expanded our Slideflow Studio docs to reflect the latest features.
Other New Features
Align two slides together using sf.WSI.align_to(). This coarse alignment is fast and effective for slides in the proper orientation, without distortion. Use the sf.WSI.align_tiles_to() method for a higher-accuracy alignment, fine-tuned at each tile location.
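For example (slide paths and tile geometry are placeholders):

```python
import slideflow as sf

# Coarse, fast alignment of one slide to another.
wsi = sf.WSI('/path/to/slide_a.svs', tile_px=299, tile_um=302)
target = sf.WSI('/path/to/slide_b.svs', tile_px=299, tile_um=302)
wsi.align_to(target)

# Higher-accuracy alignment, fine-tuned at each tile location.
wsi.align_tiles_to(target)
```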
Rotate a whole-slide image upon initial load using the new transforms argument [Libvips only]. This is particularly useful when attempting to align slides:
wsi = sf.WSI(..., transforms=[sf.slide.ROTATE_90_CLOCKWISE])
Use OpenSlide bounding boxes, if present, with the new WSI argument use_bounds [Libvips only]. If True, will use existing OpenSlide bounding boxes. If a tuple, will crop the whole-slide image to the specified bounds. This is particularly useful when aligning slides.
# Use OpenSlide bounds
wsi = sf.WSI(..., use_bounds=True)
# Manually define the bounding box
wsi = sf.WSI(..., use_bounds=(41445, 112000, 48000, 70000))
New Indexable PyTorch dataset, for easier integration of Slideflow datasets into external projects
Improved progress bars during TFRecord interleaving
Add support for protobuf version 4
Train GANs at odd image sizes with the new resize argument (see docs)
Train a GAN conditioned on tile-level labels (see docs)
Version Requirements
Version requirements are largely unchanged. Notable differences include:
The PLIP feature extractor requires the transformers package.
The TransMIL model requires the nystrom_attention package.