Keras.io examples conversion gameplan

fchollet commented 1 year ago

We need to convert keras.io examples to work with Keras 3.

This involves two stages:

Stage 1: tf.keras backwards compatibility check

Keras 3 is intended as a drop-in replacement for tf.keras. We expect most examples to work with no code changes other than changing the imports (when using TF as the backend). So the first thing to do with a keras.io example is:

Install Keras 3 (via a git clone followed by python pip_build.py --install).
Run the example. In many cases this involves some manual steps around data downloads and dependency installation. In many cases you should also edit hyperparameters to reduce compute intensiveness so that you can debug quickly (e.g. set epochs=1 and steps_per_epoch=3, that sort of thing).
Record anything that doesn't work out of the box and file issues on the Keras 3 GitHub accordingly. We will use these issues to improve the degree of compatibility of Keras 3 going forward. Try to work around each issue you find, so that you can reach the next issue.
Open a PR to commit your converted examples in examples/keras_io/tensorflow/. PLEASE INCLUDE THE GIT DIFF (diff from the original example to the new file) in the PR description.
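
For illustration, here is a minimal sketch of what a Stage 1 smoke run might look like. The model, the random data, and the shapes are hypothetical stand-ins rather than any particular keras.io example; the only parts taken from the steps above are the import swap and the reduced epochs/steps_per_epoch.

```python
import os

# Stage 1 targets the TensorFlow backend. The backend must be chosen before
# keras is imported; setdefault keeps any backend already set in the environment.
os.environ.setdefault("KERAS_BACKEND", "tensorflow")

import numpy as np
import keras  # was: from tensorflow import keras
from keras import layers  # was: from tensorflow.keras import layers

# Tiny random stand-in for the example's real dataset (hypothetical shapes).
x_train = np.random.rand(96, 28, 28, 1).astype("float32")
y_train = np.random.randint(0, 10, size=(96,))

model = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(10, activation="softmax"),
    ]
)
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Reduced settings for quick debugging: one epoch of three steps.
history = model.fit(
    x_train, y_train, batch_size=32, epochs=1, steps_per_epoch=3, verbose=0
)
```

If this runs end to end, restore the original hyperparameters before committing the converted example.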

Note: in some cases this conversion will not be possible. There is some niche functionality that we removed from Keras 3, such as add_metric. When hitting such problems, if unable to work around the issue, simply record the problem in a GitHub issue and move on.

Stage 2: backend-agnostic conversion

Going one step further, once an example runs with the TF backend, we should seek to replace all TF APIs in the example with backend-agnostic keras.ops APIs.

In some cases this is not possible. Keras Core does not have backend-agnostic capabilities for custom train_step or custom training loops. In such cases, you should convert what is convertible, and then fork the example into 2 separate versions: a TF one and a JAX one, using APIs from each framework to implement the low-level functionality.

Keep in mind that it's ok to use TF APIs for data I/O and preprocessing. We only aim to convert modeling and training APIs -- all data preprocessing can stay as-is even if it uses TF. TF is the only feature-complete framework when it comes to data preprocessing, and generally the only viable option for many use cases.
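
A minimal sketch of that split, assuming TensorFlow is installed alongside Keras 3: TF ops live only in the tf.data pipeline, while the model itself uses only Keras APIs. The data, the preprocess function, and the model are hypothetical.

```python
import numpy as np
import tensorflow as tf  # used only for the input pipeline
import keras
from keras import layers

# Random stand-in data (hypothetical shapes).
features = np.random.rand(64, 4).astype("float32")
labels = np.random.randint(0, 2, size=(64, 1)).astype("float32")


def preprocess(x, y):
    # Plain TF ops are fine here: this runs inside tf.data, not inside the model.
    x = (x - tf.reduce_mean(x)) / (tf.math.reduce_std(x) + 1e-6)
    return x, y


dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .map(preprocess)
    .batch(16)
)

# The modeling and training code uses only Keras APIs.
model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(8, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ]
)
model.compile(optimizer="adam", loss="binary_crossentropy")
history = model.fit(dataset, epochs=1, verbose=0)
```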

Note on Keras preprocessing layers and tf.data: you can't use Keras 3 preprocessing layers in a tf.data pipeline when using a backend that is not TF. As a result, if you need to use Keras preprocessing layers in tf.data, import them from tf.keras.

Once you have converted an example to use backend-agnostic APIs and run with JAX and TF, open a PR to commit it:

in examples/keras_io/tf/ (it should replace the existing one if it is there) or examples/keras_io/jax/ if it's backend-specific.
in examples/keras_io/ if it's backend-agnostic.

Let's go!

Assignment - stage 1: conversion to Keras 3 with TF backend

CV

[x] Image classification from scratch: @fchollet
[x] Simple MNIST convnet: @fchollet
[x] Image classification via fine-tuning with EfficientNet @divyashreepathihalli
[ ] Image classification with Vision Transformer
[ ] Image Classification using BigTransfer (BiT)
[x] Classification using Attention-based Deep Multiple Instance Learning: @hertschuh
[x] Image classification with modern MLP models @divyashreepathihalli
[ ] A mobile-friendly Transformer-based model for image classification
[ ] Pneumonia Classification on TPU
[x] Compact Convolutional Transformers
[x] Image classification with ConvMixer
[x] Image classification with EANet (External Attention Transformer)
[x] Involutional neural networks: @hertschuh
[x] Image classification with Perceiver @divyashreepathihalli
[ ] Few-Shot learning with Reptile
[x] Semi-supervised image classification using contrastive pretraining with SimCLR @grasskin
[x] Image classification with Swin Transformers @grasskin
[ ] Train a Vision Transformer on small datasets
[x] A Vision Transformer without Attention
[x] Image segmentation with a U-Net-like architecture @sampathweb
[ ] Multiclass semantic segmentation using DeepLabV3+
[ ] Object Detection with RetinaNet
[ ] Keypoint Detection with Transfer Learning
[x] Object detection with Vision Transformers @grasskin
[x] OCR model for reading Captchas @grasskin
[ ] Handwriting recognition
[x] Convolutional autoencoder for image denoising: @fchollet
[x] Low-light image enhancement using MIRNet
[ ] Image Super-Resolution using an Efficient Sub-Pixel CNN
[ ] Enhanced Deep Residual Networks for single-image super-resolution
[x] Zero-DCE for low-light image enhancement
[ ] CutMix data augmentation for image classification
[ ] MixUp augmentation for image classification
[ ] RandAugment for Image Classification for Improved Robustness
[x] Image captioning @divyashreepathihalli
[ ] Natural language image search with a Dual Encoder
[ ] Visualizing what convnets learn
[x] Model interpretability with Integrated Gradients: @AakashKumarNain
[ ] Investigating Vision Transformer representations
[x] Grad-CAM class activation visualization: @fchollet
[ ] Near-duplicate image search
[ ] Semantic Image Clustering
[x] Image similarity estimation using a Siamese Network with a contrastive loss @hertschuh
[x] Image similarity estimation using a Siamese Network with a triplet loss @hertschuh
[x] Metric learning for image similarity search: @fchollet
[ ] [Need TF-Similarity] Metric learning for image similarity search using TensorFlow Similarity
[ ] Video Classification with a CNN-RNN Architecture
[x] Next-Frame Video Prediction with Convolutional LSTMs: @fchollet
[x] Video Classification with Transformers
[ ] Video Vision Transformer
[ ] [Need KerasCV] Semi-supervision and domain adaptation with AdaMatch
[ ] Barlow Twins for Contrastive SSL
[ ] Class Attention Image Transformers with LayerScale
[ ] Consistency training with supervision
[ ] Distilling Vision Transformers
[ ] FixRes: Fixing train-test resolution discrepancy
[ ] Focal Modulation: A replacement for Self-Attention
[ ] Using the Forward-Forward Algorithm for Image Classification
[ ] Gradient Centralization for Better Training Performance
[ ] Knowledge Distillation
[x] Learning to Resize in Computer Vision @divyashreepathihalli
[ ] Masked image modeling with Autoencoders
[ ] Self-supervised contrastive learning with NNCLR
[ ] Augmenting convnets with aggregated attention
[ ] Point cloud segmentation with PointNet
[ ] Semantic segmentation with SegFormer and Hugging Face Transformers
[ ] Self-supervised contrastive learning with SimSiam
[x] Supervised Contrastive Learning @divyashreepathihalli
[ ] When Recurrence meets Transformers
[x] Learning to tokenize in Vision Transformers

NLP

[x] Text classification from scratch: @fchollet
[ ] Review Classification using Active Learning: @sampathweb
[ ] Text Classification using FNet
[ ] Large-scale multi-label text classification
[x] Text classification with Transformer @nkovela1
[x] Text classification with Switch Transformer @nkovela1
[ ] Text classification using Decision Forests and pretrained embeddings
[x] Using pre-trained word embeddings @sampathweb
[x] Bidirectional LSTM on IMDB @sampathweb
[x] English-to-Spanish translation with KerasNLP @nkovela1
[ ] English-to-Spanish translation with a sequence-to-sequence Transformer
[x] Character-level recurrent sequence-to-sequence model: @fchollet
[ ] [Need TF Hub] Multimodal entailment
[ ] Named Entity Recognition using Transformers
[ ] Text Extraction with BERT
[ ] Sequence to sequence learning for performing number addition
[ ] Semantic Similarity with BERT
[ ] End-to-end Masked Language Modeling with BERT
[ ] Pretraining BERT with Hugging Face Transformers
[ ] Training a language model from scratch with 🤗 Transformers and TPUs
[ ] Question Answering with Hugging Face Transformers
[ ] Abstractive Summarization with Hugging Face Transformers

Structured data

[ ] Structured data classification with FeatureSpace
[x] Imbalanced classification: credit card fraud detection: @fchollet
[x] Structured data classification from scratch: @fchollet
[ ] Structured data learning with Wide, Deep, and Cross networks
[ ] Classification with Gated Residual and Variable Selection Networks
[ ] Classification with TensorFlow Decision Forests
[ ] Classification with Neural Decision Forests
[ ] Structured data learning with TabTransformer
[x] Collaborative Filtering for Movie Recommendations: @sampathweb
[ ] A Transformer-based recommendation system

Timeseries

[x] Timeseries classification from scratch @sampathweb
[x] Timeseries classification with a Transformer model: @sampathweb
[x] Electroencephalogram Signal Classification for action identification: @fchollet
[x] Timeseries anomaly detection using an Autoencoder: @fchollet
[x] Traffic forecasting using graph neural networks and LSTM: @fchollet
[x] Timeseries forecasting for weather prediction: @fchollet

Generative

[ ] Denoising Diffusion Implicit Models
[x] A walk through latent space with Stable Diffusion
[ ] DreamBooth
[x] Denoising Diffusion Probabilistic Models: @fchollet
[ ] Teach StableDiffusion new concepts via Textual Inversion
[ ] Fine-tuning Stable Diffusion
[x] Variational AutoEncoder: @fchollet
[x] GAN overriding model train_step
[x] WGAN-GP overriding Model.train_step: @fchollet
[ ] Conditional GAN
[ ] CycleGAN
[ ] Data-efficient GANs with Adaptive Discriminator Augmentation
[x] Deep Dream: @fchollet
[ ] GauGAN for conditional image generation
[ ] PixelCNN
[ ] Face image generation with StyleGAN @nkovela1
[ ] Vector-Quantized Variational Autoencoders
[x] Neural style transfer: @fchollet
[ ] Neural Style Transfer with AdaIN
[x] GPT2 Text Generation with KerasNLP @divyashreepathihalli
[x] GPT text generation from scratch with KerasNLP
[x] [Need KerasNLP] Text generation with a miniature GPT
[x] Character-level text generation with LSTM: @fchollet
[ ] Text Generation using FNet
[ ] Drug Molecule Generation with VAE
[ ] WGAN-GP with R-GCN for the generation of small molecular graphs
[ ] Density estimation using Real NVP

Other

[ ] Automatic Speech Recognition using CTC
[ ] MelGAN-based spectrogram inversion using feature matching
[ ] Speaker Recognition
[ ] Automatic Speech Recognition with Transformer
[ ] English speaker accent recognition using Transfer Learning
[ ] Audio Classification with Hugging Face Transformers
[ ] Actor Critic Method
[ ] Deep Deterministic Policy Gradient (DDPG)
[ ] Deep Q-Learning for Atari Breakout
[ ] Proximal Policy Optimization
[ ] Graph attention network (GAT) for node classification
[ ] Node Classification with Graph Neural Networks
[ ] Message-passing neural network (MPNN) for molecular property prediction
[ ] Graph representation learning with node2vec
[x] Simple custom layer example: Antirectifier: @fchollet
[ ] Probabilistic Bayesian Neural Networks
[ ] Knowledge distillation recipes
[ ] Creating TFRecords
[ ] Keras debugging tips
[x] Endpoint layer pattern: @fchollet
[ ] Memory-efficient embeddings for recommendation systems
[x] A Quasi-SVM in Keras: @fchollet
[ ] Estimating required sample size for model training
[ ] Evaluating and exporting scikit-learn metrics in a Keras callback
[x] Customizing the convolution operation of a Conv2D layer: @fchollet
[x] Writing Keras Models With TensorFlow NumPy: @fchollet
[x] Serving TensorFlow models with TFServing: @fchollet
[ ] How to train a Keras model on TFRecord files
[x] Trainer pattern: @fchollet

List of examples with significant incompatibilities

Trainer pattern
- Reason: trainer subclassing style has significantly changed, e.g. no more compiled_loss/compiled_metrics.

List of examples that cannot be converted at all

A Quasi-SVM in Keras.
- Reason: critically uses tf.keras.layers.experimental.RandomFourierFeatures, not included in Keras Core.

AakashKumarNain commented 1 year ago

Wow! Been contributing to these examples for a long time but never realized that we have so many high-quality examples. Amazing feat! 👏

PS: I will set up a keras_core GPU env and start working on some of the examples I contributed. I will update the issue accordingly.

soumik12345 commented 1 year ago

Raised a PR to port the example Zero-DCE for low-light image enhancement to keras_core: https://github.com/keras-team/keras-core/pull/486 Also found a possible bug while doing so: https://github.com/keras-team/keras-core/issues/485

soumik12345 commented 1 year ago

Raised a PR to port example Low-light image enhancement using MIRNet to keras-core: https://github.com/keras-team/keras-core/pull/491

anas-rz commented 1 year ago

Raised PR to port example A Vision Transformer without Attention: keras-team/keras-core#497

anas-rz commented 1 year ago

Raised a PR to port example to keras-core: Compact Convolutional Transformers keras-team/keras-core#523

pksX01 commented 1 year ago

I would like to take 'Question Answering with Hugging Face Transformers' task.

pksX01 commented 1 year ago

I am facing an issue when running the example after my changes, and I have raised an issue for it: #18572.

madhusshivakumar commented 1 year ago

I would like to work on

examples/vision/cait.py #18700 Facing issue #18699

examples/vision/metric_learning.py #18701 Facing issue #18698

madhusshivakumar commented 1 year ago

I was working on movielens recommendations and raised PR for the same #18690

ben-ad commented 1 year ago

I would like to work on "Text extraction with Bert" stage 1.

PS: the status on what has been done already is not up to date in the stage 1 list above.

sitamgithub-MSIT commented 12 months ago

"Image Classification using BigTransfer (BiT)" is the task that I would like to take on. Image Classification using BigTransfer (BiT)

Note: For now, I will focus on Stage 1: the tf.keras backward compatibility check. After that, I will attempt Stage 2 to make the example backend-agnostic.

sitamgithub-MSIT commented 11 months ago

Update: I found some issues specifically related to the model on TF Hub (now Kaggle Models). I will file a detailed issue describing the problems in a day or two.

sitamgithub-MSIT commented 11 months ago

"Train a Vision Transformer on small datasets" is the task that I would like to take on next. Train a Vision Transformer on small datasets

pksX01 commented 11 months ago

@sitamgithub-MSIT I have already been working on this for the last couple of days. I had also raised an issue that I was facing in this script, but that issue is now gone and Stage 1 is already complete. I will raise a PR soon.

Please select a different example.

innat-asj commented 11 months ago

I've just noticed that this Keras example, Semi-supervised image classification using contrastive pretraining with SimCLR (link), has changed significantly (original author @beresandras, updated by @ariG23498). The core contributed part (the SimCLR modelling) has been replaced with a built-in API. Why? Also, what is the purpose of keras_cv.training? Its status as an API looks uncertain.

class SimCLRTrainer(keras_cv.training.ContrastiveTrainer):
    def __init__(self, encoder, augmenter, projector, probe=None, **kwargs):
        super().__init__(
            encoder=encoder,
            augmenter=augmenter,
            projector=projector,
            probe=probe,
            **kwargs,
        )

simclr_model = SimCLRTrainer(...)

ariG23498 commented 11 months ago

That was part of the Keras Sprint.

CC: @martin-gorner

sitamgithub-MSIT commented 11 months ago

Working on the MixUp augmentation for image classification

innat-asj commented 11 months ago

That was part of the Keras Sprint.

CC: @martin-gorner

Looks like it (link) is reverted to its original form.

sitamgithub-MSIT commented 11 months ago

It seems the MixUp augmentation for image classification example was already converted to Keras 3. The list above has not been updated, and the Keras website still shows it as Keras 2.

CC: @fchollet

sitamgithub-MSIT commented 11 months ago

Working on the RandAugment for Image Classification for Improved Robustness

innat commented 10 months ago

Instead of replacing each Keras 2 example with its Keras 3 version, why not keep both in the code examples? tf.keras is not going away any time soon, or is it?
