Config Refactor

This PR aims to make the config more user-friendly for the release.

Config

[x] Revise vis4d.config hierarchy.

vis4d.config
|--- common
|       |--- models
|       `--- datasets
|--- defaults
|      |--- data_connectors
|      |--- pl_trainer.py
|      `--- runtime.py
|--- util
|      |--- dataloader.py
|      |--- optimizer.py
|      `--- sweep.py
|--- config_dict.py
|--- parser.py
|--- replicator.py
`--- show_connection.py

[x] vis4d.config.config_dict
- Rename our ConfigDict to FieldConfigDict and use it only for parameter links (e.g. hyper-parameters, data, output_dir…); use ml_collections.ConfigDict everywhere else. Also, make sure it is in value_mode before calling instantiate_classes.
- Revise DelayedInstantiator → work with instantiate for different cases.
- Revise instantiate_classes → Check whether the input is a ConfigDict. If the config is a FieldConfigDict, ensure it is in value mode. Handle the case where init_args is empty but kwargs are passed, so that both are used together for instantiation.
- Revise copy_and_resolve_references → Fix typing. Copy and resolve references to a ConfigDict instead of a FieldConfigDict in value mode. Handle dictionaries.
- _instantiate_classes → Fix typing. Handle dictionaries.
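
To illustrate the instantiation cases listed above (nested dictionaries, empty init_args combined with extra kwargs), here is a minimal stdlib-only sketch. It is not the vis4d implementation: the `instantiate` helper, the plain-dict config, and the `class_path` key are assumptions made for illustration; only `init_args` and kwargs are named in this PR.

```python
import importlib
from typing import Any


def instantiate(cfg: Any, **kwargs: Any) -> Any:
    """Recursively build objects from nested dict configs (hypothetical helper)."""
    if isinstance(cfg, (list, tuple)):
        return type(cfg)(instantiate(v) for v in cfg)
    if not isinstance(cfg, dict):
        return cfg  # plain value, nothing to build
    if "class_path" not in cfg:
        # Ordinary dictionary: resolve its values but keep it a dictionary.
        return {k: instantiate(v) for k, v in cfg.items()}
    module_name, _, attr = cfg["class_path"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr)
    # init_args may be missing or empty; extra kwargs are merged in either way.
    init_args = instantiate(cfg.get("init_args") or {})
    return target(**init_args, **kwargs)


config = {
    "ratio": {
        "class_path": "fractions.Fraction",
        "init_args": {"numerator": 3, "denominator": 4},
    },
    "delta": {"class_path": "datetime.timedelta", "init_args": {}},
    "tags": ["a", "b"],
}

built = instantiate(config)  # whole tree at once
late = instantiate(config["delta"], seconds=30)  # empty init_args plus kwargs
```

Here `built["ratio"]` is a `Fraction`, `built["tags"]` stays a plain list, and `late` is a 30-second `timedelta` built from kwargs alone.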

Loss Module

[x] Rename vis4d.engine.loss to vis4d.engine.loss_module. The module maps keys from the prediction and the data to the input keys of each loss function and controls loss weighting.
[x] New LossDefinition
- loss: Loss function from vis4d.op.loss / nn.Module.
- connector: LossConnector for key mapping.
- weight: Loss weight (defaults to 1.0).
- name: Loss name (defaults to the class name of the loss function).
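
The four fields above could be combined roughly as follows. This is a simplified sketch, not the vis4d code: the dataclass, the `(source, key)` connector format, the `compute_losses` helper, and the toy loss functions are assumptions, and plain floats stand in for tensors.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class LossDefinition:
    loss: Callable[..., float]
    # Assumed connector format: loss argument -> (source, key in that source).
    connector: Dict[str, Tuple[str, str]]
    weight: float = 1.0
    name: Optional[str] = None

    def __post_init__(self) -> None:
        if self.name is None:
            # Fall back to the function name, or the class name for objects.
            self.name = getattr(self.loss, "__name__", type(self.loss).__name__)


def l1_loss(pred: List[float], target: List[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred: List[float], target: List[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def compute_losses(
    definitions: List[LossDefinition],
    prediction: Dict[str, Any],
    data: Dict[str, Any],
) -> Tuple[float, Dict[str, float]]:
    """Map keys per loss, apply the weight, and sum everything up."""
    sources = {"prediction": prediction, "data": data}
    losses = {}
    for d in definitions:
        kwargs = {arg: sources[src][key] for arg, (src, key) in d.connector.items()}
        losses[d.name] = d.weight * d.loss(**kwargs)
    return sum(losses.values()), losses


connector = {"pred": ("prediction", "depth"), "target": ("data", "depth_gt")}
definitions = [
    LossDefinition(loss=l1_loss, connector=connector),
    LossDefinition(loss=l2_loss, connector=connector, weight=0.5),
]
total, losses = compute_losses(
    definitions, {"depth": [1.0, 2.0]}, {"depth_gt": [1.0, 4.0]}
)
```

Each loss only sees the arguments its connector selects, so the same prediction and data dictionaries can feed losses with different signatures.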

Data Connector

[x] Remove StaticDataConnector, DataConnectorInfo
[x] Modularize vis4d.engine.connectors and split up DataConnector → resolves Loss Module input keys, reuses connectors for callbacks, and separates the train and test connectors (e.g. DataConnector can now be used for training and MultiSensorDataConnector for inference):
- Base
  - DataConnector: Used for Trainer (train / test data connector).
  - LossConnector: Used for Loss Module.
  - CallbackConnector: Used for Callback (train / test connector).
- Multi-Sensor
  - MultiSensorDataConnector
  - MultiSensorLossConnector
  - MultiSensorCallbackConnector
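
As a rough picture of what "key mapping" means for the base and multi-sensor variants, consider the sketch below. The two classes are simplified stand-ins that only borrow the names from the list above; their constructors, the `key_mapping` argument, and all batch keys and sensor names are assumptions rather than the vis4d API.

```python
from typing import Any, Dict, Tuple


class DataConnector:
    """Maps model argument names to keys of a flat batch dictionary."""

    def __init__(self, key_mapping: Dict[str, str]) -> None:
        self.key_mapping = key_mapping

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        return {arg: data[key] for arg, key in self.key_mapping.items()}


class MultiSensorDataConnector:
    """Same idea, but every entry names a sensor and a key of that sensor."""

    def __init__(self, key_mapping: Dict[str, Tuple[str, str]]) -> None:
        self.key_mapping = key_mapping

    def __call__(self, data: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
        return {
            arg: data[sensor][key]
            for arg, (sensor, key) in self.key_mapping.items()
        }


# Different connectors for training and inference, as the PR allows.
train_connector = DataConnector({"images": "images", "targets": "boxes2d"})
test_connector = MultiSensorDataConnector({"images": ("front_cam", "images")})

train_batch = {"images": "img", "boxes2d": "boxes", "sample_names": "0001"}
test_batch = {"front_cam": {"images": "front_img"}, "lidar": {"points3d": "pts"}}

train_inputs = train_connector(train_batch)  # keys the model does not need are dropped
test_inputs = test_connector(test_batch)
```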

Callbacks

[x] Move to vis4d.engine.
[x] Modularize Callbacks.
[x] Remove shared_callbacks, train_callbacks and test_callbacks. All of them are now plain callbacks, controlled directly by the Trainer.

[x] Add TrainerState for callback input.

current_epoch
num_epochs
global_step
train_dataloader
num_train_batches
test_dataloader
num_test_batches
metrics
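
One way to picture such a state object is a typed dictionary with the fields listed above. The field types and the `progress` helper below are guesses for illustration only; the PR lists only the field names.

```python
from typing import Any, Dict, Iterable, Optional, TypedDict


class TrainerState(TypedDict, total=False):
    # Field names are from the list above; all types are assumptions.
    current_epoch: int
    num_epochs: int
    global_step: int
    train_dataloader: Iterable[Any]
    num_train_batches: Optional[int]
    test_dataloader: Iterable[Any]
    num_test_batches: Optional[int]
    metrics: Dict[str, float]


def progress(state: TrainerState) -> str:
    """Example of a callback reading only what it needs from the state."""
    return (
        f"epoch {state['current_epoch']}/{state['num_epochs']}, "
        f"step {state['global_step']}"
    )


state: TrainerState = {
    "current_epoch": 1,
    "num_epochs": 12,
    "global_step": 40,
    "metrics": {"loss": 0.5},
}
message = progress(state)
```

Passing a single state object keeps callback signatures stable when the trainer later exposes more information.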

[x] Add get_train_callback_inputs & get_test_callback_inputs.
VisualizerCallback
- [x] Let it show / save_to_disk per batch.
- [x] Move the control of whether to show / save to visualizer modules.

Dataset

[x] Make keys_to_load truly control the data dict (COCO).
[x] Add sample_names and original_images for visualization.

Visualize module:

[x] Revise canvas, viewer architecture.
[x] Move more control from callback to visualizer modules.

Optim module

[x] Merge vis4d.optim and vis4d.engine.opt into vis4d.engine.optim.

Bug Fix

Logging callback:
- [x] Log / Plot average loss metrics over refresh_rate steps. Align PL logger to log the log_dict from logging callback.
- [x] Instead of logging the loss Tensor with its graph attached, detach it and move it to the CPU as a float first.
[x] Fix the DataConnector key / value mapping, which was wrong for train and test.
[x] Show a warning for unsuccessful transforms.
[x] Fix Visualizer color typing.
[x] Resolve the mixed usage of ConfigDict and FieldConfigDict.
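
The averaging over refresh_rate steps mentioned for the logging callback could look roughly like this. It is a sketch with a hypothetical `LossAverager` class and plain floats; with real tensors, each value would first be detached and moved to the CPU, as the fix above describes.

```python
from collections import defaultdict
from typing import Dict, Optional


class LossAverager:
    """Accumulates losses and emits their mean every refresh_rate steps."""

    def __init__(self, refresh_rate: int) -> None:
        self.refresh_rate = refresh_rate
        self._sums: Dict[str, float] = defaultdict(float)
        self._count = 0

    def update(self, losses: Dict[str, float]) -> Optional[Dict[str, float]]:
        for name, value in losses.items():
            # Store plain floats so no computation graph is kept alive.
            self._sums[name] += float(value)
        self._count += 1
        if self._count < self.refresh_rate:
            return None  # nothing to log yet
        averaged = {n: total / self._count for n, total in self._sums.items()}
        self._sums.clear()
        self._count = 0
        return averaged


averager = LossAverager(refresh_rate=2)
logged = [averager.update({"loss_cls": v}) for v in (1.0, 3.0, 2.0, 4.0)]
```

Only every second call returns a dictionary, which is what would be handed to the logger.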

The Callback base class does not need every_n_epochs or num_epochs; the trainer should decide, e.g., when to run the validation loop. Currently, the validation loop can be executed without the evaluator running afterwards. Hence I'd propose moving every_n_epochs to the trainer, as in PL. For callbacks that should only run occasionally, this could be implemented in the specific callback (e.g. the visualizer) instead of the base class.
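
A minimal sketch of this proposal, with made-up class and hook names (`on_test_epoch_end`, `test_every_n_epochs`) and zero-based epochs assumed: the trainer owns the test schedule, the evaluator runs on every test epoch, and only the visualizer applies its own interval.

```python
from typing import Any, Dict, List


class Callback:
    """Base class without any every_n_epochs logic."""

    def on_test_epoch_end(self, state: Dict[str, Any]) -> None:
        pass


class EvaluatorCallback(Callback):
    def __init__(self) -> None:
        self.calls: List[int] = []

    def on_test_epoch_end(self, state: Dict[str, Any]) -> None:
        self.calls.append(state["current_epoch"])  # always runs after testing


class VisualizerCallback(Callback):
    def __init__(self, every_n_epochs: int) -> None:
        self.every_n_epochs = every_n_epochs
        self.calls: List[int] = []

    def on_test_epoch_end(self, state: Dict[str, Any]) -> None:
        # Callback-specific frequency lives here, not in the base class.
        if (state["current_epoch"] + 1) % self.every_n_epochs == 0:
            self.calls.append(state["current_epoch"])


class Trainer:
    def __init__(
        self, num_epochs: int, test_every_n_epochs: int, callbacks: List[Callback]
    ) -> None:
        self.num_epochs = num_epochs
        self.test_every_n_epochs = test_every_n_epochs
        self.callbacks = callbacks

    def fit(self) -> None:
        for epoch in range(self.num_epochs):
            # ... one training epoch would run here ...
            if (epoch + 1) % self.test_every_n_epochs == 0:
                state = {"current_epoch": epoch, "num_epochs": self.num_epochs}
                for callback in self.callbacks:
                    callback.on_test_epoch_end(state)


evaluator = EvaluatorCallback()
visualizer = VisualizerCallback(every_n_epochs=4)
Trainer(num_epochs=8, test_every_n_epochs=2, callbacks=[evaluator, visualizer]).fit()
```

With this split, every test loop is followed by the evaluator, which avoids the case of running validation without evaluation.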

Training loop figure

┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Trainer                                                                                                                  │
│                                                                                                                          │
│ ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │
│ │ Training loop                                                                                                        │ │
│ │                                                                                                                      │ │
│ │ ┌───────────────────┐ batch ┌──────────────────────┐                                                                 │ │
│ │ │ Train Data Loader ├──┬────► Train Data Connector │                                                                 │ │
│ │ └───────────────────┘  │    └──────────┬───────────┘                                                                 │ │
│ │                        │               │ key map                                                                     │ │
│ │                        │           ┌───▼───┐                                                                         │ │
│ │                        │           │ Model │                                                                         │ │
│ │                        │           └───┬───┘   ┌───────────────────────────────────────────────────────────────────┐ │ │
│ │                        │               │       │ LossModule                                                        │ │ │
│ │                        │               │ Pred  │ ┌─────────────────────────────────────────────┐                   │ │ │
│ │                        │               │       │ │ Loss_i                                      │                   │ │ │
│ │                        │               │       │ │                                             │                   │ │ │
│ │                        │      ┌────────┴───────► │ pred─┐ ┌────────────────┐key map┌─────────┐ │                   │ │ │
│ │                        │      │                │ │      ├─►Loss Connector i├───────►loss op i├─┼─►loss_i x weight_i│ │ │
│ │         Trainer State  ├───── │ ───────────────► │ data─┘ └────────────────┘       └─────────┘ │                   │ │ │
│ │                │       │      │                │ │                                             │                   │ │ │
│ │                │       │      │                │ └─────────────────────────────────────────────┘                   │ │ │
│ │                │       │      │                │                       .                                           │ │ │
│ │                │       │      │                │                       .                                           │ │ │
│ │                │       │      │                │                       .                                           │ │ │
│ │                │       │      │                └─────────────────────────────┬─────────────────────────────────────┘ │ │
│ │   ┌────────────▼───────▼──────▼──────────┐                                   │                                       │ │
│ │   │ Callbacks                            │                                   │                                       │ │
│ │   │           ┌ -- -- -- -- -- -- -- - ┐ │                                   ▼                                       │ │
│ │   │           |Train Callback Connector| │                                 losses                                    │ │
│ │   │           └ - -- -- -- -- -- -- -- ┘ │                                   │                                       │ │
│ │   │                                      │                                ┌──▼──┐                                    │ │
│ │   └────────────────┬─────────────────────┘                                │ Sum │                                    │ │
│ │                    │ key map & key args                                   └──┬──┘                                    │ │
│ │      ┌             ▼                  ┐                                      │                                       │ │
│ │        ┌──────────┐ ┌─────────┐                                              ▼          ┌──────────┐                 │ │
│ │        │Visualizer│ │Evaluator│ ......                                   total_loss─────► Backward │                 │ │
│ │        └──────────┘ └─────────┘                                                         └──────────┘                 │ │
│ │      └                                ┘                                                                              │ │
│ │                                                                                                                      │ │
│ │                                                                                                                      │ │
│ │                                                                                                                      │ │
│ └──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ │
│                                                                                                                          │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Also, we should rename class_config to class_cfg or similar.

Should we rename the class_config in this PR? We should also create the required PRs or add it to the roadmap for the minor missing things.

Other than that LGTM :)

I think we will have an abbreviation PR to rename many things. Maybe let's do it there?

SysCV / vis4d

Config Refactor #95