catalyst-team / catalyst

Accelerated deep learning R&D
https://catalyst-team.com
Apache License 2.0

Custom loader stages #1432

YodaEmbedding closed 1 year ago

YodaEmbedding commented 2 years ago

🚀 Feature Request

In addition to loader ∈ [train, valid, infer], a user should be able to define a custom loader stage.

EDIT:

Looks like self.loader_key already allows for this.

https://github.com/catalyst-team/catalyst/blob/e99f90655d0efcf22559a46e928f0f98c9807ebf/catalyst/core/runner.py#L388-L392


Motivation

The purpose of loader is to switch between datasets that will be fed into the pipeline. Therefore, the natural use cases are:

  1. Running inference on multiple datasets (e.g. infer_coco, infer_kodak, infer_vimeo90k, ...) and tracking their metrics in the same way as infer.
  2. Other custom stages that involve analysis using subsets of data and data loaders.

Proposal

The per-loader flags (self.is_{train/valid/infer}_loader) need to be converted to functions or dictionaries keyed by loader name. For example:

def handle_batch(self, batch):
    # Previously: one boolean attribute per loader
    if self.is_infer_coco_loader:
        ...
    # Proposed: compare against the current loader's key
    if self.loader_key == "infer_coco":
        ...
    if self.loader_key == "infer_vimeo90k":
        ...

self.is_infer_loader and similar flags can be kept for now, and perhaps deprecated later.
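
A minimal sketch of dispatching on loader_key, written without Catalyst so it runs on its own. SketchRunner, its run method, and the seen dictionary are hypothetical names used only for illustration. The rule that any key starting with "infer" counts as an inference loader is an assumption here, not confirmed Catalyst behavior.

```python
from typing import Any, Dict, Iterable, List, Mapping, Tuple


class SketchRunner:
    """Hypothetical stand-in for a runner; not Catalyst's actual class."""

    def __init__(self) -> None:
        self.loader_key: str = ""
        # Records (tag, batch) pairs per loader key so the dispatch is observable.
        self.seen: Dict[str, List[Tuple[str, Any]]] = {}

    @property
    def is_infer_loader(self) -> bool:
        # Assumption: a key prefixed with "infer" is an inference loader.
        return self.loader_key.startswith("infer")

    def handle_batch(self, batch: Any) -> None:
        if self.loader_key == "infer_coco":
            tag = "coco-specific"   # custom handling for one named loader
        elif self.is_infer_loader:
            tag = "generic-infer"   # the legacy-style flag still works
        else:
            tag = "other"
        self.seen.setdefault(self.loader_key, []).append((tag, batch))

    def run(self, loaders: Mapping[str, Iterable[Any]]) -> None:
        # Set loader_key before iterating each loader, then feed its batches.
        for key, loader in loaders.items():
            self.loader_key = key
            for batch in loader:
                self.handle_batch(batch)


runner = SketchRunner()
runner.run({"train": [1, 2], "infer_coco": [3], "infer_kodak": [4, 5]})
print(runner.seen)
```

Keeping is_infer_loader as a derived property means existing code that checks the flag keeps working while new code compares loader_key directly.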

Type hints should convert naturally via the transformation T -> Mapping[str, T].
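
To illustrate the T -> Mapping[str, T] transformation, here is a small sketch in which a single per-stage value becomes one value per loader key. The helper lift_per_loader and the example keys are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterable, Mapping, TypeVar

T = TypeVar("T")


def lift_per_loader(
    loader_keys: Iterable[str], make: Callable[[str], T]
) -> Mapping[str, T]:
    """Build one value of type T per loader key (hypothetical helper)."""
    return {key: make(key) for key in loader_keys}


# Before: a single flag of type bool, tied to one hard-coded stage.
is_infer_loader: bool = True

# After: the same information as Mapping[str, bool], one entry per loader.
is_active: Mapping[str, bool] = lift_per_loader(
    ["train", "valid", "infer_coco"],
    lambda key: key.startswith("infer"),
)
print(is_active)
```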

Alternatives

Implementing the entire "infer" loop from scratch in on_epoch_end for each dataset/loader, or chaining dataloaders (which is also awkward). Neither approach is clean, and neither generalizes to other non-standard use cases.

Additional context

N/A

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.