catalyst-team / catalyst

Accelerated deep learning R&D
https://catalyst-team.com
Apache License 2.0

MultilabelPrecisionRecallF1SupportCallback does not work #1391

Closed: Inkorak closed this issue 2 years ago

Inkorak commented 2 years ago

🐛 Bug Report

First, the callback has no threshold argument, even though the minimal example in the docs passes one. If we fix that and use the callback without the threshold argument, all metrics are always zero. Adding the num_classes argument changes nothing either: the metrics are still zero.

How To Reproduce

Run the minimal example from your docs for multilabel classification with MultilabelPrecisionRecallF1SupportCallback.

Code sample

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, num_classes) > 0.5).to(torch.float32)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    valid_loader="valid",
    valid_metric="accuracy",
    minimize_valid_metric=False,
    verbose=True,
    callbacks=[
        dl.BatchTransformCallback(
            transform=torch.sigmoid,
            scope="on_batch_end",
            input_key="logits",
            output_key="scores"
        ),
        dl.AUCCallback(input_key="scores", target_key="targets"),
        # extra metrics:
        dl.MultilabelAccuracyCallback(input_key="scores", target_key="targets", threshold=0.5),
        # threshold argument removed; with raw scores as input, all metrics come out as zero
        dl.MultilabelPrecisionRecallF1SupportCallback(
            input_key="scores", target_key="targets"
        ),
    ]
)

Expected behavior

I expect it to work.

Environment

Catalyst version: 21.12
PyTorch version: 1.10.0+cu111
Is debug build: No
CUDA used to build PyTorch: 11.1
TensorFlow version: 2.7.0
TensorBoard version: 2.7.0

OS: Ubuntu 18.04.5 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.12.0

Python version: 3.7
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

github-actions[bot] commented 2 years ago

Hi! Thank you for your contribution! Please re-check all issue template checklists - unfilled issues would be closed automatically. And do not forget to join our slack for collaboration.

bagxi commented 2 years ago

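Could you please add an additional BatchTransformCallback that applies a threshold of 0.5 to the scores, and pass its output to the precision/recall/F1 callback: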

import torch
from torch.utils.data import DataLoader, TensorDataset
from catalyst import dl

# sample data
num_samples, num_features, num_classes = int(1e4), int(1e1), 4
X = torch.rand(num_samples, num_features)
y = (torch.rand(num_samples, num_classes) > 0.5).to(torch.float32)

# pytorch loaders
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=32, num_workers=1)
loaders = {"train": loader, "valid": loader}

# model, criterion, optimizer, scheduler
model = torch.nn.Linear(num_features, num_classes)
criterion = torch.nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters())
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, [2])

# model training
runner = dl.SupervisedRunner(
    input_key="features", output_key="logits", target_key="targets", loss_key="loss"
)
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    loaders=loaders,
    logdir="./logdir",
    num_epochs=3,
    valid_loader="valid",
    valid_metric="accuracy",
    minimize_valid_metric=False,
    verbose=True,
    callbacks=[
        dl.BatchTransformCallback(
            transform=torch.sigmoid,
            scope="on_batch_end",
            input_key="logits",
            output_key="scores"
        ),
        dl.BatchTransformCallback(  # apply threshold of 0.5, workaround for PrRcF1 callback
            transform=lambda outputs: torch.gt(outputs, 0.5).long(),
            scope="on_batch_end",
            input_key="scores",
            output_key="preds"
        ),
        dl.AUCCallback(input_key="scores", target_key="targets"),
        dl.MultilabelAccuracyCallback(input_key="scores", target_key="targets", threshold=0.5),
        dl.MultilabelPrecisionRecallF1SupportCallback(
            input_key="preds", target_key="targets"
        ),
    ]
)
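
To see what the extra thresholding step feeds into the metric, here is a minimal standalone sketch of per-class precision, recall, F1 and support computed from hard 0/1 predictions. It uses plain Python lists so it runs without PyTorch or Catalyst; the helper names binarize and per_class_prf and the toy data are made up for illustration and are not part of the Catalyst API. That the callback needs already-binarized inputs is an inference from this workaround, not a statement about Catalyst internals.

```python
def binarize(scores, threshold=0.5):
    """Turn per-class probabilities into hard 0/1 predictions.

    Mirrors the idea of torch.gt(outputs, 0.5).long() from the workaround.
    """
    return [[1 if s > threshold else 0 for s in row] for row in scores]


def per_class_prf(preds, targets):
    """Return one (precision, recall, f1, support) tuple per class."""
    num_classes = len(targets[0])
    results = []
    for c in range(num_classes):
        pairs = [(p[c], t[c]) for p, t in zip(preds, targets)]
        tp = sum(1 for p, t in pairs if p == 1 and t == 1)
        fp = sum(1 for p, t in pairs if p == 1 and t == 0)
        fn = sum(1 for p, t in pairs if p == 0 and t == 1)
        # Guard against empty denominators by reporting 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        support = tp + fn  # number of positive targets for this class
        results.append((precision, recall, f1, support))
    return results


# Toy data (hypothetical): 4 samples, 3 classes.
scores = [
    [0.9, 0.2, 0.7],
    [0.4, 0.8, 0.6],
    [0.7, 0.1, 0.3],
    [0.2, 0.6, 0.9],
]
targets = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]

preds = binarize(scores)
metrics = per_class_prf(preds, targets)
for class_idx, (precision, recall, f1, support) in enumerate(metrics):
    print(f"class {class_idx}: P={precision:.3f} R={recall:.3f} "
          f"F1={f1:.3f} support={support}")
```

The counts only make sense once predictions are 0 or 1: a raw score such as 0.7 never equals a target of 1, so comparing scores directly with targets would find no true positives.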

Inkorak commented 2 years ago

Thanks, everything is working now. But I would like to see more up-to-date documentation.