project-lighter / lighter

Config-based framework for organized and reproducible deep learning. MONAI Bundle + PyTorch Lightning.
https://project-lighter.github.io/lighter
MIT License

`DataLogger`/`DataMonitor` callback for logging input, pred, and target #103

Open ibro45 opened 7 months ago

ibro45 commented 7 months ago

🚀 Feature Request


🔈 Motivation

After reverting from LighterLogger to the native PyTorch Lightning loggers, we lost the ability to log input/pred/target.

🛰 Alternatives

📎 Additional context

dosubot[bot] commented 7 months ago

🤖

Sure, it's possible to add a DataLogger/DataMonitor callback to the Lighter framework to log input, prediction, and target data. This can be achieved by extending the LighterBaseWriter class and implementing the writers property and write method.

In the on_predict_batch_end method, the outputs dictionary contains the predictions and the batch variable contains the input data. You can extend this method to also log the input and target data.

Here is an example of how you can do this:

    # typing, torch, and Lightning imports needed by this snippet;
    # LighterBaseWriter and LighterSystem come from the lighter package.
    from typing import Any, Callable, Dict

    import torch
    from pytorch_lightning import Trainer

    class DataLogger(LighterBaseWriter):
        @property
        def writers(self) -> Dict[str, Callable]:
            # Map writer names to their implementations
            return {
                "csv": self.write_csv,
                # Add other writers here
            }

        def write(self, tensor: torch.Tensor, id: int) -> None:
            # Implement your writing strategy here
            pass

        def write_csv(self, tensor: torch.Tensor, id: int) -> None:
            # Implement your CSV writing strategy here
            pass

        def on_predict_batch_end(
            self, trainer: Trainer, pl_module: LighterSystem, outputs: Any, batch: Any, batch_idx: int, dataloader_idx: int = 0
        ) -> None:
            # Let the base writer handle the predictions first
            super().on_predict_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx)

            # Log the corresponding input and target data
            input_data = batch["input"]
            target_data = batch["target"]
            # Avoid shadowing the built-ins `id` and `input` in the loop
            for sample_id, input_sample, target_sample in zip(outputs["id"], input_data, target_data):
                # Log input and target data here
                pass

In this example, DataLogger is a new callback that logs the input, prediction, and target data. The writers property returns a dictionary of writer functions, and the write method defines how a tensor should be saved. The on_predict_batch_end method is extended to also log the input and target data.
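To make the writer-dispatch pattern concrete, here is a minimal, self-contained sketch of how a `writers` registry and a `write` method can work together. It has no Lighter or Lightning dependency, and all names (`BaseWriter`, `rows`) are illustrative stand-ins, not Lighter's actual API:

```python
from typing import Any, Callable, Dict


class BaseWriter:
    """Illustrative stand-in for the writer-dispatch pattern (not Lighter's API)."""

    def __init__(self, writer: str) -> None:
        self.rows = []  # collected output; stands in for a real CSV file
        # Fail early if the requested writer is not registered
        if writer not in self.writers:
            raise ValueError(f"Unknown writer: {writer}")
        self.writer = writer

    @property
    def writers(self) -> Dict[str, Callable]:
        # Map writer names to their implementations
        return {"csv": self.write_csv}

    def write(self, tensor: Any, id: int) -> None:
        # Dispatch to the writer selected at construction time
        self.writers[self.writer](tensor, id)

    def write_csv(self, tensor: Any, id: int) -> None:
        self.rows.append(f"{id},{tensor}")


writer = BaseWriter("csv")
writer.write([0.1, 0.9], id=0)
print(writer.rows)  # ['0,[0.1, 0.9]']
```

The point of the registry is that adding a new output format only requires registering one more entry in `writers`; the `write` entry point and the callback hooks stay unchanged.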

You can find more details about the LighterBaseWriter class in the Lighter codebase.

As for the PyTorch Lightning logger, it doesn't directly handle the logging of input, prediction, and target data. Instead, it relies on the callback system for these tasks. The on_predict_batch_end method is a callback hook executed at the end of each prediction batch. If IDs are not provided, it generates globally unique IDs based on the prediction count, and finally writes the predictions using the specified writer.
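The ID-generation step described above can be sketched as follows. This is a hedged approximation assuming sequential IDs derived from a running prediction count; the actual bookkeeping in LighterBaseWriter (e.g. offsets for multi-GPU runs) may differ, and `assign_ids` is a hypothetical helper, not part of Lighter:

```python
from typing import List, Optional


def assign_ids(batch_size: int, ids: Optional[List[int]], prediction_count: int) -> List[int]:
    """Return the provided IDs, or generate sequential global IDs from the running count."""
    if ids is not None:
        return ids
    # Each sample gets a unique, monotonically increasing global ID
    return list(range(prediction_count, prediction_count + batch_size))


print(assign_ids(4, None, prediction_count=0))    # [0, 1, 2, 3]
print(assign_ids(3, None, prediction_count=4))    # [4, 5, 6]
print(assign_ids(2, [7, 8], prediction_count=0))  # [7, 8]
```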

I hope this helps! If you have any other questions or need further clarification, feel free to ask.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
