Closed: erezalg closed this issue 4 years ago.
@erezalg no problem with asking questions here :)
First, it depends on the loss I chose. I use nn.CrossEntropyLoss(), which includes softmax, so I also had to add it in my metric. But I assume that if I use another loss without softmax I won't need it (and my network will handle that).
All metrics, including custom ones, support output_transform, which can adapt the output of your network to the input of the metric: logits, probabilities, thresholded predictions for the binary case, etc.
So, in update you can do something like

def update(self, output):
    ...
    for pred_tensor, real in zip(y_pred, y):
        pred = torch.argmax(pred_tensor, dim=0)

and set up the metric like

acc_per_class = CustomAccuracy(output_transform=lambda output: (F.softmax(output[0], dim=1), output[1]))
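For reference, here is a minimal sketch of what such a custom per-class metric could look like when written against ignite.metrics.Metric. The class name CustomAccuracy and its counter-based internals are illustrative, not the original CustomMetric from this issue; it assumes the engine output is (logits, y) for a multi-class classifier. Since the argmax of the logits equals the argmax of the softmax, the explicit softmax is optional for this kind of metric. The quantity it computes per class, "samples of class c predicted correctly / total samples of class c", is what is later in this thread identified as per-class recall.

import torch

from ignite.exceptions import NotComputableError
from ignite.metrics import Metric


class CustomAccuracy(Metric):
    # Illustrative sketch, not the original CustomMetric from this issue.

    def __init__(self, num_classes, output_transform=lambda x: x):
        self.num_classes = num_classes
        super().__init__(output_transform=output_transform)

    def reset(self):
        # per-class counters: correct predictions and number of true samples
        self._correct = torch.zeros(self.num_classes, dtype=torch.long)
        self._total = torch.zeros(self.num_classes, dtype=torch.long)

    def update(self, output):
        y_pred, y = output
        # argmax of the logits == argmax of the softmax, so no softmax is needed here
        pred_cls = torch.argmax(y_pred, dim=1)
        for c in range(self.num_classes):
            mask = y == c
            self._total[c] += mask.sum()
            self._correct[c] += (pred_cls[mask] == c).sum()

    def compute(self):
        if self._total.sum() == 0:
            raise NotComputableError("CustomAccuracy must have at least one example before it can be computed.")
        # "samples of class c predicted correctly / total samples of class c"
        return (self._correct.float() / self._total.clamp(min=1).float()).tolist()

With this sketch, the line above would become acc_per_class = CustomAccuracy(num_classes=10, output_transform=...) for a CIFAR10 model and, once attached with acc_per_class.attach(evaluator, "acc_per_class"), the per-class values would appear in engine.state.metrics.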
Second, this feels like something that should be out of the box, so I'm wondering if I'm missing something.
We have out-of-the-box per-class metrics: ignite.metrics.Precision and ignite.metrics.Recall. For ignite.metrics.Accuracy we were inspired by sklearn, where there is no such option...
Maybe another out-of-the-box solution could be to define 10 instances of ignite.metrics.Accuracy and map targets and predictions to a binary case with output_transform.
Thanks @vfdev-5! Your suggestion with the output transform did the trick!
Regarding the suggestion to create 10 instances of ignite.metrics.Accuracy: this sounds interesting, since if I understand you correctly I can just use the existing infrastructure and not add anything else. I'm not sure how to do this, though: how do I tell the metric to look only at a specific class's accuracy? Also, how do we define accuracy? Let's say my dataset has 500 images with 0 cats and the model didn't predict a cat for any image; do I have a cat accuracy of 100%?
And while we're talking, I want to ask another question :) I want another custom metric that makes use of the input image. Let's say at the end of every epoch I want to report 10 images with their predictions, so I basically need x and y_pred. I followed the example in the documentation, but adding output_transform=lambda x, y, y_pred: {"x": x, "y": y, "y_pred": y_pred} for all of the evaluator's metrics doesn't work because, as far as I understand, it also applies to my loss function, which expects only 2 tensors in the output. So I do something like:

my_metric = MyMetric(output_transform=lambda x, y, y_pred: {"x": x, "y": y, "y_pred": y_pred})
my_metric.attach(evaluator, 'mymetric')
which, from what I gather, should only transform the output for this metric. But it doesn't work; I get this error:

TypeError:

Not sure if it's because of my Python skills or my ignite skills, but I can't figure out how to solve this :) Help is appreciated!
Thanks
Your suggestion with the output transform did the trick!
@erezalg glad that it worked :)
Regarding the suggestion to create 10 instances of ignite.metrics.Accuracy: this sounds interesting, since if I understand you correctly I can just use the existing infrastructure and not add anything else. I'm not sure how to do this, though: how do I tell the metric to look only at a specific class's accuracy? Also, how do we define accuracy? Let's say my dataset has 500 images with 0 cats and the model didn't predict a cat for any image; do I have a cat accuracy of 100%?
Accuracy is defined as (TP + TN) / (TP + TN + FP + FN). Accuracy per class will be something like binary accuracy for a single class. Yes, in your example with 0 cats in 500 images and 0 cat predictions, I'd say the accuracy for predicting cats is 100%. Please keep in mind that the mean of these binary accuracies is not the overall accuracy.
Here is a code snippet for 5 classes (easy to check):
from functools import partial

import torch

from ignite.utils import to_onehot
from ignite.engine import Engine
from ignite.metrics import Accuracy

torch.manual_seed(0)

num_classes = 5
batch_size = 4

acc_per_class = {}

def ot_per_class(output, index):
    y_pred, y = output
    # probably, we have to apply torch.sigmoid if the output is logits
    y_pred_bin = (y_pred > 0.5).to(torch.long)
    y_ohe = to_onehot(y, num_classes=num_classes)
    return (y_pred_bin[:, index], y_ohe[:, index])

for i in range(num_classes):
    acc_per_class["acc_{}".format(i)] = Accuracy(output_transform=partial(ot_per_class, index=i))

def processing_fn(e, b):
    y_true = torch.randint(0, num_classes, size=(batch_size, ))
    y_preds = torch.rand(batch_size, num_classes)
    print("y_true:", y_true)
    print("y_preds:", (y_preds > 0.5).to(torch.long))
    return y_preds, y_true

engine = Engine(processing_fn)

for n, acc in acc_per_class.items():
    acc.attach(engine, name=n)

engine.run([0, ])
engine.state.metrics
> y_true: tensor([1, 4, 1, 4])
y_preds: tensor([[0, 1, 1, 0, 0],
                 [0, 0, 1, 0, 0],
                 [0, 0, 0, 1, 1],
                 [1, 1, 0, 1, 0]])
{'acc_0': 0.75, 'acc_1': 0.5, 'acc_2': 0.5, 'acc_3': 0.5, 'acc_4': 0.25}
Class 0 was wrongly predicted only once, for the last sample, and thus the accuracy for class 0 is 3.0 / 4.0.
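To make the earlier note concrete, that the mean of these per-class binary accuracies is not the overall accuracy, here is a small hand-checked example (my own toy numbers, not taken from the snippet above):

import torch

from ignite.utils import to_onehot

# 4 samples, 3 classes; argmax predictions vs. targets
y_true = torch.tensor([0, 1, 2, 2])
y_pred_cls = torch.tensor([0, 2, 2, 1])

# overall multi-class accuracy: 2 correct out of 4 -> 0.50
overall_acc = (y_pred_cls == y_true).float().mean()

# per-class binary (one-vs-rest) accuracies: tensor([1.0, 0.5, 0.5])
y_true_ohe = to_onehot(y_true, num_classes=3)
y_pred_ohe = to_onehot(y_pred_cls, num_classes=3)
per_class_acc = (y_true_ohe == y_pred_ohe).float().mean(dim=0)

print(overall_acc.item())           # 0.5
print(per_class_acc.mean().item())  # ~0.67, not equal to the overall accuracy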
And while we're talking, I want to ask another question :) I want another custom metric that makes use of the input image. Let's say at the end of every epoch I want to report 10 images with their predictions, so I basically need x and y_pred. I followed the example in the documentation, but adding output_transform=lambda x, y, y_pred: {"x": x, "y": y, "y_pred": y_pred} for all of the evaluator's metrics doesn't work because, as far as I understand, it also applies to my loss function, which expects only 2 tensors in the output. So I do something like: my_metric = MyMetric(output_transform=lambda x, y, y_pred: {"x": x, "y": y, "y_pred": y_pred}) and my_metric.attach(evaluator, 'mymetric')
Is it something like this that you would like to do: https://discuss.pytorch.org/t/how-access-inputs-in-custom-ignite-metric/91221/6 ? Please let me know if it helps; otherwise, a minimal code snippet would be helpful to understand your issue :)
Hi @vfdev-5,
Your example is spot on :) I feel it's a little less intuitive to read (at least for me) than calculating it directly. Anyway, I think we should have these examples somewhere; it'd be really nice if people could search for "multi class accuracy" and find concrete examples. Where do you think is the best place to put these? I'm not saying that your calculation of class accuracy is wrong (it obviously isn't :) ), I'm just saying that the "cats predicted correctly / total cats in dataset" metric has great value when analyzing your data! I thought maybe a blog post? Or some KB?
And for my second question, here you go:
Not the most amazingly organized code. Anyway, I borrowed from the thread you pointed me to and tried adapting it to my needs, but it doesn't work and I'm not sure why.
Thanks!!
I feel it's a little less intuitive to read (at least for me) than calculating it directly.
@erezalg do you think it would be better to have an out-of-the-box solution for that? Yes, it's true that we can add this code snippet and some details to our FAQ or somewhere else. We also recommend, in the README, searching the issues labeled as "question"...
I'm just saying that the "cats predicted correctly / total cats in dataset" metric has great value when analyzing your data!
Actually, I have the impression that it is the per-class Recall metric you'd like. And we do have precision/recall per class.
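For illustration, a small sketch (the toy tensors and names are mine, not from this issue) of getting exactly the "cats predicted correctly / total cats in dataset" number per class with the built-in Recall:

import torch

from ignite.engine import Engine
from ignite.metrics import Recall

# toy evaluation step: each batch already contains (y_pred, y)
def eval_step(engine, batch):
    return batch

evaluator = Engine(eval_step)

# average=False keeps one value per class:
# recall[c] = "samples of class c predicted as c" / "total samples of class c"
recall = Recall(average=False)
recall.attach(evaluator, "recall")

# 4 samples, 3 classes; argmax predictions are [0, 2, 2, 1], targets are [0, 1, 2, 2]
y_pred = torch.tensor([[0.9, 0.05, 0.05],
                       [0.1, 0.20, 0.70],
                       [0.0, 0.10, 0.90],
                       [0.2, 0.60, 0.20]])
y_true = torch.tensor([0, 1, 2, 2])

state = evaluator.run([(y_pred, y_true)])
print(state.metrics["recall"])  # tensor([1.0, 0.0, 0.5])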
And for my second question, here you go:
The problem is with the output of the evaluator. It should return a dictionary with the keys you'd like to use inside TBReport.
val_metrics = {
    "accuracy": Accuracy(),
    "cel": Loss(criterion, output_transform=lambda out_dict: (out_dict["y_pred"], out_dict["y"])),
    "tbrpt": TBReport(),
}

evaluator = create_supervised_evaluator(
    net, metrics=val_metrics, device=device,
    output_transform=lambda x, y, y_pred: {"x": x, "y": y, "y_pred": y_pred}
)
I wonder what you would like to do inside the TBReport metric? I hope it is not for logging to TensorBoard :)
Otherwise, please take a look here: https://labs.quansight.org/blog/2020/09/pytorch-ignite/#Common-training-handlers, after "It is possible to extend the use of the TensorBoard logger very simply by integrating user-defined functions. For example, here is how to display images and predictions during training:"
Thanks @vfdev-5! As usual, you are spot on :) Everything works better than what I was trying to do myself :D
One last question that I couldn't figure out myself: when I have multiclass recall, the name of the class in the TB graph is the label index. I assume I need some output_transform to change that to the class name, but I couldn't figure out how to do it.
Thanks A LOT!!!
@erezalg thanks for the feedback !
One last question that I couldn't figure out myself: when I have multiclass recall, the name of the class in the TB graph is the label index.
Unfortunately, it is not possible out of the box to add labels. This is something I was also thinking about adding as a feature request (if you'd like to send one, it would be helpful).
The limitation is due to the tensor nature of the metric's output: the Recall metric, for example, outputs torch.tensor([0.1, 0.2, 0.3, ..., 0.8]), and this is used directly within the OutputHandler for TensorBoard:
https://github.com/pytorch/ignite/blob/75e20420a2391ad2e11ee17df65f781a659ae6ec/ignite/contrib/handlers/tensorboard_logger.py#L290-L291
However, there is a workaround. The idea is to create N metrics that output scalars instead of a single metric that gives a tensor; that way we can label each metric as we'd like. We can use metrics arithmetic for that. Something like this should work:
num_classes = 10
cls_name_mapping = ["car", ...]

val_metrics = {}
for i in range(num_classes):
    cls_name = cls_name_mapping[i]
    val_metrics["Recall/{}".format(cls_name)] = Recall(average=False)[i].item()
@vfdev-5 That did the trick. It would've been nice if I could pass a dict (mapping label index to class name) or just a list of strings, but that works too, and it's not TOO ugly.
I will most definitely open a feature request. The way it looks, maybe it's better to pass the class list directly when instantiating the TensorboardLogger object? It seems like changing the API of the metrics is complicated, especially when it's only for TB visualization. If the OutputHandler class gets metric_names, it's not weird to also give it metric_classes.
The way it looks, maybe it's better to pass the class list directly when instantiating the TensorboardLogger object?
It is not only TB related; it concerns all the other supported experiment tracking systems too. For the moment, I do not know where it would be best to add this meta information.
I see, OK makes sense! I'll open the feature request and let's see where it leads!
Thanks again!
I would love to have the possibility of computing per-class accuracy out of the box too :) It would be nice to have the average parameter for Accuracy, as provided for Recall and Precision.
I really looked at the issues and googled the question but couldn't find anything I could use, so here goes.
I'm porting some code from vanilla PyTorch to Ignite, and I have a CIFAR10 classifier. At the end of every evaluation epoch, I want to report the per-class accuracy. In Ignite, I only found total accuracy (which I use) but not a per-class one. I wrote this CustomMetric, but I have a few problems with it:
First, it depends on the loss I chose. I use nn.CrossEntropyLoss(), which includes softmax, so I also had to add it in my metric. But I assume that if I use another loss without softmax I won't need it (and my network will handle that). Second, this feels like something that should be out of the box, so I'm wondering if I'm missing something.
Any advice?