Torchmetrics v1.2 is out now! The latest release includes 11 new metrics within a new subdomain: Clustering.
In this blog post, we briefly explain what clustering is, why it is useful, and how the newly added metrics can be used, with code samples.
Clustering - what is it?
Clustering is an unsupervised learning technique. The term unsupervised here refers to the fact that we do not have ground truth targets as we do in classification. The primary goal of clustering is to discover hidden patterns or structures within data without prior knowledge about the meaning or importance of particular features. Thus, clustering is a form of data exploration, in contrast to supervised learning, where the goal is “just” to predict which class a data point belongs to.
The key goal of clustering algorithms is to split data into clusters/sets such that data points from the same cluster are more similar to each other than to points from any other cluster. Some of the most common and widely used clustering algorithms are K-Means, Hierarchical clustering, and Gaussian Mixture Models (GMM).
An objective quality evaluation/measure is required regardless of the clustering algorithm or internal optimization criterion used. In general, we can divide all clustering metrics into two categories: extrinsic metrics and intrinsic metrics.
Extrinsic metrics
Extrinsic metrics require some ground truth labeling, even though they are used to evaluate an unsupervised method. This may seem counter-intuitive at first, since clustering by definition does not use such labels. However, most clustering algorithms are still developed on datasets with labels available, and these metrics take advantage of that.
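To make the idea concrete, here is a minimal pure-Python sketch of one classic extrinsic measure, the Rand index: the fraction of point pairs on which the predicted clustering and the ground truth agree (both put the pair together, or both keep it apart). The function name `rand_index` and the toy labels are our own illustration, not the Torchmetrics implementation.

```python
from itertools import combinations


def rand_index(preds, target):
    """Fraction of point pairs on which two labelings agree (illustrative helper)."""
    if len(preds) != len(target):
        raise ValueError("preds and target must have the same length")
    pairs = list(combinations(range(len(preds)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agreements = 0
    for i, j in pairs:
        same_in_preds = preds[i] == preds[j]
        same_in_target = target[i] == target[j]
        # A pair counts as an agreement when both labelings group it together
        # or both labelings keep it apart.
        if same_in_preds == same_in_target:
            agreements += 1
    return agreements / len(pairs)


target = [0, 0, 1, 1]
# Cluster ids are arbitrary names, so swapping 0 and 1 still gives a perfect score.
print(rand_index([1, 1, 0, 0], target))
# A clustering that cuts across the true groups scores much lower.
print(rand_index([0, 1, 0, 1], target))
```

Note that the score only depends on which points are grouped together, never on the cluster ids themselves, which is why extrinsic metrics can compare cluster assignments against class labels at all.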
Intrinsic metrics
In contrast, intrinsic metrics do not need any ground truth information. These metrics estimate intra-cluster consistency (the cohesion of all points assigned to a single cluster) relative to how well that cluster is set apart from the others (separation). This is often done by comparing distances in the embedding space.
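As a rough illustration of cohesion versus separation, the sketch below computes the Dunn index in its textbook form: the smallest distance between points of different clusters divided by the largest distance between points of the same cluster. The helper name `dunn_index` and the toy points are assumptions made for this example; library implementations (including the one in Torchmetrics) may define the two distances differently, for instance via cluster centroids.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Textbook Dunn index: min inter-cluster distance / max cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Cohesion: the largest distance between two points of the same cluster.
    max_diameter = max(
        (dist(p, q) for members in clusters.values() for p, q in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster is a single point; the ratio is undefined")

    # Separation: the smallest distance between points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(dunn_index(points, [0, 0, 1, 1]))  # tight, well separated clusters
print(dunn_index(points, [0, 1, 0, 1]))  # stretched, interleaved clusters
```

No labels are involved: the score is computed purely from the points and the cluster assignment, and higher values indicate compact clusters that sit far apart.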
Update to Mean Average Precision
MeanAveragePrecision, the most widely used metric for object detection in computer vision, now supports two new arguments: average and backend.
The average argument controls averaging over multiple classes. By the metric's core definition, the default is macro averaging, where the metric is calculated for each class separately and the per-class values are then averaged. This will continue to be the default in Torchmetrics, but now we also support the setting average="micro". Under this setting, every object is essentially considered to belong to the same class, and the returned value is therefore calculated over all objects at once.
The second argument, backend, is important because it determines which computational backend is used for the internal computations. Since MeanAveragePrecision is not a simple metric to compute, and we value the correctness of our metric, we rely on a third-party library to do the internal computations. By default, we rely on users having the official pycocotools installed, but with the new argument, we will also be supporting other backends.
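The difference between the two averaging modes is easiest to see on a simpler quantity. The sketch below uses plain precision rather than average precision, with made-up true-positive and false-positive counts per class; the function names and numbers are purely illustrative and are not part of the Torchmetrics API.

```python
def macro_precision(counts):
    """Compute precision per class, then average the per-class values."""
    per_class = [tp / (tp + fp) for tp, fp in counts.values()]
    return sum(per_class) / len(per_class)


def micro_precision(counts):
    """Pool all detections as if they were one class, then compute precision once."""
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    return total_tp / (total_tp + total_fp)


# Hypothetical (true positives, false positives) per class.
counts = {"car": (90, 10), "bicycle": (1, 3)}

print(macro_precision(counts))  # every class weighs the same
print(micro_precision(counts))  # every detection weighs the same
```

With these counts the rare class pulls the macro value down noticeably, while the micro value is dominated by the frequent class. Which behavior is preferable depends on whether rare classes should count as much as common ones.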
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps torchmetrics from 0.11.4 to 1.2.0.
Release notes
Sourced from torchmetrics's releases.
... (truncated)
Changelog
Sourced from torchmetrics's changelog.
... (truncated)
Commits
- 0d7d6c7 releasing 1.2.0
- e2150c5 docs: switch to new Lit template (#2097)
- e265e1b docs: fix MD links (#2098)
- 809bb5f precommit: unify formatting with prettier (#2096)
- 05a36de adding sample slash command - rebase (#2092)
- 25bf259 refactor: unify on `num` instead of `nb` or `n` (#2090)
- 6bf705e Add section on wrappers to overview pages (#2085)
- 5699241 ci: minor rename labeler's config file
- 462e4fd Minor fixes before next release (#2086)
- fe5f46f build(deps): update matplotlib requirement from <=3.7.2,>=3.2.0 to >=3.2.0,<3...