zhu0619 opened this issue 2 months ago
Hello, I'm working on the EvoScale BioML hackathon so I'm under time constraints...
I have a custom metric that doesn't fit into the two-column specification of your design. It would be great to connect with you to help find a solution for our submission. Thanks.
Hi @wconnell, thanks for reaching out. Could you provide some more details on what you're trying to do? Custom metrics are a complex feature we won't be able to implement soon, but maybe I can help rethink the structure of your dataset / benchmark into a format that we already support.
hey, thanks for getting back to me @cwognum
We are uploading a new dataset called OpenPlasmid. Our evaluation looks at how well different plasmid sequence embedding methods reflect the similarity of plasmid feature annotations. So we basically take the plasmid embeddings of a new method, cluster them, and then compute NMI and ARI against the annotation-derived labels to quantify the expected similarity.
Any ideas on how this could fit into your framework?
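Roughly, the evaluation we have in mind looks like this (just a minimal sketch: the embeddings and labels below are random placeholders, and KMeans is only one possible clustering choice):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Placeholder embeddings standing in for the output of a plasmid embedding method
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))  # (n_plasmids, embedding_dim)

# Labels derived from plasmid feature annotations (placeholder values here)
labels = rng.integers(0, 5, size=100)

# Cluster the embeddings; KMeans is just one possible choice
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embeddings)

# Agreement between the clustering and the annotation-derived labels
print("NMI:", normalized_mutual_info_score(labels, clusters))
print("ARI:", adjusted_rand_score(labels, clusters))
```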
Oeh, that's interesting, but I'm afraid it's not a great fit for the current state of the platform... We've been focused on predictive modeling.
However, it sounds like you could ask people to submit the cluster annotations and then compare that against the ground truth clustering.
So, for example, your dataset may look like:

| Plasmid ID | Cluster |
|---|---|
| 0 | 0 |
| 1 | 0 |
| 2 | 1 |
| 3 | 1 |
The predictions would then look like, e.g., `[1, 1, 0, 2]`.
So:

```python
from sklearn.metrics.cluster import normalized_mutual_info_score as nmi
from sklearn.metrics import adjusted_rand_score as ari

# Arguments are (ground truth clusters, submitted cluster annotations)
nmi([0, 0, 1, 1], [1, 1, 0, 2])
# Gives: ~0.800

ari([0, 0, 1, 1], [1, 1, 0, 2])
# Gives: ~0.571
```
That does mean you don't have any control over the clustering algorithm. I understand that may not be ideal, but you could make clear in your README how people are supposed to do the clustering and ask them to attach a link to their code when they submit results (Polaris has a dedicated field for this, see here).
Ok, yeah, that's probably a sufficient workaround for now. Thanks for the suggestion!
Hey, figured I'd be back... realizing that there is no way to add new metrics for use with `SingleTaskBenchmarkSpecification`? Doesn't seem like I can extend the `Metric` class.
Hey! You can, but it's a bit of a manual process. You'll have to create a PR in this repo. See e.g. https://github.com/polaris-hub/polaris/pull/199 and https://github.com/polaris-hub/polaris/pull/48 .
This is a bit of a frustrating process, which is why we created this issue to begin with.
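To give a sense of why it's manual: the set of supported metrics is hard-coded in the client, so adding one means changing the client code and cutting a release. Here's a simplified illustration of that kind of registry pattern (not the actual Polaris source):

```python
# Simplified illustration of a hard-coded metric registry (not the actual Polaris code)
from sklearn.metrics import mean_absolute_error, roc_auc_score

# Every supported metric lives in the client codebase itself...
METRICS = {
    "mean_absolute_error": mean_absolute_error,
    "roc_auc": roc_auc_score,
    # ...so supporting something new (e.g. NMI) means adding an entry here
    # and releasing a new client version.
}

def evaluate_metric(name: str, y_true, y_pred) -> float:
    # Anything not in the registry is simply unsupported.
    return float(METRICS[name](y_true, y_pred))
```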
Is your feature request related to a problem? Please describe.
New metrics need to be implemented and hard-coded in the Polaris client whenever new modalities and benchmarks introduce them. This has become a bottleneck in the benchmark creation process.
Describe the solution you'd like
An approach is needed that allows flexible, customized metrics while maintaining the robustness of the Polaris client codebase. Balancing flexibility with stability is key to ensuring that users can easily introduce new metrics without compromising the integrity of the system.
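For example, benchmark authors could register a user-defined scoring callable when specifying the benchmark. The interface below is purely illustrative; none of these names exist in the Polaris client today:

```python
from typing import Callable, Dict

import numpy as np
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Hypothetical registration of custom metrics at benchmark-creation time
CustomMetric = Callable[[np.ndarray, np.ndarray], float]

custom_metrics: Dict[str, CustomMetric] = {
    "nmi": normalized_mutual_info_score,
    "ari": adjusted_rand_score,
}

def score(y_true: np.ndarray, y_pred: np.ndarray) -> Dict[str, float]:
    # The client would call each registered metric on (y_true, y_pred),
    # ideally with validation to keep results reproducible and comparable.
    return {name: float(fn(y_true, y_pred)) for name, fn in custom_metrics.items()}

print(score(np.array([0, 0, 1, 1]), np.array([1, 1, 0, 2])))
```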