openproblems-bio / openproblems

Formalizing and benchmarking open problems in single-cell genomics
MIT License
290 stars · 77 forks

Make "Majority Vote" a baseline method #710

Closed · rcannood closed this 2 days ago

rcannood commented 1 year ago

I think `is_baseline` is meant to be set to `True`, given that nobody would actually use this type of approach in real life and call it a day, and given that it lives in `baseline.py`.

scottgigante-immunai commented 1 year ago

Honestly, given this performs worse than random, I'm inclined to just delete it.

rcannood commented 1 year ago

I wouldn't! In the v2 and NeurIPS2021 repos we used single-value predictor components as negative controls, to check whether any of the metrics and other downstream components behave strangely when the predictions are all the same string, all 0's, or all 1's.
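For context, such a single-value control is just a constant predictor: every test cell gets the most frequent label seen in the training split. A minimal sketch of the idea, assuming an AnnData with illustrative `labels` and `is_train` columns rather than the exact openproblems API:

```python
import numpy as np


def majority_vote(adata):
    """Negative-control predictor: assign every test cell the most
    frequent label observed in the training split.

    `labels` and `is_train` are illustrative column names, not the
    exact fields used by the openproblems label_projection task.
    """
    train_labels = adata.obs.loc[adata.obs["is_train"].astype(bool), "labels"]
    majority_label = train_labels.value_counts().idxmax()
    adata.obs["labels_pred"] = np.repeat(majority_label, adata.n_obs)
    return adata
```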

scottgigante-immunai commented 1 year ago

Sure, that makes sense - but then it should be a different class of method. It's not a good measure of what zero should be, even if it's a useful test case. We could define a separate flag (is_test_case instead of is_baseline) but I don't think zero should be defined by this.

LuckyMD commented 1 year ago

Agree that this should be kept. It's a baseline in the truest sense of the word... not an intentionally good or bad method. I like the addition of a new "simple default" class... until we have that, it should probably not be an `is_baseline=True` method, given how we use that flag atm.

scottgigante-immunai commented 1 year ago

@LuckyMD are you arguing for this to be included in the results, or included in the scaling computation? I don't think we should scale to something that we know is worse than random.
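For readers outside the project: the "scaling computation" referred to here rescales each raw metric score against the baseline scores so results are comparable across datasets. A rough sketch of that kind of min-max rescaling, with a hypothetical helper name rather than the exact openproblems implementation:

```python
import numpy as np


def scale_to_baselines(raw_scores, baseline_scores):
    """Min-max rescale raw metric scores against baseline scores.

    Hypothetical sketch: 0 maps to the worst baseline, 1 to the best.
    If one baseline is far worse than random, it drags the zero point
    down and inflates the scaled scores of every other method.
    """
    lo = np.min(baseline_scores)
    hi = np.max(baseline_scores)
    return (np.asarray(raw_scores, dtype=float) - lo) / (hi - lo)
```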

mxposed commented 1 year ago

I don't think it's necessary to include this in the rescaling process; however, I think it should be in the results, marked as "baseline" or something similar. Essentially, this is a description of the dataset: the percentage of the most popular label. Maybe we can remove it altogether after adding this info to the dataset description.
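The statistic described here is simply the fraction of cells carrying the most frequent label, which also equals the accuracy a majority-vote predictor achieves on that dataset. A small sketch, with the label column name assumed:

```python
def majority_label_fraction(adata, label_key="labels"):
    """Fraction of cells carrying the most frequent label.

    This equals the accuracy of a majority-vote predictor on the
    dataset; `label_key` is an assumed column name.
    """
    freqs = adata.obs[label_key].value_counts(normalize=True)
    return float(freqs.iloc[0])
```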

codecov[bot] commented 1 year ago

Codecov Report

Base: 95.06% // Head: 95.06% // No change to project coverage :thumbsup:

Coverage data is based on head (9a91735) compared to base (a796e02). Patch has no changes to coverable lines.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #710   +/-   ##
=======================================
  Coverage   95.06%   95.06%
=======================================
  Files         154      154
  Lines        4072     4072
  Branches      206      206
=======================================
  Hits         3871     3871
  Misses        131      131
  Partials       70       70
```

| Flag | Coverage Δ |
|---|---|
| unittests | `95.06% <ø> (ø)` |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Impacted Files](https://codecov.io/gh/openproblems-bio/openproblems/pull/710?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio) | Coverage Δ |
|---|---|
| [...roblems/tasks/label\_projection/methods/baseline.py](https://codecov.io/gh/openproblems-bio/openproblems/pull/710/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=openproblems-bio#diff-b3BlbnByb2JsZW1zL3Rhc2tzL2xhYmVsX3Byb2plY3Rpb24vbWV0aG9kcy9iYXNlbGluZS5weQ==) | `100.00% <ø> (ø)` |

:umbrella: View full report at Codecov.

LuckyMD commented 1 year ago

> @LuckyMD are you arguing for this to be included in the results, or included in the scaling computation? I don't think we should scale to something that we know is worse than random.

Sorry, let me clarify: I agree with you that this is not a baseline as we currently define it. It should be included in the results and potentially flagged as a "simple baseline" or something like that. That being said, I'd be surprised if this is worse than a random baseline in all scenarios.

scottgigante-immunai commented 1 year ago

For accuracy, it's not always worse than random, but it sometimes is. For F1, it's always worse (by definition, I think). I think the simplest solution here is to rename it to "Majority Vote (naive)". The more complex option is to implement an additional flag that marks a method as a baseline for display but not for scaling, e.g. `is_baseline in [True, False, "display"]`.
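The point about F1 is easy to check: a constant prediction has nonzero recall only for the majority class, so macro-averaged F1 stays low even when accuracy looks decent. A toy illustration with scikit-learn, using made-up labels rather than benchmark data:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Toy labels: 70% "A", 20% "B", 10% "C"; the constant predictor says "A".
y_true = np.array(["A"] * 70 + ["B"] * 20 + ["C"] * 10)
y_pred = np.repeat("A", len(y_true))

# sklearn warns that precision is undefined for "B" and "C" (never predicted).
print(accuracy_score(y_true, y_pred))             # 0.70
print(f1_score(y_true, y_pred, average="macro"))  # ~0.27
```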

github-actions[bot] commented 2 days ago

This pull request has been automatically closed because it has not had recent activity.