mlcommons / GaNDLF

A generalizable application framework for segmentation, regression, and classification using PyTorch
https://gandlf.org
Apache License 2.0

Lesionwise metrics for Competitions #866

Open rachitsaluja opened 2 months ago

rachitsaluja commented 2 months ago

N/A

Proposed Changes

A simple run can be performed as follows:

 gandlf_generateCompetitionMetrics -c "FeTS-2024" -i "./trial_gandlf.csv" -o "./metric_trial.yaml"
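For context, a hedged illustration of what the input CSV might contain; the column names follow GaNDLF's existing metrics-CSV convention (SubjectID, Prediction, Target), and the paths are made up for illustration, not taken from this PR:

    SubjectID,Prediction,Target
    001,./preds/001_seg.nii.gz,./gt/001_seg.nii.gz
    002,./preds/002_seg.nii.gz,./gt/002_seg.nii.gz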

Checklist

github-actions[bot] commented 2 months ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

sarthakpati commented 2 months ago

Hi @rachitsaluja - I am going to mark this PR as a draft. Please fix the failing tests, sign the CLA, and then mark this PR as ready for review.

sarthakpati commented 2 months ago

Hi @rachitsaluja, you have indicated in your PR that the new additions have corresponding unit tests that cover all added code. However, the changed files do not indicate this:

image

Are you sure the unit tests have been updated to ensure that the new additions are invoked? Perhaps a commit push is missing somewhere? The coverage update should come up like this [ref]:

image

I am marking this PR as a draft until the tests have been added and everything is passing.

rachitsaluja commented 2 months ago

@sarthakpati, I have added my tests now, but I am not sure why codecov is not being invoked. I have not worked with codecov before, so I am unfamiliar with it.

rachitsaluja commented 2 months ago

@VukW Thanks for helping me out today. The new tests in the workflow passed as well. I am still not sure why codecov is not being invoked; https://app.codecov.io/gh/mlcommons/GaNDLF/pull/866 continues to show no information.

sarthakpati commented 2 months ago

This is extremely weird. I apologize for the confusion, @rachitsaluja.

To track this problem and have a potential solution, I have opened a discussion post with them here: https://github.com/codecov/feedback/discussions/368

Additionally, I just pushed a dummy commit on another PR (https://github.com/mlcommons/GaNDLF/pull/864/commits/8b928526e5c5cca6effdc46200c87594f6df009c) to test the code coverage.

rachitsaluja commented 2 months ago

@sarthakpati Thanks for the help! I appreciate it.

sarthakpati commented 2 months ago

Hey @rachitsaluja, I still haven't heard back from the codecov team about this. The test commit I did on the other PR worked as expected. Can I perhaps ask you to base your PR off of the new API branch instead of the current master? An additional reason behind this is that the current master API is due to be deprecated soon, and one of the major changes will be the way CLI apps are invoked. I also have a couple of additional suggestions to improve your PR, which are quoted in the reply below.

rachitsaluja commented 2 months ago

@sarthakpati Thanks for your comments and your effort to work with codecov. Please find my replies to your questions below.

The test commit I did on the other PR worked as expected. Can I perhaps ask you to base your PR off of the new API branch instead of the current master?

I can base my PR off of the new APIs branch. Could you send me the documentation for it? I don't know where all the CLI Python code went compared to the master branch, or anything else about this new branch.

Could you extend this function, instead of writing something specific to a challenge, by adding a new use case (i.e., problem_type == "segmentation_brats")? This will negate the need for a separate script, as you can invoke this use case through the current generate metrics app. This will ensure clarity for future extensions/maintenance.

I could probably do this, but I have concerns: the present generate_metrics.py needs a configuration file that contains a lot of other things that are not actually needed to compute the metrics. My script doesn't need a config file; it just needs an argument specifying which competition's metrics to run.

Also, could you put the actual metric calculations in this submodule? This will ensure that all metrics-related code stays in the same place, thereby improving code clarity.

For this, I don't think integrating it within the same script is helpful to users. These "lesion-wise" metrics are carefully curated, with particular hyper-parameters for sub-challenges and future challenges, and are not meant for normal use. The sheer difference in metrics across challenges will become apparent again this year, and I think this should remain a separate entity.

I think we should treat the competition code as a separate entity, just to be clear to the users. It makes it much easier for users who are participating in the competitions to use GaNDLF, as the framework can be confusing for some folks, and this way they don't need to learn the GaNDLF workflow to compute simple metrics.

I will be happy to discuss more.
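For readers unfamiliar with the concept, a minimal sketch of the lesion-wise idea (illustrative only; this is not the PR's implementation, and the matching rule shown is an assumption):

    # Illustrative sketch: score each ground-truth lesion (connected
    # component) separately, so one large lesion cannot hide errors on
    # small ones. Names and matching rule are assumptions, not this PR's code.
    import numpy as np
    from scipy import ndimage

    def lesionwise_dice(gt: np.ndarray, pred: np.ndarray) -> list:
        labeled_gt, num_lesions = ndimage.label(gt > 0)
        labeled_pred, _ = ndimage.label(pred > 0)
        scores = []
        for lesion_id in range(1, num_lesions + 1):
            lesion = labeled_gt == lesion_id
            # predicted components that touch this lesion
            pred_ids = np.unique(labeled_pred[lesion])
            pred_match = np.isin(labeled_pred, pred_ids[pred_ids > 0])
            intersection = np.logical_and(lesion, pred_match).sum()
            denominator = lesion.sum() + pred_match.sum()
            scores.append(2.0 * intersection / denominator)
        # a full implementation would also penalize unmatched (false-positive)
        # predicted components and apply challenge-specific hyper-parameters
        # such as dilation and minimum lesion size
        return scores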

sarthakpati commented 2 months ago

Replies are inline.

@sarthakpati Thanks for your comments and your effort to work with codecov. Please find my replies to your questions below.

The test commit I did on the other PR worked as expected. Can I perhaps ask you to base your PR off of the new API branch instead of the current master?

I can base my PR off of the new APIs branch. Could you send me the documentation for it? I don't know where all the CLI Python code went compared to the master branch, or anything else about this new branch.

Great! There is no explicit documentation for the new API branch yet. The branch itself is at https://github.com/mlcommons/GaNDLF/tree/new-apis_v0.1.0-dev. As I mentioned earlier, the major changes here are related to the CLI, and if you extend the metrics submodule as suggested, you won't need to add a new CLI app.
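For reference, rebasing an existing feature branch onto that branch would look roughly like this; the remote name upstream and the branch name lesionwise-metrics are assumptions for illustration:

    git fetch upstream
    git rebase --onto upstream/new-apis_v0.1.0-dev master lesionwise-metrics
    git push --force-with-lease origin lesionwise-metrics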

Could you extend this function, instead of writing something specific to a challenge, by adding a new use case (i.e., problem_type == "segmentation_brats")? This will negate the need for a separate script, as you can invoke this use case through the current generate metrics app. This will ensure clarity for future extensions/maintenance.

I could probably do this, but I have concerns: the present generate_metrics.py needs a configuration file that contains a lot of other things that are not actually needed to compute the metrics. My script doesn't need a config file; it just needs an argument specifying which competition's metrics to run.

I would suggest you keep in line with the interface that is already provided so that there is a uniform user experience. Otherwise, this will simply add to the maintenance responsibilities of the framework.
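To illustrate what that uniform interface could look like, a hedged sketch: the problem_type key is the suggestion from this thread, not an existing option, and the flags are assumed to mirror the existing generate metrics app:

    # config.yaml would contain a single (proposed, hypothetical) key:
    #   problem_type: segmentation_brats
    gandlf_generateMetrics -c ./config.yaml -i ./trial_gandlf.csv -o ./metric_trial.yaml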

Also, could you put the actual metric calculations in this submodule? This will ensure that all metrics-related code stays in the same place, thereby improving code clarity.

For this, I don't think integrating it within the same script is helpful to users. These "lesion-wise" metrics are carefully curated, with particular hyper-parameters for sub-challenges and future challenges, and are not meant for normal use. The sheer difference in metrics across challenges will become apparent again this year, and I think this should remain a separate entity.

I understand that these metrics are for challenges, but even so, integrating them into a larger framework means that their maintenance and continued code support falls on the shoulders of the framework maintainers. For this reason, I would still recommend that you integrate them with the current generate metrics functionality.

I think we should treat the competition code as a separate entity, just to be clear to the users. It makes it much easier for users who are participating in the competitions to use GaNDLF, as the framework can be confusing for some folks, and this way they don't need to learn the GaNDLF workflow to compute simple metrics.

I will be happy to discuss more.

GaNDLF's metrics module is fairly independent of the training/inference aspects of the framework [ref], and a user need not learn those to use the metrics. As a "compromise", might I suggest that if the config contains problem_type: "segmentation_brats", no other config keys need to be checked, and you can initialize parameters as needed under that specific use case? This will allow the code to remain fairly independent while being callable through the metrics module. Thoughts?
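A minimal sketch of that compromise (every name below is hypothetical, chosen for illustration rather than taken from GaNDLF's actual API):

    # Hypothetical dispatch inside the metrics submodule.
    def get_brats_lesionwise_defaults() -> dict:
        # challenge-specific hyper-parameters, fixed by the competition
        # organizers; the values here are placeholders
        return {"dilation_factor": 3, "lesion_volume_threshold": 50}

    def compute_lesionwise_metrics(predictions, ground_truth, params: dict) -> dict:
        raise NotImplementedError  # the lesion-wise computation would live here

    def overall_stats(predictions, ground_truth, parameters: dict) -> dict:
        if parameters.get("problem_type") == "segmentation_brats":
            # no other config keys are consulted for this use case
            return compute_lesionwise_metrics(
                predictions, ground_truth, get_brats_lesionwise_defaults()
            )
        raise NotImplementedError("other problem_type values handled as before")

This would keep the challenge code self-contained while still being invoked through the single metrics entry point.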

sarthakpati commented 3 days ago

Pinging @rachitsaluja for any update.