mlcommons / GaNDLF

A generalizable application framework for segmentation, regression, and classification using PyTorch
https://gandlf.org
Apache License 2.0

Made train batches order same as gt #870

Closed VukW closed 2 months ago

github-actions[bot] commented 2 months ago

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 95.11%. Comparing base (bb8821d) to head (f1b951a). Report is 1 commit behind head on master.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##           master     #870   +/-   ##
=======================================
  Coverage   95.11%   95.11%
=======================================
  Files         122      122
  Lines        8326     8326
=======================================
  Hits         7919     7919
  Misses        407      407
```

| [Flag](https://app.codecov.io/gh/mlcommons/GaNDLF/pull/870/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mlcommons) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/mlcommons/GaNDLF/pull/870/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mlcommons) | `95.11% <ø> (ø)` | |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=mlcommons#carryforward-flags-in-the-pull-request-comment) to find out more.


Geeks-Sid commented 2 months ago

@sarthakpati This PR is not acceptable: it does not meet the minimum requirements and gives no reasonable explanation for the change. @VukW Please explain why this PR is needed and why the following [explanation](https://datascience.stackexchange.com/questions/24511/why-should-the-data-be-shuffled-for-machine-learning-tasks) is not valid.

sarthakpati commented 2 months ago

This is related to #868. Basically, there is a bug with the training pipeline, where the data and ground truth label are getting shuffled separately. Will post a full report soon.
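
For readers following along, the failure mode described above can be sketched in a few lines. This is an illustrative toy, not GaNDLF's code: `shuffle_separately`, `shuffle_together`, and `aligned` are hypothetical names, and the data is made up so that sample `i` should always pair with label `i * 10`.

```python
import random

# Hypothetical toy data: sample i is expected to pair with label i * 10.
samples = list(range(8))
labels = [s * 10 for s in samples]


def shuffle_separately(xs, ys, seed_x, seed_y):
    """Buggy pattern: each list gets its own, independent permutation."""
    xs, ys = xs[:], ys[:]  # copy so the caller's lists are untouched
    random.Random(seed_x).shuffle(xs)
    random.Random(seed_y).shuffle(ys)
    return xs, ys


def shuffle_together(xs, ys, seed):
    """Safer pattern: draw one permutation of indices and apply it to both."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)
    return [xs[i] for i in order], [ys[i] for i in order]


def aligned(xs, ys):
    """True when every sample still sits next to its own label."""
    return all(y == x * 10 for x, y in zip(xs, ys))


if __name__ == "__main__":
    print("separate:", aligned(*shuffle_separately(samples, labels, 1, 2)))
    print("together:", aligned(*shuffle_together(samples, labels, 0)))
```

With a single shared permutation, shuffling stays enabled and every sample keeps its own label. With two independent permutations the pairs will almost always come apart, which matches the symptom of the model being trained against the wrong ground truth.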

VukW commented 2 months ago

Turning shuffle back on in the PR with fix: https://github.com/mlcommons/GaNDLF/pull/868/commits/71273cee582671ca05b39c969bf07488da6397ca