mlcommons / GaNDLF

A generalizable application framework for segmentation, regression, and classification using PyTorch
https://gandlf.org
Apache License 2.0

Corrected forward operations order #897

Closed · hongbozheng closed this 2 months ago

hongbozheng commented 2 months ago

Fixes #ISSUE_NUMBER


github-actions[bot] commented 2 months ago

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

sarthakpati commented 2 months ago

Thank you for your contribution, @hongbozheng!

Since @Geeks-Sid wrote the initial module, I am requesting a review from him.

⚠️ ⚠️ ⚠️

This PR will break backward compatibility with trained models for `unet` and all derived classes. Thus, it needs detailed discussion.
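To make the compatibility concern concrete, here is a hypothetical minimal repro (assuming an affine norm layer with learnable per-channel parameters, which is not confirmed by this thread): a checkpoint saved with the old layer shapes cannot be loaded into the reordered block, because the norm parameters change from `output_channels` to `input_channels`.

```python
import torch.nn as nn

# Old order: norm ran after ConvDS, so it was built with output_channels (64).
old_norm = nn.InstanceNorm2d(64, affine=True)
# New order: norm runs first, so it is built with input_channels (32).
new_norm = nn.InstanceNorm2d(32, affine=True)

try:
    # Loading an old checkpoint into the reordered model fails:
    # the per-channel weight/bias tensors no longer match in size.
    new_norm.load_state_dict(old_norm.state_dict())
except RuntimeError as err:
    print(f"checkpoint mismatch: {err}")
```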

hongbozheng commented 2 months ago

Also, if the operation order is changed to [input --> in --> lrelu --> ConvDS --> output], the initialization of the norm layer in `__init__`

self.in_0 = norm(output_channels, **norm_kwargs)
                 ^^^^^^^^^^^^^^^

should be

self.in_0 = norm(input_channels, **norm_kwargs)
                 ^^^^^^^^^^^^^^

since norm now becomes the first layer.
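A minimal PyTorch sketch of this norm-first ordering (the class name and layer hyperparameters here are illustrative assumptions, not GaNDLF's actual module): because the norm runs before the downsampling convolution, it must be built with `input_channels`.

```python
import torch
import torch.nn as nn

class DownConvBlock(nn.Module):
    """Sketch of a norm-first block: input -> norm -> lrelu -> ConvDS -> output."""

    def __init__(self, input_channels, output_channels, norm=nn.InstanceNorm2d):
        super().__init__()
        # norm runs first, so it sees `input_channels` feature maps
        self.in_0 = norm(input_channels)
        self.lrelu = nn.LeakyReLU(negative_slope=0.01)
        # strided conv performs the downsampling (ConvDS)
        self.conv_ds = nn.Conv2d(
            input_channels, output_channels, kernel_size=3, stride=2, padding=1
        )

    def forward(self, x):
        x = self.in_0(x)     # norm on the raw input channels
        x = self.lrelu(x)
        x = self.conv_ds(x)  # e.g. 32 -> 64 channels, spatial size halved
        return x

block = DownConvBlock(32, 64)
out = block(torch.randn(1, 32, 16, 16))
print(out.shape)  # torch.Size([1, 64, 8, 8])
```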

sarthakpati commented 2 months ago

Ah, I understand. So backward compatibility is going to break regardless of the direction. Anyway, I will wait for @Geeks-Sid to comment on this.

sarthakpati commented 2 months ago


Anyway, @hongbozheng, you might need to take a look at the errors coming from the CI:

Geeks-Sid commented 2 months ago

@hongbozheng, thank you for the PR. This makes sense, but it still needs to pass the tests.

hongbozheng commented 2 months ago

> Anyway, @hongbozheng, you might need to take a look at the errors coming from the CI:

I think the errors are caused by the issue I mentioned before: the norm layer initialization needs to be changed from

self.in_0 = norm(output_channels, **norm_kwargs)
                 ^^^^^^^^^^^^^^^

to

self.in_0 = norm(input_channels, **norm_kwargs)
                 ^^^^^^^^^^^^^^

as the error says `ValueError: expected input's size at dim=1 to match num_features (64), but got: 32.`, which is exactly caused by the norm layer initialization above.

The norm layer was originally the 3rd layer, normalizing after ConvDS (32 -> 64), which produces num_features=64; it is now the 1st layer, which takes num_features=32 directly from the input.

I just made the change and it passed all the CI tests.

szmazurek commented 2 months ago

Hey there, jumping in with a few notes from my side: the implementation now indeed looks correct in terms of compatibility with the order described in the comments, and the errors were caused by what @hongbozheng correctly identified, the wrong input dimension in the norm layer, since it is now applied at a different stage. Overall LGTM.

sarthakpati commented 2 months ago

Awesome, thank you for your comments, @Geeks-Sid and @szmazurek!

@hongbozheng - have you already signed the MLCommons CLA?

hongbozheng commented 2 months ago

> @hongbozheng - have you already signed the MLCommons CLA?

I did yesterday, but I received the email reply below.

> Thank you for submitting your request to contribute code to MLCommons GitHub repositories with the GitHub ID "hongbozheng." You are not currently listed in the designated employees section of the CLA for University of Illinois. We have reached out to our primary contact for University of Illinois regarding adding you to the CLA and will follow up with you once we hear back.

sarthakpati commented 2 months ago

> > @hongbozheng - have you already signed the MLCommons CLA?
>
> I did yesterday, but I received the email reply below.
>
> > Thank you for submitting your request to contribute code to MLCommons GitHub repositories with the GitHub ID "hongbozheng." You are not currently listed in the designated employees section of the CLA for University of Illinois. We have reached out to our primary contact for University of Illinois regarding adding you to the CLA and will follow up with you once we hear back.

Unless you are a paid employee of UI, you might need to ask them to update this to an "individual contributor"...

hongbozheng commented 2 months ago

> @hongbozheng - have you already signed the MLCommons CLA?

I signed the CLA and joined the MLCommons GitHub organization. I'm not sure what to do next to pass the cla-check.

sarthakpati commented 2 months ago

> > @hongbozheng - have you already signed the MLCommons CLA?
>
> I signed the CLA and joined the MLCommons GitHub organization. I'm not sure what to do next to pass the cla-check.

Thanks! I re-triggered the job and it passed. Welcome to the community! 😄