Aakash-kaushik closed this 3 years ago.
So our pre-trained models won't deserialize until we shift `Sequential` from `MoreTypes` to `LayerTypes` in mlpack, so let me know what we should do in this regard.
Yes, the idea is to create a small dataset (refer to the VOC dataset on mlpack.org), download the weights, load them into the model, and check that the predictions match PyTorch's.
Hey, I was going to write tests, but isn't VOC for segmentation and detection? Maybe there is a smaller version of ImageNet we could use. I also see one in the converter repo, but I'm not sure how to load that up in torch or which classes the labels correspond to.
Are you also adding tests for Darknet and YOLO?
No, I just adapted them to the new format. I added a convenience function and edited them the same way; there is less repeated code this way.
Could you share the link to the change?
Just tagged the changes in the review.
Also @kartikdutt18, would it be fine if, rather than using a dataset, we just pass a matrix of ones to PyTorch and to our models and then compare the sum of the linear layer's output? Seems like a reliable test to me.
Sounds good. Just make sure there are tests for multiple and single batch size.
Sounds good.
Awesome, I shall add them in a bit. I believe you can start your final review after that. :)
The passing tests should actually fail: I wrote `size_t` in place of `double`, so a floating-point value was being compared to 0, yet the pre-trained model tests still show as passed. Either that is because I wrapped them in a function or because they are run by ctest, but I can rule out the first point: running them directly through the Catch executable shows them failing even when wrapped in the function, while under ctest they do not.
I didn't really get this part. If possible, could you please elaborate a bit more?
Yeah, sorry, I realise that was a bit vague.
So before the last push I used `size_t` in the function that compares `double` values, and just because of that it was comparing some number like 0.0032323 to 0. This passed when run with ctest, but not when directly executing the Catch2 executable, which is `models_test`.
But this issue was only highlighted when I directly ran them through the executable, as I said above; when I looked at the PR build info, the tests, and specifically this one, were passing. That leads me to believe we can miss tests that actually fail but are still shown as passed.
@kartikdutt18, not sure, but does this make sense?
Also, @zoq and @kartikdutt18, I believe the PR is ready for a review now.
@zoq, apart from the three things commented on above, I have applied all the suggestions. So I believe I don't have any major work left for ResNet now; can I move on to MobileNet v1 and how I am going to implement that?
Left minor comments, will do a complete review tomorrow (evening-ish).
Sure, I will make the changes together.
The pre-trained tests are counted as passed, but in reality they are failing. You can see the runtime they have: just above them, an actual ResNet test runs and takes a huge amount of time. A pre-trained version should take about the same, but it doesn't, yet ctest still reports them as passed.
I have pasted the test results below:
```
Start 4: ResNetModelTest
4/21 Test #4: ResNetModelTest .......................... Passed 81.79 sec
Start 5: ResNet101ModelTest
5/21 Test #5: ResNet101ModelTest ....................... Passed 79.93 sec
Start 6: ResNet152ModelTest
6/21 Test #6: ResNet152ModelTest ....................... Passed 149.89 sec
Start 7: PreTrainedResNetModelTest
7/21 Test #7: PreTrainedResNetModelTest ................ Passed 0.98 sec
Start 8: PreTrainedResNet101ModelTest
8/21 Test #8: PreTrainedResNet101ModelTest ............. Passed 0.57 sec
Start 9: PreTrainedResNetModel152Test
9/21 Test #9: PreTrainedResNetModel152Test ............. Passed 0.58 sec
```
And a local test run to show what's wrong:
```
UpdateCTestConfiguration from :/home/aakash/models/build/tests/DartConfiguration.tcl
Parse Config file:/home/aakash/models/build/tests/DartConfiguration.tcl
UpdateCTestConfiguration from :/home/aakash/models/build/tests/DartConfiguration.tcl
Parse Config file:/home/aakash/models/build/tests/DartConfiguration.tcl
Test project /home/aakash/models/build/tests
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 8
    Start 8: PreTrainedResNet101ModelTest

8: Test command: /home/aakash/models/build/bin/models_test "PreTrainedResNet101ModelTest"
8: Test timeout computed to be: 1500
8: Filters: PreTrainedResNet101ModelTest
8:
8: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8: models_test is a Catch v2.13.3 host application.
8: Run with -? for options
8:
8: -------------------------------------------------------------------------------
8: PreTrainedResNet101ModelTest
8: -------------------------------------------------------------------------------
8: /home/aakash/models/tests/ffn_model_tests.cpp:175
8: ...............................................................................
8:
8: /home/aakash/models/tests/ffn_model_tests.cpp:175: FAILED:
8: due to unexpected exception with message:
8: Invalid 'which' selector when deserializing boost::variant
8:
8: ===============================================================================
8: test cases: 1 | 1 failed
8: assertions: 1 | 1 failed
8:
1/1 Test #8: PreTrainedResNet101ModelTest ..... Passed 3.09 sec

The following tests passed:
  PreTrainedResNet101ModelTest

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 3.10 sec
```
The reason they fail is that mlpack is installed as a step in the build process, and that installed version doesn't have the `Sequential` type in `LayerTypes`, so deserialization fails.
Right now we have two issues: `main.cpp` in the tests dir always returns 0 (check here for reference: that one returns the number of failed tests), and the missing `Sequential` type in `LayerTypes` described above.

Then let's open a PR in mlpack to move the required layer to `LayerTypes`. That should resolve the issue.
Sure, I will create one. Also, about the requested changes: I see I have resisted a few of them, but that's just because keeping them this way makes the code simpler and leaves fewer numbers just lying around the code that don't make sense on their own.
@kartikdutt18, @zoq, which exact layers should I replace? I know we need to put the `Sequential` layer into `LayerTypes`, but which two layers should go to `MoreTypes`, so that it makes sense for mlpack as a whole and not just this PR?
You could move `FlexibleReLU` and `SpatialDropout`.
Thanks! Can you also help me identify what causes these tests to pass even when they are failing?
@kartikdutt18, I have pushed most of the suggestions and left some comments. Can you take a look?
Looks like I missed clicking the submit review button.
No worries. Btw, I pushed all the requested changes.
Sure, I will go through it in a bit.
I believe this wraps up the changes from my side; a final review would be awesome.
Kinda excited to see this getting merged. :smile: Thanks @kartikdutt18 and @zoq for bearing with me on this one. We have one more to go :p
Extension of #61 and a PR for review of the addition of the ResNet module.