Open oke-aditya opened 2 years ago
Thanks for opening this issue @oke-aditya
There are a few more things to consider, as loc
s don't give a full picture of a file's complexity.
Some of our tests files are divided into test classes, so long files are less of a problem in these ones. It's also worth noting that having all ops-related tests in a single test_ops.py
file makes (human) test discovery very, very easy. Splitting tests across multiple files might change that, especially considering we don't have a proper sub-folder structure in our test
dir. On top of that, most current "common utils" can be inlined in a single test file for now, but if we were to split files, we'd have to dispatch those into specific files - this too can be a bit messy without a decent folder structure.
Yes the idea is to get a proper sub folder structure in tests. Also the tests for now can be placed in sub folders but under a single file. We could have ops folder with single test file test_ops.py
(Eventually when time arises splitting it becomes easy)
tests
---- datasets
---- ops
------- test_ops.py
There are a few more things to consider, as locs don't give a full picture of a file's complexity.
Agreed. But sadly our editors and GitHub indexing etc won't agree :( I think the limit is 10k lines though, so we are pretty far.
Another problem of moving tests is breaking git blame
You actually had solved it :smile: https://github.com/pytorch/vision/blob/main/.git-blame-ignore-revs to the rescue?
I don't know if ignore-revs can be used for this? Git will have to know somehow that line a on file F corresponds to line b on file G. It can do that with entire file renamings/moving but I doubt it's possible for arbitrary content?
Maybe, this is experimental. My suggestion is let's actually try. Maybe on experimenting we will realise if this is really beneficial. Otherwise we would have a strong point to not go for this. As always there are advantages and disadvantages with anything we try :)
I agree with at @NicolasHug here. I don't think git blame
will pick up on this.
My suggestion is let's actually try. Maybe on experimenting we will realise if this is really beneficial.
Give it a try in your fork let us know here so we can have a look.
Motivation
Our tests are getting a bit lengthy and considering we are adding new features, we are actually placing multiple tests in single file. This leads to problems as the tests become harder to refactor and harder to read, manage. Standalone files for tests or grouped according to purpose are easier to anaylse and debug as well.
Our longest test files
Copy paste-able command
Many splits are possible that can make the tests smaller and easier to manage
Possibilities are many, we can have multiple files with better grouping.
Pitch
We should think of grouping tests in logically similar way together. or consider creating test folder a package. See https://github.com/pytorch/vision/pull/4436 . Earlier we have faced problems doing the same.
From discussion with Phillip offline Considering we don't Pin pytest. We can make use of https://github.com/pytest-dev/pytest/pull/9134
With the python path param, we can probably make use of
common_utils.py
file as well. So that we can test files without hurting other scenarios which were mentioned in the comments.Alternatives
This is quite tricky but very helpful as we don't want tests to hurt us later.
Reference
https://github.com/pytorch/vision/pull/4436#issuecomment-923131237
https://github.com/pytorch/vision/pull/4436#issuecomment-924005657
https://github.com/pytorch/vision/pull/4436#issuecomment-924250093
cc @NicolasHug @datumbox @pmeier