Enable InternalsVisibleTo Test Projects

benrr101 commented 5 months ago

One thing that I noticed while ramping up on MDS was that the MDS projects don't have InternalsVisibleTo assembly attributes to expose internal classes/methods/fields to the test projects. Although in an ideal world, we wouldn't expose the internal objects, but if we want to be able to have very thorough code coverage via granular unit tests, without making internal interfaces public, we will need this attribute enabled. As far as I can tell, this is generally accepted standard practice. However, I'd be curious to see the discussion on this.

At a minimum, I'd recommend:

Adding InternalsVisibleTo attributes to these projects:
- netcore/Microsoft.Data.SqlClient
- netfx/Microsoft.Data.SqlClient
Make the internals visible to these projects:
- Microsoft.Data.SqlClient.ManualTesting.Tests
- Microsoft.Data.SqlClient.Tests

Please let me know your thoughts!

saurabh500 commented 5 months ago

I think it makes sense to Add InternalsVisibleTo, for this project. There will always be parts of this project, which will require unit testing, and will never be a part of the public surface area of the Library. I looked at few other libraries which follow a similar technique for unit testing/benchmarking of internals etc. NpgSql being one of them https://github.com/search?q=repo%3Anpgsql%2Fnpgsql%20InternalsVisibleTo&type=code

Historical context: When this library was part of .Net Runtime repo as SDS, the guidance was to stay away from internals visible to in favor of making our APIs/public surface area written in a way that it is unittestable. However I dont see that as a practical way of unit testing SqlClient, because there is no reason for the internals to ever be exposed as a public API, e.g. Parser/StateObject etc.

I vote for making this change, but the internals visible to, should only make the internals visible to, our test projects, not everything.

JRahnama commented 5 months ago

One consideration when adding InternalsVisibleTo is maintenance. If we decide to improve or refactor parts of the code in the future, we might encounter extensive test failures due to the tight coupling between tests and internal implementation details. This could result in significant additional development time to fix the tests.

JRahnama commented 5 months ago

Also, developers may start exposing more internal details than necessary which can lead to a less clean and maintainable codebase. It might be tempting to make more internals visible just to simplify testing.

David-Engel commented 5 months ago

I'm not opposed to this. Obviously, it's still best to thoroughly and properly test the public methods, when possible. But it is very difficult to test some of the core inner workings of the driver and if this can get us better coverage to reduce regressions, it would be a good thing.

benrr101 commented 5 months ago

One consideration when adding InternalsVisibleTo is maintenance. If we decide to improve or refactor parts of the code in the future, we might encounter extensive test failures due to the tight coupling between tests and internal implementation details. This could result in significant additional development time to fix the tests.

Personally, I'm of the opinion that this is a feature rather than a bug, but there is definitely a balance to be found. I think we would not want to start exposing the privates of a component, but testing at a granular level allows you to build confidence at each level. My testing philosophy might be a bit heavily weighted towards one side of the spectrum, tho.

Anyways, I'm not planning to write a billion tests at this point, but having the option to back fill some unit tests related components are changed would be useful.

cheenamalhotra commented 5 months ago

I agree we should be able to unit test our classes, that will help us write better code with a more modular design. I agree with @saurabh500 this was a blocked practice before but with modern tooling requirements, this should be added.

saurabh500 commented 5 months ago

@JRahnama you have some valid points, however the tradeoff is that writing good, and perhaps complicated unit tests, allows the author of the change to focus on the methods being changed, rather than the big picture which is lot more complex and may be a cognitive overload. I would argue that it is lot more time consuming and strenuous to review code looking at the big picture, rather than reviewing the unit tests in isolation.

This also helps the authors of the perf test to test their changes in isolation.

Causing a unit test break, should be lot more thought provoking to see what could be the upstream impact of the change, rather than a tax to pay because of large refactoring. Also change in the unit test allows the code reviewer to look deeper into "what really changed" rather than waiting for a regression to be surfaced in the CI, with the complex distributed nature of the testing.

Yes, we need to have a balance of what is being exposed as internals vs what should be kept private. The Code reviewers should provide the feedback that too many internals are being exposed, instead structure the code to lot more testable from internals rather than making anything private. This then makes the author of the change rethink their code and its testability.

I foresee this change bring the following:

Big changes don't need to come in at once. Anyone making larger changes can send the changes iteratively, rather than implementing the big feature, by making sure that they don't break anything that exists and add coverage for their iterative changes even if they are internal.
There is active perf work in progress, so that author of perf changes can Benchmarks of internals rather than relying on figuring out how to hit the code paths using public APIs. At the same time, say things like changing a byte[] to Span to reduce on-heap allocations, can be tested much more easily. (too late for this since the contributors have already been through this pain)

Of course this doesn't mean that we don't do our integration tests. They will continue to be as important as they are now, but unit testing allows for a tighter loop of development.

saurabh500 commented 5 months ago

@benrr101 Would you like to own this. I think making InternalsVisibleTo, along with unit testing an internal would be the right way to go. Also it would be interesting to see if you have a thought in mind on what the namespace placement for unit tests should be? Should they match the class being tested? Or anything else that you have in mind?

Edit: NVM, I already see you have assigned the issue to your name :)

roji commented 5 months ago

FWIW I generally agree with @saurabh500... I do always go as far as possible in testing/benchmarking using public APIs only - that has the advantage of testing the actual user-facing API end-to-end (providing more test value/coverage), and not being sensitive to internal refactors etc. But there are specific cases where it does make sense. If you look at Npgsql, IIRC there are very very few tests using InternalsVisibleTo to cover internal types - but they do exist. It does require a bit of discipline, since sometimes it requires more effort to implement test coverage via the public test API; but IMHO it's almost always better to do that if that's possible.

On a very related note... Many of the cases where I originally thought I'd need to resort to unit testing, were cases which relied on the database sending back exceptional/weird results that can't be triggered in a normal test; for example, how does one test the database crashing in the middle of a command? We ended up doing a small "mock PostgreSQL" implementation in our tests; this allows us to write tests where we can exactly control what protocol messages are sent back by the database, and when. Of course, doing this is an investment (it wasn't as big as I thought), but it's really valuable and definitely better than trying to cover things via unit tests.

Wraith2 commented 5 months ago

I'm not sure it's a good idea. Since netcore doesn't have strong naming if you add internals visible to anyone can simply name their assembly as the test assembly and get access to the internals. However if they do that then they're silly and they can get the code and make their own version these days.

roji commented 5 months ago

Since netcore doesn't have strong naming if you add internals visible to anyone can simply name their assembly as the test assembly and get access to the internals. However if they do that then they're silly and they can get the code and make their own version these days.

They can also already use reflection to access any internal (or even private thing). public/internal/private really isn't about security or anything like that - it's just a matter of signalling to users which APIs should be used, but if users are intent on bypassing those signals, they can always do that anyway...

dotnet / SqlClient

Enable InternalsVisibleTo Test Projects #2532