bazelbuild / rules_scala

Scala rules for Bazel
Apache License 2.0

Stricter unused dependency checking for plus one deps mode #867

Open Jamie5 opened 4 years ago

Jamie5 commented 4 years ago

When using plus one deps mode, many unused deps do get marked. However, some deps that are not marked could still be left out, because they are already the dep of another dep (so plus one puts them on the classpath anyway). It also makes sense to leave them out, because they never appear in the source code of the package being compiled; the only reason the dep is needed is to make scalac happy. Would it be possible to have an unused deps mode for plus one deps mode where a dep is marked as unused unless it is explicitly referenced in the source code? (Not sure if this would end up with a number of false positives - not that familiar with scalac's needs)

This is now in progress, the below summarizes the status

Known potential issues

Things to do

ittaiz commented 4 years ago

Thanks for the issue! TLDR: Unfortunately not.

Longer: The current unused deps mode uses a heuristic which works sometimes (in Wix’s experience less, in Stripe’s experience more). It’s not source based but rather uses some internal accounting scalac has (which has both false positives and false negatives).

Can something be done?

  1. Develop a different unused deps mechanism which is source based as part of the build- I’m not sure how complicated this is to be honest. Java Bazel people have something in the Bazel repo which we looked at in the past but haven’t been able to get anywhere with. My initial plan with the +1 was to have something like this (as a separate action which will fail the build but doesn’t block the compilation graph).

  2. Have an IntelliJ based unused deps source based mechanism- since IntelliJ does much of the heavy lifting AST-wise, it seems easier to have some sort of IJ Inspection that looks at the entire target and its sources and removes deps. I think this won’t be cheap perf wise but is an interesting approach.

  3. Drop and then recreate pattern- with buildozer you can drop all deps in one command and if you then have a tool which automatically adds dependencies you can use this pattern every X times to clean up. We (Wix) have one such tool which we plan on open sourcing (tentatively Q1) and are working on another one inside of our IJ plugin.

Would love to hear your thoughts

Jamie5 commented 4 years ago

Re 1, I don't think walking over the tree and finding used types would be very hard - the unused deps mechanism already does something similar (though they walk over different things). The tricky part may be to understand, when you see a type, should it be a direct dep? Though that might not be tricky at all and actually be really straightforward.

For 3, that seems useful, though 1 feels preferable. Also, wouldn't this potentially remove direct dependencies that you also happen to depend on in a +1 situation, which would violate the desire of strict-deps?

ittaiz commented 4 years ago

Re 3- you nailed it on both problems this brings. Our existing solution is indeed not a very good one but better than letting build files rot. The new mechanism in the plugin won’t solve the need for user activation but will be source based so should solve the +1 issue.

Re 1- how do you suggest to do this? Our main goal was to have a small to zero overhead for unused deps mechanism. Performing this iteration on the scalac action (as a plugin for example) was deemed costly by people more familiar with scalac than me. Performing this in a separate action has a big cost resources wise even though you might not increase the effective build time. This is the main reason we went with the current heuristic which capitalized on rough information scalac already collects. Another thought was to work with the zinc people to extract their analysis module to be more decoupled and then depend on that but it required bandwidth we didn’t have.

Jamie5 commented 4 years ago

Yes, agreed that 3 is better than nothing, even with the problem of non-strict deps.

For 1, IME iterating over the tree as a plugin was not overly expensive (though non-zero). But certainly I'm not that experienced with scalac, nor the cost of the specific operations needed for this. From my understanding the existing strict-deps mechanism already iterates over the full AST and pays some reasonable amount of cost for it.

Actually is there a reason the existing strict-deps mechanism can't give the list of directly-referenced deps (from my understanding, it should be able to do that) which we can use to find the unneeded ones?

ittaiz commented 4 years ago

Have you read the code of the existing strict deps mechanism? It doesn’t do any iteration over the AST but rather just takes the list of jars scalac needed to load.

If you can do it via a plugin then maybe do it externally and measure the cost?

Jamie5 commented 4 years ago

Ah I see, I misread the code. It does seem to use the native java strict_deps but only for java files. Which does appear to iterate over the AST. Now I see that the strict_deps for scala files does not do that.

Just to be sure, is https://github.com/bazelbuild/rules_scala/blob/master/third_party/dependency_analyzer/src/main/io/bazel/rulesscala/dependencyanalyzer/DependencyAnalyzer.scala what handles strict deps or am I misreading again? That one does appear to do very limited iteration over the AST in a way I don't fully follow.

Hmm maybe will try that out and see, it would hopefully answer the question easily enough. Is there a particular big bazel-ified codebase you would recommend?

ittaiz commented 4 years ago

None of the needed combo (OSS+Scala+Bazel). Do you work for a company that uses Bazel and Scala? Maybe you can time it on an internal codebase. If diff is small enough we can continue the discussion (we can time it on our codebase as well)

Jamie5 commented 4 years ago

Ok https://github.com/Jamie5/rules_scala/commit/cbba543c1e4a6c1174043d9dd5b4cd952bdc03b4 has a diff which is rather hacky (at least in terms of some plumbing) but does appear to do what we want.

Run on a 2.12.8 codebase, some of the rules testing old vs new unused dependency checker were as follows. Note that while testing methodology was probably fairly reasonable, it would be far from airtight. The results suggest that timing is not significantly different, but if you have good infra for timing it would probably have more reliable results.

Rule 1
New: 169.07, 166.10, 165.01 => 166.73
Old: 168.59, 160.43, 164.30 => 164.44

Rule 2
New: 22.06, 22.18, 25.65, 25.45 => 23.84
Old: 22.39, 21.79, 25.50, 25.84 => 23.88

Rule 3
New: 19.56, 19.56, 19.81, 19.28 => 19.55
Old: 19.80, 19.21, 19.85, 20.29 => 19.78
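
To make the comparison easier to eyeball, here is a minimal sketch that recomputes the averages and the relative change between the new and old checker. The object and method names (`TimingSummary`, `relativeChangePct`) are made up for illustration. It works from the rounded per-run figures listed above, so the last digit can differ from the averages quoted in the comment.

```scala
object TimingSummary {
  // Hypothetical container for one rule's runs; figures are copied from the comment above.
  final case class Sample(newRuns: Seq[Double], oldRuns: Seq[Double])

  val samples: Seq[(String, Sample)] = Seq(
    "Rule 1" -> Sample(Seq(169.07, 166.10, 165.01), Seq(168.59, 160.43, 164.30)),
    "Rule 2" -> Sample(Seq(22.06, 22.18, 25.65, 25.45), Seq(22.39, 21.79, 25.50, 25.84)),
    "Rule 3" -> Sample(Seq(19.56, 19.56, 19.81, 19.28), Seq(19.80, 19.21, 19.85, 20.29))
  )

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Relative change of the new checker versus the old one, in percent.
  // Positive means the new checker was slower on average.
  def relativeChangePct(s: Sample): Double =
    (mean(s.newRuns) - mean(s.oldRuns)) / mean(s.oldRuns) * 100.0

  // One human-readable line per rule.
  def report(): Seq[String] =
    samples.map { case (name, s) =>
      f"$name: new=${mean(s.newRuns)}%.2f old=${mean(s.oldRuns)}%.2f change=${relativeChangePct(s)}%+.2f%%"
    }
}
```

On these figures every rule stays within about 1.5% in either direction, which is smaller than the spread between individual runs of the same rule.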

Some notes

Jamie5 commented 4 years ago

@ittaiz did you get a chance to look at this?

ittaiz commented 4 years ago

No I'm sorry. This is really interesting to me but I'm a bit under capacity trying to wrap my head around the refactor PR. I'll do my best to get to it in the next few days, ok?

Jamie5 commented 4 years ago

Sounds good, no worries I just never know if something is in someone's queue or got lost in the notification void.

ittaiz commented 4 years ago

Fair enough, unfortunately we have so many notifications it does happen sometimes. Please feel free to ping me again mid of next week if I don't respond.

Jamie5 commented 4 years ago

One other random thought: if this works, could a similar mechanism be used to not do plus one deps at all, but instead examine only the ijar, determine which deps are actually needed to compile the ijar, and propagate only those? (IIRC I read somewhere that the java rules do something like that, but I'm not sure about it.)

And maybe strict deps can stem from this as well, the mechanisms seem similar at least.

ittaiz commented 4 years ago

Potentially. I think that there are more nuances in this space so work will need to be iterative but a strong enough tool can definitely help us improve things. I encourage you to dive a bit more into how java rules work since they are very advanced in these areas.

Jamie5 commented 4 years ago

@ittaiz did you get a chance to take a look?

ittaiz commented 4 years ago

I've re-read the thread and I think I originally replied to what I read and not what you wrote :)

To align our discussion let's agree on the issues:

To mitigate these issues rules_scala currently has 3 strategies which impact the classpath:

  1. Direct only-
     1.1. Pros- Performance.
     1.2. Cons-
          1.2.1. scalac issues mentioned above.
          1.2.2. scalac error messages are often less than helpful and definitely don't allow fluent work since devs need to translate them to labels.
  2. Plus one (direct deps and their direct deps)-
     2.1. Pros-
          2.1.1. Performance (smaller classpath than 3)
          2.1.2. Fewer false positives than 1
     2.2. Cons-
          2.2.1. Targets can use the +1 deps as direct deps without noticing
          2.2.2. Targets can have unused deps which are used as +1 deps and not get notifications
          2.2.3. Heuristic based: +1 catches many cases of
          2.2.4. scalac error messages for missing deps
  3. Strict deps (all deps appear on classpath)-
     3.1. Pros- nice label based error messages
     3.2. Cons-
          3.2.1. Performance.
          3.2.2. False positives (related to the strict-deps plugin which aims to have small overhead).
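
The classpath difference between the three strategies can be sketched on a toy graph. This is only an illustration under assumed names: the labels, the `deps` map and the function names are hypothetical, and it models nothing but which targets end up on the compile classpath, not how rules_scala actually assembles it.

```scala
object ClasspathModes {
  type Label = String

  // Hypothetical dependency graph: target -> its declared direct deps.
  val deps: Map[Label, Set[Label]] = Map(
    "//app"   -> Set("//lib_a"),
    "//lib_a" -> Set("//lib_b"),
    "//lib_b" -> Set("//lib_c"),
    "//lib_c" -> Set.empty[Label]
  )

  // Strategy 1: only the declared deps.
  def direct(target: Label): Set[Label] =
    deps.getOrElse(target, Set.empty[Label])

  // Strategy 2: declared deps plus their own declared deps.
  def plusOne(target: Label): Set[Label] = {
    val d = direct(target)
    d ++ d.flatMap(direct)
  }

  // Strategy 3: everything reachable from the target.
  def transitive(target: Label): Set[Label] = {
    @annotation.tailrec
    def loop(frontier: Set[Label], seen: Set[Label]): Set[Label] =
      if (frontier.isEmpty) seen
      else {
        val visited = seen ++ frontier
        loop(frontier.flatMap(direct) -- visited, visited)
      }
    loop(direct(target), Set.empty[Label])
  }
}
```

For `//app` in this graph, direct gives `//lib_a`, plus one adds `//lib_b`, and only the transitive classpath also contains `//lib_c`.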

Note all of the above tackle the hygiene issue even if in different ways and tradeoffs.

Do I understand correctly that you want to solve 2.2.1. and maybe 2.2.2. while still having nice label based errors? I think you hinted towards maybe trying to tackle 2.2.3. later on which is also great. If so I'm indeed very much in favor.

I took a look at the code and it's nice! If we'll decide to merge it in we'll of course need to clean it up a bit like you mentioned but my two main concerns are still performance and false positives. I'm trying to think how to validate them. One option might be to run this against Wix's codebase but that will take me some time since we're currently using +1 and some errors will be correct and some (hopefully few to none) will be false. For performance it might be easier but still require some time.

Can you see these +1 tests still pass with the flag turned on?

To summarize (and assuming I understand correctly what you're trying to solve):

  1. How urgent is this for you? I'm very excited about the potential but due to impact would like more time to evaluate it.
  2. Do you want to add strict-deps based on it? Seems like it's a very small addition to the above commit
  3. Do you have any thoughts/suggestions on how we can validate it apart from Wix's codebase? (It's just very large, many different patterns of scalac and deps). I'd probably still like to run it on Wix's codebase but if we get good strong results from other places maybe I can run on a few select repos internally.

Thank you for your efforts here! I think that if we'll be able to polish it product, performance and false positives wise this will be a very big jump forward!

Note- Direct only does solve the above problems while introducing others mentioned above.

Jamie5 commented 4 years ago

I guess SCP-009 has not made much progress since? Because if it did and they can backport it to 2.12.x or even just 2.13.x then that would be great and this becomes much less important IIUC.

This diff would tackle 2.2.2, and if strict-deps was added (which I will try out) then it will also tackle 2.2.1. I don't fully follow what 2.2.3 and 2.2.4 are.

If 2.2.3 means that some unneeded deps are captured by the +1 then in theory this might be able to help but it might be easier to do it by examining the ijar directly after it is produced, or something like that. Because otherwise we need to make assumptions about what exactly the ijar keeps and doesn't keep (which I guess might not be that complicated). But this would definitely be very up in the air and with very unclear feasibility/correctness/timelines.

If I understand correctly, to run the tests you specified one only needs to do ./test_rules_scala.sh. Having done that, it looks like everything says successful.

Regarding your last questions

  1. Currently we are using +1 deps, and I have used this to manually clean up unused deps, and that is not the worst world to be in (especially if it can also have strict deps working). That being said, this is a rather manual mode of operation so would prefer not to be in it for overly long. But definitely it is worthwhile to make sure that we can get things pretty right.
  2. Will try it and see.
  3. Not really sure, any public big codebase that uses scala and bazel I guess but don't know of any. As far as false positives, we might just need to do a whack-a-mole thing (luckily, usually the issues are easy to simplify and repro and then the solution is clear, except for the literal issue mentioned). One issue is validating on all the supported versions of scala, which hopefully we can automate easily enough.

ittaiz commented 4 years ago

SCP-009 made some progress and this is how we built the strict deps mechanism. We copied this work into rules_scala and adapted it into bazel. It’s too simplistic however...

Never mind 2.2.3/2.2.4 for now

Re the tests- the support you added works only if the user turns on both unused deps and plus 1, no? Because the tests I linked to only turn on plus 1 (you can modify them to only turn on unused deps and see)

Jamie5 commented 4 years ago

Ok, https://github.com/Jamie5/rules_scala/commit/8724a32b45a7be02ec66aa64bb71fdd3d573fe9f#diff-3830e6e26d863974d38e511b04761916 has code for unused_deps as well.

Notes

As for potential small steps while validating the overall things

ittaiz commented 4 years ago

Thanks! A few thoughts just from reading your message (haven't looked at the code yet):

> There is no way in this diff to disable strict deps checking - if you have unused deps check on (in plus one mode), then strict deps checking will be too

Completely fine for POC. When we'll want to merge it we'll need to consider if it's ok or not.

> This includes all transitive dependencies to populate indirect_deps - we shouldn't need to do that (and can just go +1) but it was easiest to copy the non-plus-one strict_deps logic.

Again for POC fine, for merge we'll need the +1.

> There is an issue, where if A calls a method which has a default parameter of type B, but A doesn't provide a value for B (and lets the default stand), the strict deps checker still claims that A depends on B. Arguably this can be the correct behavior, but it also seems strange that we would report such. But it was tricky to find out a way to ignore this case (there were some potential possibilities but with unclear consequences) so for now it is left as claiming that we need the dep directly.

Our experience from working with the existing strict-deps for a long time (6 months? 1 year?) on a very large codebase with many developers is that outputting errors that are unclear to the developer is super harmful. People started saying they need to "please the bazel beast". This was one of the main reasons why we moved to +1.
I'd like to err on the side of not reporting transitively used deps rather than reporting transitive deps which aren't used. Maybe this needs to be configured on level of strictness (if not too complicated API wise). This is also why I think it's really important to run this on a large codebase. I'll try to see if I can get two people inside of Wix to spend a few days just analyzing the results of running this internally and analyzing the errors. The false ratio needs to be close to zero IMHO.
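
For reference, the default-parameter case quoted above looks roughly like the sketch below. All names are hypothetical, and the three pieces are shown in one file only to keep the example self-contained; in the scenario under discussion `B`, `Api` and `A` would live in separate targets.

```scala
object DefaultParamExample {
  // Imagine this class lives in its own target, say //b.
  final case class B(level: Int)

  // Imagine this lives in //api, which depends on //b.
  object Api {
    def run(name: String, b: B = B(1)): String = s"$name@${b.level}"
  }

  // Imagine this lives in //a, which depends on //api only.
  object A {
    // The source never names B. After type checking, however, the call is
    // expanded to pass the result of a synthetic default getter (scalac names
    // it run$default$2), so an expression of type B shows up in A's typed tree.
    def call(): String = Api.run("job")
  }
}
```

Whether `//a` should then be required to list `//b` directly is exactly the judgment call in question: the source of `A` gives the developer no hint that `B` is involved.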

Re code duplication- I agree. I'd probably prefer this be in separate commits to ease review.

Jamie5 commented 4 years ago

> Completely fine for POC. When we'll want to merge it we'll need to consider if it's ok or not.
>
> Again for POC fine, for merge we'll need the +1.

Agreed on both counts.

> I'd like to err on the side of not reporting transitively used deps rather than reporting transitive deps which aren't used. Maybe this needs to be configured on level of strictness (if not too complicated API wise). This is also why I think it's really important to run this on a large codebase.

Fine with me, we can hammer the issues as we discover them. Would want to look at potential approaches for this, have some ideas but maybe some compiler expert knows the actual correct way to do things.

One thing is that as we do more of this then the risk of breaking on different scala versions matters more and it would be useful to run the unit tests against all supported scala versions, not sure if there is already some mechanism to do that. (We already have this issue with final vals which are another false positive as mentioned above)

> Re code duplication- I agree. I'd probably prefer this be in separate commits to ease review.

Agreed, would prefer to merge in small chunks that don't break things. I can look at the initial steps here if we gain confidence on the overall idea.

Jamie5 commented 4 years ago

@ittaiz just to make sure, you are not waiting for anything from me in order to test right? (want to make sure we are not both thinking we are waiting on something from the other and hence nothing ever happens)

ittaiz commented 4 years ago

Indeed. Sorry for the silence, I was sick and at BazelCon. Yes, the ball is in my court and I'm trying to find someone at Wix who will run with your diff and analyze the impact (functionality-wise).

ittaiz commented 4 years ago

The amazing @anchlovi is taking this week to run it on some large codebases internally

anchlovi commented 4 years ago

Just started my tests and there is an issue with external source repos. I'm not sure that the unused deps tool should test external source repos targets. Also buildozer (at least the version I'm running - 0.29.0) can not handle external source repos

ittaiz commented 4 years ago

I think this is a Bazel-wide issue and not specific to this tool. This is because Bazel treats everything as a mono repo (once fetch finishes). The pattern I think we'd need to use is to run in "warn" mode for alignment and then switch to error. WDYT?

anchlovi commented 4 years ago

It will work for our use case

ittaiz commented 4 years ago

Don't you think it can work for all use-cases? Or do you mean if you use something you don't control like rules_scala? Because that is indeed a known issue in the bazel ecosystem

anchlovi commented 4 years ago

Exactly, your suggestion to start with warn and then to move to error will work as long as you control all external repos

ittaiz commented 4 years ago

👍 let's continue then. Mainly since in Bazel it's very easy to control external source repos if you must (fork them)

anchlovi commented 4 years ago

The current implementation only analyzes direct dependencies and can only work with strict deps mode. This might break repos using +1 deps (as happened during my tests), e.g.:

A depends on B
C depends on A but doesn’t use any of A's classes; it does use B's classes

In this case the current impl will mark A as unused, but removing A will break C. In such cases the tool should print 2 warnings: one for removing A and a second for adding B.
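
The two-warning behavior asked for here can be sketched as a pair of set differences. This is a minimal sketch under assumptions: the names are hypothetical, and it takes for granted that the checker already knows which targets the sources reference, which is the hard part discussed in this thread.

```scala
object DepWarnings {
  type Label = String

  // declared:   deps listed on the target in the BUILD file
  // referenced: targets whose classes the target's sources actually use
  def warnings(declared: Set[Label], referenced: Set[Label]): List[String] = {
    val unused  = (declared -- referenced).toList.sorted.map(d => s"remove unused dep $d")
    val missing = (referenced -- declared).toList.sorted.map(d => s"add direct dep $d")
    unused ++ missing
  }

  // C declares A, but its sources only use classes from B (visible via +1).
  val forC: List[String] = warnings(declared = Set("//a"), referenced = Set("//b"))
}
```

Here `forC` holds both messages, so a developer who removes `//a` is told at the same time to add `//b`.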

Jamie5 commented 4 years ago

So currently, the tool doesn't print both warnings? IIRC it should but don't fully recall. https://github.com/Jamie5/rules_scala/commit/8724a32b45a7be02ec66aa64bb71fdd3d573fe9f is the diff you are testing, right? (an earlier one did not have any strict_deps checks in plus one mode, while this one always has strict deps on). Have you seen any strict deps errors at all?

anchlovi commented 4 years ago

> So currently, the tool doesn't print both warnings? IIRC it should but don't fully recall. Jamie5@8724a32 is the diff you are testing, right? (an earlier one did not have any strict_deps checks in plus one mode, while this one always has strict deps on). Have you seen any strict deps errors at all?

Here is a repo where this issue can be reproduced: https://github.com/anchlovi/unused-deps-1

I'll try to find more issues like this

anchlovi commented 4 years ago

Here is another example of a false-positive: https://github.com/anchlovi/unused-deps-specs2

Jamie5 commented 4 years ago

Re https://github.com/anchlovi/unused-deps-1

So this is rather strange

If we change B3 to the following

```java
public class B3 {
    // static but not final, so A is not a compile-time constant
    public static int A = 42;
}
```

then no error is emitted. However, adding final to A makes the error appear again. This hence appears to be related to constant folding, mentioned in https://github.com/bazelbuild/rules_scala/issues/867#issuecomment-551233241
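
For readers unfamiliar with constant folding, here is a minimal Scala-only sketch of the same effect, using hypothetical names (`Limits`, `Client`). Per the Scala language specification, a `final val` with a constant right-hand side and no type annotation is a constant value definition, and the compiler may replace references to it with the literal. It is an assumption, not something verified here, that this is the exact mechanism behind the Java `static final` case.

```scala
object Limits {
  // Constant value definition: final, no type annotation, literal right-hand side.
  // References to Max may be replaced by the literal 42 during type checking.
  final val Max = 42

  // The explicit type annotation means this is NOT a constant value definition,
  // so references to it stay as ordinary member selections.
  final val MaxTyped: Int = 42
}

object Client {
  // After folding, the typed tree may contain just `n >= 42`, with no remaining
  // reference to Limits for a tree-walking dependency checker to find.
  def atLimit(n: Int): Boolean = n >= Limits.Max

  // This reference survives type checking as a selection on Limits.
  def atTypedLimit(n: Int): Boolean = n >= Limits.MaxTyped
}
```

Both methods behave identically at runtime; the difference is only in what a checker that inspects the typed tree can see.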

However this is not the end of it.

If instead of changing B3, we change B2 to

```scala
import com.comp.b3._

class B2 {
    def foo(x: B3): Unit = {}
}
```

then no error is reported. But! if we change to

```scala
import com.comp.b3._

class B2 {
    def foo(x: B3): Unit = {}
    val x = B3.A
}
```

then again an error is reported.

As best as I can tell, the B3 in the latter snippet has its associatedFile as NoAbstractFile for very unclear reasons.

However, if we change the scala version to 2.12.8, then in the last snippet, the code does compile fine without any unused dependency warning.

So this first repo seems to surface the following issues

I am not entirely sure when 2.11 reaches EOL. Is there a point at which we can just say that in 2.11 you are subject to false positives and the like, because the AST isn't as mature as in 2.12+?

Jamie5 commented 4 years ago

Re https://github.com/anchlovi/unused-deps-specs2 IIUC then this is the situation you were discussing earlier around external deps. Because looking at the errors, it seems like io_bazel_rules_scala_org_specs2_specs2_matcher depends on io_bazel_rules_scala_org_specs2_specs2_common and io_bazel_rules_scala_org_specs2_specs2_fp and if it were not an external repo then those dependencies would be imported via +1. But io_bazel_rules_scala_org_specs2_specs2_matcher is external and hence doesn't include io_bazel_rules_scala_org_specs2_specs2_common or io_bazel_rules_scala_org_specs2_specs2_fp in its deps.

So to handle this

ittaiz commented 4 years ago

> constant folding

Is there something we can do about this? From my rich experience with the existing mechanism over a large codebase and ~300 developers the errors have to make sense

> some parsing strangeness that happens in 2.11.12 but not in 2.12.8.

@anchlovi tested this on 2.12.6 which is what we’re running with internally. I think targeting 2.12.8 and after is valid. I guess that after Christmas vacations we can take a look at upgrading to 2.12.8

Jamie5 commented 4 years ago

> Is there something we can do about this? From my rich experience with the existing mechanism over a large codebase and ~300 developers the errors have to make sense

Related: https://github.com/scala/bug/issues/7173 and https://github.com/scala/scala/commit/2e9a5853e9886fd76f7a5c78a9df0b16a7d5f74e

Using OriginalTreeAttachment does appear to work at least in the scenario provided (and it doesn't seem like there is a reason it wouldn't work in general). The issue is that it appears to have been added in 2.12.4 and all of 2.13.x (if I am reading it right), so we would need to be able to examine it only when the version of scala supports it. Not sure the best way to do that.
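
One possible shape for the version check is sketched below. This is only an illustration: `VersionGate` and its members are hypothetical names, and the 2.12.4 threshold is taken from the reading above rather than verified. A runtime check like this would not be enough on its own, because code that names a class missing from older compilers would still fail to compile against them; that part would need version-specific sources or reflective access.

```scala
import scala.util.Try

object VersionGate {
  // Parse "2.12.8" or "2.13.0-M5" into (major, minor, patch); None if malformed.
  def parse(version: String): Option[(Int, Int, Int)] =
    version.split("[.-]").take(3).map(part => Try(part.toInt).toOption) match {
      case Array(Some(major), Some(minor), Some(patch)) => Some((major, minor, patch))
      case _                                            => None
    }

  // True when `version` parses and is at least `min` (lexicographic tuple order).
  def atLeast(version: String, min: (Int, Int, Int)): Boolean =
    parse(version).exists(v => Ordering[(Int, Int, Int)].gteq(v, min))

  // Version of the Scala library on the classpath, e.g. "2.12.8".
  val current: String = scala.util.Properties.versionNumberString

  // Assumed threshold for when the attachment became available.
  val canUseOriginalTree: Boolean = atLeast(current, (2, 12, 4))
}
```

Unparseable version strings are treated as unsupported, which keeps the checker on the conservative path.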

> @anchlovi tested this on 2.12.6 which is what we’re running with internally. I think targeting 2.12.8 and after is valid. I guess that after Christmas vacations we can take a look at upgrading to 2.12.8

I only picked 2.12.8 because that's what we're using so it was convenient to use it. 2.12.6 may well work but if you feel like upgrading anyways then that's of course fine.

ittaiz commented 4 years ago

Just so we're clear- Shachar found these issues running with 2.12.6 (what we're running with internally) but his repros were using 2.11.x (the default rules_scala version). He'll upgrade one of our large internal repos to 2.12.10 in a side branch and see if the issue is resolved. We'll update here

anchlovi commented 4 years ago

> constant folding

Same with 2.12.10: https://github.com/anchlovi/unused-deps-1/commit/8907d213c0f339c360585f0c04713ed8edca89eb

ittaiz commented 4 years ago

Jamie, can you take a look? You said that you saw it solved with 2.12.8, no?

Jamie5 commented 4 years ago

Sorry, to clarify there are two issues inspired by https://github.com/anchlovi/unused-deps-1

The first one is just that repository without any changes, and involves only constant folding. And that one needs a code fix, as found in https://github.com/Jamie5/rules_scala/commit/dd699df9d65c4768f09e9135f1f3673d599dd330 (Note: this almost certainly doesn't work with anything less than 2.12.4 as it uses some classes not introduced until then; in production we would need to figure out how to isolate this logic to >= 2.12.4. It also changes the default version to 2.12.8 for testing purposes)
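One way the ">= 2.12.4" isolation could look is a small version gate. This is only a sketch: `ScalaVersionGate` and its method names are hypothetical and not part of rules_scala, and the 2.12.4 threshold is the one guessed at earlier in this thread. A runtime check like this does not by itself solve the compile-time problem (source that references `OriginalTreeAttachment` still won't compile against older compilers), so it would probably need to be paired with reflection or version-specific sources.

```scala
object ScalaVersionGate {
  // Parses "2.12.8" (or "2.13.0-M5") into its leading numeric components.
  private def parse(version: String): Seq[Int] =
    version.split("[.-]").toSeq.take(3).flatMap(p => scala.util.Try(p.toInt).toOption)

  // True when `version` is at least major.minor.patch; false if it cannot be parsed.
  def atLeast(version: String, major: Int, minor: Int, patch: Int): Boolean =
    parse(version) match {
      case Seq(a, b, c) =>
        Ordering[(Int, Int, Int)].gteq((a, b, c), (major, minor, patch))
      case _ => false
    }

  // Assumption from this thread: the attachment exists from 2.12.4 onwards.
  def supportsOriginalTreeAttachment(
      version: String = scala.util.Properties.versionNumberString): Boolean =
    atLeast(version, 2, 12, 4)
}
```

The plugin could consult `supportsOriginalTreeAttachment()` once at startup and fall back to the existing behavior otherwise.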

The second issue was discovered when I was trying to figure out what was going on with the first one. It is that


class B2 {
    def foo(x: B3): Unit = {}
    val x = B3.A
}

still reports an unused dep error, despite the use of B3 in a non-constant folding case. In this situation, it was using 2.12.8 that fixed things (without the code fix above)
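For contrast, here is a minimal sketch of what separates the constant folding case from the non-folding one. The names (`Consts`, `User`) are made up for illustration and are not taken from the repro repository. The sketch relies on scalac treating a `final val` with no type annotation and a literal right-hand side as a constant value definition and inlining its uses.

```scala
object Consts {
  // Constant value definition: final, no type annotation, literal right-hand side.
  // Uses of this are inlined, so a caller's typed tree holds only the literal.
  final val Folded = "a"

  // The type annotation makes this an ordinary val; uses stay member references.
  final val NotFolded: String = "a"
}

class User {
  // After typechecking this is just the literal "a"; the reference to Consts
  // is gone from the tree unless the original tree is recovered some other way.
  val x = Consts.Folded

  // This stays a selection on Consts, so a tree-walking checker still sees it.
  val y = Consts.NotFolded
}
```

A checker that only walks the typed tree would see the dependency through `y` but not through `x`, which is the gap the code fix above tries to close.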

anchlovi commented 4 years ago

Thanks, I can confirm that the issue I reported was solved with dd699df

Jamie5 commented 4 years ago

@ittaiz were there also other things we needed to validate first?

ittaiz commented 4 years ago

@anchlovi will write the detailed update but the TLDR version is that he’s been trying for more than two weeks to get a significant repo in Wix to use this with no success.

Unused deps has false positives and Strict deps has false positives and negatives.

We won’t be able to use this as is, and carving out standalone bugs will take time that we can’t spare right now. There’s a chance that this is better than the existing mechanism and so worthwhile for those willing to live with the shortcomings. @andyscott @long-stripe I suggest you try it out at Stripe, since you guys use unused deps, and let us know if this hasn’t made things worse for you.

@Jamie5 given we don’t hear something horrible from them this week I think we can progress given the issues we agreed on above

Jamie5 commented 4 years ago

Okay that's unfortunate but understood @ittaiz . Hopefully as you find time we can squash more issues until everything works.

Assuming no issues from @andyscott and @long-stripe, I would then try to make some progress. I feel like until we do quash most/all false positives/negatives we would still want the old behavior to be available, so probably there will be a new flag/rule argument to turn on the new unused/strict mode.

Implementation-wise, I think the first steps would be to merge dependency_analyzer and unused_dependency_checker in a reasonable way, and probably also merge some calling code in a reasonable way, followed by starting the actual implementation.

I think the main thing I don't know which would be issues in implementing would be

There are other challenges but I think they can be figured out without overly much trouble.

anchlovi commented 4 years ago

I was trying to get it to work for the past 2 weeks.

Unused deps - the tool suggested removing dependencies even when they were needed. I ended up with a script that collected and removed all the reported dependencies and then tried to fix the build by re-adding a dependency that was removed, marking it in unused_dependency_checker_ignored_targets, and checking the build status again. If the build was still broken for the same reason, the script removed that dependency again; otherwise it moved on to the next dependency. It repeated the entire process until the entire build was fixed.
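The re-adding loop described above can be sketched roughly as follows. This is not the actual script: `RestoreDeps`, the `Build` type, and the string-based error comparison are all assumptions made for illustration, and marking a dep in unused_dependency_checker_ignored_targets is modelled simply as keeping it in the set.

```scala
object RestoreDeps {
  // Hypothetical build model: given the current deps, None means the build is
  // green and Some(error) carries the first failure message.
  type Build = Set[String] => Option[String]

  // Starting from `kept`, re-add removed deps one at a time until the build passes.
  def restore(removed: List[String], kept: Set[String], build: Build): Set[String] = {
    @annotation.tailrec
    def loop(current: Set[String], candidates: List[String]): Set[String] =
      build(current) match {
        case None => current // build fixed
        case Some(err) =>
          // Keep the first candidate whose re-addition changes the failure.
          candidates.find(dep => !build(current + dep).contains(err)) match {
            case Some(dep) => loop(current + dep, candidates.filterNot(_ == dep))
            case None      => current // nothing helps; stop rather than loop forever
          }
      }
    loop(kept, removed)
  }
}
```

Each candidate costs a full build in this model, which may be part of why the process took so long on a large repo.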

Strict deps - since we are using plus one, removing a dependency from one target sometimes broke another target that depended on it and used the removed dep transitively. But the strict deps tool never suggested adding this dep.

Marking deps as unused_dependency_checker_ignored_targets - in some cases the tool ignored the setting completely and continued to alert on dependencies that were marked to be ignored. I think it might be related to the fact that we use bind to point rules_scala deps (e.g., io_bazel_rules_scala/dependency/scala/guava) to our own.

Switching back from https://github.com/Jamie5/rules_scala/commit/dd699df9d65c4768f09e9135f1f3673d599dd330 to the upstream version yielded a broken build

I think the majority of the false positives relate to the fact that in many cases the Scala compiler requires more dependencies than we think, usually when traits and abstract classes are involved.
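A hypothetical illustration of that last point, with the target layout shown only in comments (the labels `//a`, `//b`, `//c`, `//app` are made up, and everything is in one file here so that it compiles on its own):

```scala
// target //a
trait A { def greet: String = "hi" }

// target //b, deps = ["//a"]
abstract class B extends A

// target //c, deps = ["//b"]
class C extends B

// target //app, deps = ["//c"]
object App {
  // Only C appears in this source, yet resolving `greet` makes scalac load
  // C's base types B and A, so their class files must be on the classpath.
  def run(): String = new C().greet
}
```

Assuming plus one exposes only direct deps and their direct deps, `//app` would see `//c` and `//b` but not `//a`. It would then need `//a` as an explicit dep that a source-based checker would consider unused.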

Jamie5 commented 4 years ago

Thanks for the update @anchlovi .

Based on my experience making it work with our codebase, the many false positives/negatives generally boil down to a handful of issues. Assuming this continues to hold, when you do have a chance, making minimal repros would help to whittle down the problems, and hopefully not too many are necessary. Our codebases probably follow different styles, so there is only some overlap in the issues each one reveals.

If it is easy, could you explain why switching back to upstream breaks the build, or at least what type of error it is? I assume it has to do with differing definitions of what is "unused" or "strict", rather than the compile itself (which would be kind of strange).

anchlovi commented 4 years ago

@Jamie5 sure thing, I'll try to find the time to create additional repros

Jamie5 commented 4 years ago

@ittaiz given we haven't heard any strong rejections (AFAIK), is it fine to get started? If so, do you have any thoughts on the questions in https://github.com/bazelbuild/rules_scala/issues/867#issuecomment-573839474 ? If not, I will try to figure something out. (It feels like running/testing against multiple versions may relate to https://github.com/bazelbuild/rules_scala/issues/940 ?)