apache / pekko

Build highly concurrent, distributed, and resilient message-driven applications using Java/Scala
https://pekko.apache.org/
Apache License 2.0

Long term: `sbt-license-report` took so much time to compile (upstream problem) #1019

Open Roiocam opened 7 months ago

Roiocam commented 7 months ago

While reviewing PR #1012, I found that sbt-license-report takes a very long time, even with incremental compilation.

Slow paradox build (screenshot, 2024-01-22 13:35:47)

I think PR #704 will help the paradox build because it skips the license report generation. On my laptop, it makes the build roughly 15x faster.

Disabling the license report made the paradox build faster: ![image](https://github.com/apache/incubator-pekko/assets/26020358/42c4ccbd-53a1-48b2-a792-fb1607be3223) (screenshot, 2024-01-22 13:39:28)

@mdedetrich @He-Pin Could you pick up PR #704 again and reconsider its feasibility?

Solution

He-Pin commented 7 months ago

@Roiocam I will take a look after work, and we can verify it with https://github.com/apache/incubator-pekko/pull/1016

He-Pin commented 7 months ago

@Roiocam Would you mind testing it with the current main? Your PR fixes a huge problem for me, thanks.

mdedetrich commented 7 months ago

@Roiocam The core reason sbt-license-report takes so long is a limitation in coursier, the default dependency resolution mechanism in sbt. Because of it, sbt-license-report is forced to re-resolve all dependencies using Ivy, which is what takes such a long time.

If you want to solve this issue (which currently takes up the majority of the time in doc generation), we need to unblock it, which involves solving https://github.com/coursier/coursier/issues/1790. I tried to do this myself but didn't have the time or capacity to figure out the core problem.

mdedetrich commented 7 months ago

@Roiocam Just to set expectations: even if you do figure out how to resolve https://github.com/coursier/coursier/issues/1790, it's probably going to take a while for the fix to be released in coursier, which then needs to be included in a new version of sbt for sbt-license-report to use.

He-Pin commented 7 months ago

@mdedetrich @Roiocam We should solve this problem, but there are short-term and long-term solutions. The short-term solution is to use this PR to speed things up for everyone. The long-term solution is to fix the upstream issue. So I suggest adding a `paradoxFast` command.
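
One way such a `paradoxFast` command could be wired up is sketched below. This is only an illustration under stated assumptions, not Pekko's build definition and not necessarily what PR #704 does. It assumes the sbt-paradox and sbt-license-report plugins are enabled on the project, and that `paradox` does not already depend on the license report unconditionally. The names `licenseReportEnabled` and `licenseReportIfEnabled` are hypothetical and exist only for this example.

```scala
// build.sbt fragment (sketch). Assumes sbt-paradox and sbt-license-report are
// enabled, so that `paradox` and `dumpLicenseReportAggregate` are in scope.

// Hypothetical setting: toggles license report generation for docs builds.
val licenseReportEnabled =
  settingKey[Boolean]("Whether paradox regenerates the aggregated license report")

// Hypothetical helper task: runs the report only when the toggle is on.
val licenseReportIfEnabled =
  taskKey[Unit]("Runs dumpLicenseReportAggregate only if licenseReportEnabled is true")

ThisBuild / licenseReportEnabled := true

licenseReportIfEnabled := Def.taskDyn {
  // taskDyn picks which task to run after reading the setting, so the
  // expensive report task is never scheduled when the toggle is off.
  if (licenseReportEnabled.value)
    Def.task { val _ = dumpLicenseReportAggregate.value; () }
  else
    Def.task(())
}.value

// Make paradox depend on the conditional task instead of the report itself.
Compile / paradox := (Compile / paradox).dependsOn(licenseReportIfEnabled).value

// `paradoxFast` turns the report off for the session, then builds the docs.
addCommandAlias(
  "paradoxFast",
  "; set ThisBuild / licenseReportEnabled := false ; Compile / paradox"
)
```

With this sketch, `sbt paradoxFast` would skip the report while a plain `sbt paradox` still generates it. Note that `set` changes the setting for the rest of the sbt session, so in an interactive shell the report stays off until `session clear` or `reload`.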

Roiocam commented 7 months ago

> @Roiocam The core reason sbt-license-report takes so long is a limitation in coursier, the default dependency resolution mechanism in sbt. Because of it, sbt-license-report is forced to re-resolve all dependencies using Ivy, which is what takes such a long time.

I did a quick investigation into why the plugin runs so often, in addition to the issue you mentioned. It turns out that paradox always executes the `dumpLicenseReportAggregate` task, which depends on the `updateLicense` task of each submodule, and that is the most time-consuming part of this issue.

Even if we fix the upstream dependency resolver problem, we will still re-run `dumpLicenseReportAggregate` on every paradox command. Even if we optimize `updateLicense` down to 1s (more than ten times), that is still a relatively long time for the documentation development experience.

Considering that most contributors to this project are unpaid, I think it makes sense to have the license report turned off by default and to keep that option long term. We should be consistent on this issue.

As for the upstream issue, I don't think it is Pekko's concern. We should spend more time on Pekko itself.

mdedetrich commented 7 months ago

> Even if we fix the upstream dependency resolver problem, we will still re-run `dumpLicenseReportAggregate` on every paradox command. Even if we optimize `updateLicense` down to 1s (more than ten times), that is still a relatively long time for the documentation development experience.

While this is technically true, note that coursier, unlike Ivy, has its own caching mechanism. On top of this, we would actually be reusing the value from the `update` task, which would already have been evaluated in sbt. So even if `dumpLicenseReportAggregate` is re-evaluated multiple times, each evaluation would take a trivial amount of time, since we would just be converting an already evaluated result from `.value` in memory into another data structure. Even if it's done 20+ times in total, it wouldn't take noticeably longer than any other sbt task execution.
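
To illustrate what "reusing the value from the `update` task" can look like, here is a minimal sketch of a task that only reshapes the already resolved `UpdateReport` in memory. It is not how sbt-license-report is implemented. The task name `summarizeResolvedModules` is hypothetical, and license extraction is deliberately left out, since that is the part blocked by the coursier issue.

```scala
// build.sbt fragment (sketch). `update` and `streams` are built-in sbt keys;
// `summarizeResolvedModules` is a hypothetical task used only for illustration.
val summarizeResolvedModules =
  taskKey[Map[String, Int]]("Counts resolved modules per organization")

summarizeResolvedModules := {
  val log = streams.value.log
  // No new resolution is requested here: the task consumes the report that
  // sbt's own `update` task produces for this project.
  val report = update.value
  // Pure in-memory transformation of the report into another data structure.
  val perOrganization: Map[String, Int] =
    report.allModules
      .groupBy(_.organization)
      .map { case (organization, modules) => organization -> modules.size }
  perOrganization.toSeq.sortBy(_._1).foreach { case (organization, count) =>
    log.info(s"$organization: $count module(s)")
  }
  perOrganization
}
```

The work done in the task body is a `groupBy` over a collection that is already in memory, which is why repeating it for each submodule should cost little compared with a fresh Ivy resolution.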

That being said, implementing a form of caching would be good. I think sbt's caching may be helpful here.

> As for the upstream issue, I don't think it is Pekko's concern. We should spend more time on Pekko itself.

Sure, I am not going to tell people where to spend their time. It's just that this directly affects Pekko (as stated before), and as a nice bonus it helps the community.