apache / netbeans

Apache NetBeans
https://netbeans.apache.org/
Apache License 2.0
2.64k stars 849 forks source link

Priming build and reload improvements. #6514

Closed sdedic closed 11 months ago

sdedic commented 1 year ago

There was a race condition with opening projects and subprojects with an empty .m2/repository. The issue was that when the main project was being primed, a file-based watcher recognized appearance of a dependency .jar and scheduled a reload of the project before the priming was completed.

Consider a multi-module Micronaut-based project:

[ ] example (parent = micronaut)
+--[ ] lib (parent = example)
+--[ ] app (parent = example)

Initially, when a subproject was opened, reading of the project failed as the immediate parent (root project) was locally present, but super-parent (micronaut-parent) was not resulting in properties not defined - the project read broke and was marked as a fallback, the same as the parent project. Good!

Because of the file-based trigger (micronaut-parent POM appears), the parent reloaded while other dependencies were still being downloaded. Some jar files already downloaded - and the lib project was able to resolve its dependencies to jar files. However Maven internally asks for pom-classified artifacts for the dependencies to compute (and download) a recursive closure. POMs were not available locally and were faked by NbArtifactFixer. However the information that Maven model resolver got incomplete dependency info was not recorded anywhere - those Artifact objects are used internally by Maven and do not appear in the ProjectBuildingResult.

So the basic fix has two parts:

References to (internal) MavenProjectCache replaced by NbMavenProject when possible. Detailed logging added. Used getPartialProject to potentially unwrap at least some project structure after a failed load.

I've added basic unit tests to ensure that if a subproject primes, it becomes valid. And if parent primes, the subproject also becomes valid.

mbien commented 1 year ago

just a general comment: I can't remember I ever primed anything using NetBeans (I am using the maven support since ea days). I press build and its ready (this works even if the build would fail, unless the pom is broken). I am not completely convinced that this is still a useful action, given that NB uses the local maven repo, project sources and target folder directly (I think some other IDEs have to prime since they have their own internal build representation, but this isn't the case here).

sdedic commented 1 year ago

just a general comment: I can't remember I ever primed anything using NetBeans (I am using the maven support since ea days). [...] (I think some other IDEs have to prime since they have their own internal build representation, but this isn't the case here).

You probably didn't -- since the process happens automatically, if you open a Maven project with Trust Project Build Script enabled (default: on). Priming is also usually not necessary if the referenced artifacts are already present in local repository ... which is a typical situation on developer's machine that is used to develop a number of projects.

The second thing is that headless LSP usage of NB has some initialization paths not well reachable through NetBeans UI. Our colleagues also use 'throwaway' VMs to run the NBLS servers, so some of the scenarios come from using a pristine machine without anyting in .m2 repo -- from the NB IDE point of view, they're testing for our first-time users/adopters. You would be surprised how many things go different if you just run mvn install on any project before launching NB :)

BTW - in some of the later PRs I hope I will be able to avoid running mvn install for "priming" that includes compiling a potentially broken work-in-progress project (yuck !!) and only download the necessary plugin/jar dependencies as Gradle tooling does during its initial project open. Should be a lot faster and safer. Requires an overhaul of the process with my hands inside maven's model, but (so far) seems doable.

neilcsmith-net commented 1 year ago

You probably didn't -- since the process happens automatically, if you open a Maven project with Trust Project Build Script enabled (default: on).

Sorry, where have we got automatic priming happening without user interaction? That's potentially an issue. As is treating Maven build scripts as trusted by default.

mbien commented 1 year ago

@sdedic thanks for explaining, some comments:

You probably didn't -- since the process happens automatically,

I saw a few situations where NB asked me to prime something, I am pretty sure there is a notification bubble for that, and priming happens also at some point in the "resolve problems dialog", we had even an issue filed where a user used the "reload pom" action do download dependencies (!) (#6223), which I assume is running the priming code.

Why does NB not simply ask the user to build the project using the regular maven toolchain outside of the embedder? This could cause additional problem in future, e.g: #6190

I think mvn could also run mvn dependency:go-offline or something similar?

BTW - in some of the later PRs I hope I will be able to avoid running mvn install

thats good since installing deps should not happen without user interaction IMO

(if the LSP server has special requirements - thats ok to me, esp headless mode is a completely different situation)

sdedic commented 1 year ago

[...] the "reload pom" action do download dependencies (!) (#6223), which I assume is running the priming code.

Yes.

Why does NB not simply ask the user to build the project using the regular maven toolchain outside of the embedder? This could cause additional problem in future, e.g: #6190

Now it does, it actually runs mvn install. But IMHO it's not a good idea, as

  1. we need the (internal) project load to succeed and at least resolve all referenced artifacts locally - this is done through embedder
  2. we should(!) avoid compiling the project's code

We need (1) for parsing, code analysis, location services to work. And (2) prolongs the time, mixes project dev errors (i.e. broken code) with project setup errors (i.e. nonexistent artifact) or environment setup errors (bad network setup, ...)

Majority of NB operations actually use information extracted from the Embedder, so we need the Embedder to work/be happy in the first place ... and #6190 shows our Embedder is just not configured well enough, at last for some operations.

An interesting idea would be to inject a plugin that would report all project info into the buildchain; similar to Gradle support ... but that would prevent NB to resolve queries using Maven libraries at all. If we still want to use maven libs, we IMHO need to fix the Embedder operations.

I think mvn could also run mvn dependency:go-offline or something similar?

I was experimenting with that, but I was interrupted by higher-priority tasks. Need to look this through again to rule out possible bad analysis. If I remember right, go-offline was failing on me with multimodule cases for some reason. Need to re-check as several bugs were fixed from that time, that may cause those inadequate results.

BTW - in some of the later PRs I hope I will be able to avoid running mvn install

thats good since installing deps should not happen without user interaction IMO

That install is hidden in the Priming action. It happens just when you resolve the problem from project problems or (initially) on project open. LSP case primes the projects as they are opened (similar to project open dialog). So no mvn install withou user's action.

Anyway, we need to put the artifacts to a place where our internal Embedder can find them ... and ideally to a place where the external buildchain can find them to avoid duplicate downloads.

neilcsmith-net commented 1 year ago

LSP case primes the projects as they are opened (similar to project open dialog).

But what's the default in the VSCode case? There shouldn't be automatic priming anywhere without the user opting in to that.

dbalek commented 1 year ago

LSP case primes the projects as they are opened (similar to project open dialog).

But what's the default in the VSCode case? There shouldn't be automatic priming anywhere without the user opting in to that.

In VSCode, whenever a new project is opened in workspace, user is asked to decide whether code in the project folder can be executed by VS Code and extensions without further explicit approval - see https://code.visualstudio.com/docs/editor/workspace-trust

neilcsmith-net commented 1 year ago

OK, thanks @dbalek - that's good to know they're covering this here for us. Just concerned about ensuring we don't get automatic code execution anywhere with eg. mvnw in place.

mbien commented 1 year ago

just to show that I am not crazy, the bubble does exist :) prime-bubble (happened during manual testing of a different issue, I resolved it by building the project)

sdedic commented 1 year ago

Locally squashed.

sdedic commented 1 year ago

Completely forgot about this, apologies. Two options: merge into the release "as is" & improve implementation in master as suggested by @mbien or plan to rebase this on delivery as soon as it is created. I don't think I'll make it in the remaining time with at least one re-approval.

The fixed bug is really ugly on throwaway VMs in virtual classes.

mbien commented 1 year ago

I am not sure if merging this so late is a good idea. We did something like this before and had to revert the PR since it caused side effects. There is always the next release.

also I would prefer to switch to a minimal reproducer if possible, instead of importing a full third party sample project. Since this is about priming, i suppose the java code outside of placeholder classes isn't even needed. This is just my opinion, not a veto.

MartinBalin commented 1 year ago

This PR is needed for 20 as it fixes misbehaviour of NetBeans in loading Maven project. The code in PR is tested. We will target this to delivery and then to NB20.

neilcsmith-net commented 1 year ago

Adding do not merge until ready, both branch and some issues.

I agree with @mbien about the concerns of merging late, and of the test project. The binary Maven wrapper should not be in the repository (switch to source-only wrapper) and I have concerns about the licensing of the test project files. If this is truly needed, can it be published externally for consumption?

sdedic commented 1 year ago

I have concerns about the licensing of the test project files. If this is truly needed, can it be published externally for consumption?

Re. the test project sources: these are just generated (including the LICENSE file) from https://www.graal.cloud/gcn/launcher/, which is freely available to everyone. Does it answer the licensing issue ?

neilcsmith-net commented 1 year ago

The test sources are marked (c) Oracle, so possibly need registering to be picked up in the source notice bundle (cc/ @matthiasblaesing - not sure how tests fit into that).

The Maven wrapper jar is a release blocker.

sdedic commented 1 year ago

The test sources are marked (c) Oracle, so possibly need registering to be picked up in the source notice bundle (cc/ @matthiasblaesing - not sure how tests fit into that).

OK, if that's an issue, should I create my own project from scratch just following the wizard's project structure ?

The Maven wrapper jar is a release blocker.

Will try to change to source, or get rid of wrapper. Thanks.

mbien commented 1 year ago

OK, if that's an issue, should I create my own project from scratch just following the wizard's project structure ?

for the purpose of testing the priming code, the most important part is the pom. I believe the source code can be mostly reduced to placeholder classes, the launchers are probably not needed at all most likely.

sdedic commented 1 year ago

I made an implementation overhaul of the PR, based on the feedback.

  1. implementation reworked: blocking tasks are kept in a queue. Queue's emptiness is checked upon completion of each task.
  2. I've added quite a few tests to cover different scenarios of delayed / forced / no reload requests
  3. I've recreated the test project data from scratch, with just a skeleton of project directories. None of the files should now contain copyright other than ASF or references to oracle except artifact names.

I will probably add back some basic classes in the future, and add more tests that will check diagnostics in the opened projects, i.e. no unresolved symbols in files.

I kindly ask all reviewers to re-review.

sdedic commented 1 year ago

ok, still some race condition, working on that

sdedic commented 1 year ago

temporarily removing labels so I do not run so many tests; enabling more logs. The race does not happen locally.

sdedic commented 12 months ago

@mbien -- I am surprised that Enterprise tests run in this label setup ... is it correct that labels are not checked for enterprise-test in main.yml ?

mbien commented 12 months ago

@mbien -- I am surprised that Enterprise tests run in this label setup ... is it correct that labels are not checked for enterprise-test in main.yml ?

many tests are always on, esp those who don't use up many resources. E.g the PHP label toggles only the most expensive step in the php job.

If you want me to toggle more steps/jobs with labels - let me know. (the enterprise job used to be small initially and grew over time)

sdedic commented 12 months ago

OK... some explanation before I clean up the code again: no race condition, but it turned out that the Micronaut4 tests can only run on JDK17+ - a Micronaut LifevcycleParticipant injected into Maven infrastructure in MN 4.0+ version is compiled for JDK17 and above. Then any Maven run on JDK11 fail during project model building phase, and even fails to download all artifacts -- which causes projects in NB to fail to load again, which ultimately asserts the test.

I would appreciate to run these tests (since MN4 has quite weird project structure with all the BOMs etc) - and in addition other Oracle work focuses on Micronaut ... and it's a reasonably complex application library to use as test data.

So I'd like to run build tools on both JDK11 and JDK17 as the behaviour of Gradle/Maven may differ on different JDKs.

mbien commented 12 months ago

So I'd like to run build tools on both JDK11 and JDK17 as the behaviour of Gradle/Maven may differ on different JDKs.

this is fine but will increase the whack-a-mole factor since the job is one of the more unreliable ones. It already has retry wrappers for mx project etc.

once gradle supports JDK 21 this should be moved from 17 to 21, since if it works on 21 it will on 17 too with high probability

sdedic commented 12 months ago

Especially review commit 71b6dd4a26; I've realized that althgough I've defined test.jms.flags for java.mx.projects that had been failing because of reflective access to un-opened package on JDK17, the flag was not passed to the unit test JVM. Seems that the place in nbbuild/templates/common.xml is the rigt one - but please chedk.

mbien commented 12 months ago

Seems that the place in nbbuild/templates/common.xml is the rigt one - but please chedk.

I don't know either, but if it works it is probably the right one :), I am wondering though why it worked for the other modules, since this isn't the first one which sets the flags.

Should this be set too <property name="test.jms.flags" value=""/>? So that it doesn't fail if it is not set in the project.properties? Ant isn't very consistent when it can't resolve a property, sometimes it simply pastes the unresolved string as value.

sdedic commented 12 months ago

added a default property value.

sdedic commented 12 months ago

... and finally rebased on deliver for NB20

neilcsmith-net commented 12 months ago

The PR is still based on master though. Please update the default branch if you want this in delivery.

once gradle supports JDK 21 this should be moved from 17 to 21, since if it works on 21 it will on 17 too with high probability

Well, they'll only be one more release before 17 becomes our baseline!

sdedic commented 12 months ago

@neilcsmith-net done. All tests seem to pass. Please include in NB20.

neilcsmith-net commented 12 months ago

Thanks @sdedic Any concerns still from comments above @mbien ? All being well, I will aim to merge tomorrow in time to sync for rc2.

Nitpick : there appears to be a left over chmod for mvnw in the build.xml. I assume Ant doesn't care about this given everything built?!

sdedic commented 12 months ago

Nitpick : there appears to be a left over chmod for mvnw in the build.xml. I assume Ant doesn't care about this given everything built?!

FileSet: Setup scanner in dir /space/src/vscode/netbeans/java/maven/build/test/unit/data/projects/multiproject/democa with patternSet{ includes: [mvnw] excludes: [] }
    [chmod] Skipping fileset for directory /space/src/vscode/netbeans/java/maven/build/test/unit/data/projects/multiproject/democa. It is empty.

Do you want to clean it up here, or should I include it into next commit to maven in master ?

neilcsmith-net commented 12 months ago

Don't mind! Possibly better here to fix for the release.

I did a quick search in the action output, but I guess it doesn't output that with the use of -quiet.

mbien commented 12 months ago

Thanks @sdedic Any concerns still from comments above @mbien ? All being well, I will aim to merge tomorrow in time to sync for rc2.

well, I am still worried to merge a non-trivial change like this so late (again) since it can have unexpected side effects. But since this has been tested https://github.com/apache/netbeans/pull/6514#issuecomment-1765971889 it should hopefully cause no trouble.

I wished this could have been implemented in a simpler way, without hooking into the task event system if possible, but I had unfortunately no time to provide a full alternative implementation (beside this quick experiment https://github.com/apache/netbeans/pull/6514#discussion_r1349656909).

We should try to clean up / refactor something every time we add complexity to the project otherwise this is going to get interesting. CI has already fairly low chances to complete without manual restarts, and this is after retry scripts etc.

lets get this in

neilcsmith-net commented 11 months ago

OK, thanks @mbien Merging in time for 20-rc2. Let's ensure that is well tested in this area. It is always possible to revert for 20-rc3 in case of regressions.

mbien commented 11 months ago

project properties do not open #6622