Cached projects are only restored to the "target" folder but not to the ~/.m2/repository folder

SchwarzAmihay commented 6 months ago

Plugin Name

@nxrocks/nx-spring-boot

Nx Report

NX Report complete - copy this into the issue template

Node : 18.17.0
OS : darwin-arm64
pnpm : 8.15.3

nx : 18.0.4
@nx/js : 18.0.4
@nx/jest : 18.0.4
@nx/linter : 18.0.4
@nx/eslint : 18.0.4
@nx/workspace : 18.0.4
@nx/cypress : 18.0.4
@nx/devkit : 18.0.4
@nx/eslint-plugin : 18.0.4
@nx/react : 18.0.4
@nx/storybook : 18.0.4
@nrwl/tao : 18.0.4
@nx/vite : 18.0.4
@nx/web : 18.0.4
typescript : 5.1.6

Community plugins:
@koliveira15/nx-sonarqube : 3.5.1
@nx-tools/nx-container : 5.2.0
@nxrocks/nx-spring-boot : 9.2.2

Expected Behaviour

Cached projects should be restored to both "target" folder and the ~/.m2/repository folder

Actual Behaviour

When building the project for the first time, or with --skip-nx-cache, everything works: mvn install runs, and both the target folder and the ~/.m2/repository folder contain the artifacts.

However, when the project is restored from the cache, only the "target" folder is restored, so other projects that depend on this one cannot find the jar and fail.

Steps to reproduce the behaviour

nx run back-ends-infra-infra.core:install
rm -rf ~/.m2/repository/...
rm -rf .target
nx run back-ends-infra-infra.core:install

aramfe commented 6 months ago

Agreed. This problem makes the plugin more or less unusable for NX DTE, sadly. Would appreciate help here.

tinesoft commented 6 months ago

Hi @SchwarzAmihay

Thanks for using the plugin and for reporting.

Cached projects should be restored to both "target" folder and the ~/.m2/repository folder

I disagree. The plugin does indeed cache the target folder, which makes sense because that folder is inside your project and was generated when running the build command. Caching the artifact in the external ~/.m2/repository/... folder would make no sense, as this artifact is already present in the target folder (where it is first created by Maven, before being pushed to the Maven Local repository on install).

The actual issue here is that the install target was not re-run when the dependent project was restored, meaning that the artifact was not published to the Maven Local repo (~/.m2/repository/...).

I'm currently working to fix that issue.

Stay tuned!

SchwarzAmihay commented 6 months ago

Thank you @tinesoft you are doing an amazing job with this plugin!

In the meantime, I will share a workaround for the problem. I created a new target named install-to-target that builds the project or restores it from the cache, and changed the install target so that it only installs the built artifact into ~/.m2/repository/...

"targets": {
    "install": {
      "cache": false,
      "executor": "nx:run-commands",
      "options": {
        "cwd": "{projectRoot}",
        "command": "mvn jar:jar install:install"
      },
      "dependsOn": [
        "^install",
        "install-to-target"
      ]
    },
    "install-to-target": {
      "cache": true,
      "executor": "@nxrocks/nx-spring-boot:install",
      "options": {
        "root": "{projectRoot}",
        "args": [
          "-Dmaven.test.skip.exec=true"
        ]
      },
      "dependsOn": [
        "^install-to-target"
      ],
      "outputs": [
        "{projectRoot}/target"
      ]
    }
}

tinesoft commented 6 months ago

Thanks for the kind words @SchwarzAmihay :-) Don't hesitate to star ⭐ the project on GitHub, it helps ^^

Thanks for the suggested workaround. I will try to integrate it into the plugin directly.

The main takeaway for me is to no longer cache the "install" target: on a cache hit, Nx does not invoke the executor, so Maven/Gradle never runs and nothing is published to the Maven Local repository.

aramfe commented 6 months ago

If the install targets are not cached, doesn't this essentially make caching useless, or am I missing something here?

I believe the main takeaway is to ensure that, even though the install target is cached, it is nonetheless copied into the local Maven repository.

This is especially necessary for distributed task execution, where the install targets do not share a single .m2 directory. In a multi-module project, you want to run each package's install target on a different machine to maximize speed.

This way, a dependent project B will still resolve its dependency on project A correctly, because project A's cached output is also applied to the .m2 directory of the machine that builds project B.

One important factor not to forget is that caching is basically a requirement for running NX commands via DTE: the cache also acts as a kind of artifact repository that moves artifacts from one task agent to another, ensuring continuity between dependent jobs. The cache therefore needs to be set to true, as DTE is otherwise simply not usable. There is a workaround that tricks DTE agents by enabling the cache while ensuring that a cache hit never occurs:

"my-target": {
  "inputs": [{ "runtime": "date +%s" }],
  "executor": "nx:run-commands",
  "options": {
    "commands": [
      "echo test"
    ]
  },
  "cache": true
}

Using inputs with the value [{ "runtime": "date +%s" }] keeps continuity within DTE possible without effectively caching anything: the command output changes every second, which makes cache hits basically impossible. Not sure if this information will help with the implementation you're planning.
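To illustrate why a runtime input based on the current time defeats the cache, here is a toy TypeScript model. It is not Nx's actual hasher: the function name `toyCacheKey`, the FNV-1a hash, and the sample values are all assumptions made for the sketch. The only point is that a key derived from the inputs changes as soon as one input value changes.

```typescript
// Toy model only: Nx's real hasher is more involved. This just shows that a
// cache key derived from all inputs changes whenever any input value changes.
function toyCacheKey(inputs: string[]): string {
  const joined = inputs.join('\n');
  // FNV-1a (32-bit), chosen only because it needs no imports.
  let hash = 0x811c9dc5;
  for (let i = 0; i < joined.length; i++) {
    hash ^= joined.charCodeAt(i);
    // Multiply by the FNV prime 16777619 modulo 2^32, written as shifts.
    hash =
      (hash +
        ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>>
      0;
  }
  return hash.toString(16);
}

// Hypothetical stand-in for the hashed project files, unchanged between runs.
const sourceFiles = 'contents-of-project-files';

// Hypothetical stand-ins for the output of `date +%s` on two consecutive runs.
const firstRun = toyCacheKey([sourceFiles, '1700000000']);
const secondRun = toyCacheKey([sourceFiles, '1700000001']);

console.log(firstRun === secondRun); // prints false: the second run misses the cache
```

Because the timestamp is part of the key, the task output is still stored and can be handed to other agents, but no later run ever produces the same key.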

tinesoft commented 6 months ago

Hi @aramfe

Thank you for your input

If the install targets are not cached, doesn't this essentially make caching useless, or am I missing something here?

Nope, you are right! That's why I'm not so keen on disabling it either, to be honest...

But at the same time, I don't want to mess with the global and external .m2/repository, as I think the plugin should only alter files within the boundary of the project where it is installed.

Besides, "restoring" the target/ folder into ~/.m2/repository/.. is not as simple as copy-pasting the content of one folder into the other. In fact, artifacts in the Maven Local repo are placed in specific subfolders (based on groupId, artifactId, version, etc.), and additional metadata is generated by the mvn install command (used by @nxrocks/nx-spring-boot:install underneath).

Here is an example of a package once installed in the Maven repo:

root ➜ ~/.../com/example/bootapp/0.0.1-SNAPSHOT $ pwd
/root/.m2/repository/com/example/bootapp/0.0.1-SNAPSHOT
root ➜ ~/.../com/example/bootapp/0.0.1-SNAPSHOT $ ls -l
total 48160
drwxr-xr-x 2 root root     4096 Feb 21 23:00 ./
drwxr-xr-x 3 root root     4096 Feb 21 19:38 ../
-rw-r--r-- 1 root root 49294084 Feb 21 23:00 bootapp-0.0.1-SNAPSHOT.jar
-rw-r--r-- 1 root root     1974 Feb 21 19:37 bootapp-0.0.1-SNAPSHOT.pom
-rw-r--r-- 1 root root      704 Feb 21 23:00 maven-metadata-local.xml
-rw-r--r-- 1 root root      198 Feb 21 23:00 _remote.repositories

(Note the extra files: bootapp-0.0.1-SNAPSHOT.pom, maven-metadata-local.xml, etc)

Here is the content of the same package in local target/ folder:

root ➜ .../backend/boot-parent/bootapp/target (develop) $ pwd
/workspaces/nxrocks/backend/boot-parent/bootapp/target
root ➜ .../backend/boot-parent/bootapp/target (develop) $ ls -l
total 48484
-rw-r--r-- 1 root root 49294084 Feb 22 22:44 bootapp-0.0.1-SNAPSHOT.jar
-rw-r--r-- 1 root root     2727 Feb 22 22:44 bootapp-0.0.1-SNAPSHOT.jar.original
drwxr-xr-x 4 root root      128 Feb 22 22:44 classes
drwxr-xr-x 3 root root       96 Feb 22 22:44 generated-sources
drwxr-xr-x 3 root root       96 Feb 22 22:44 generated-test-sources
drwxr-xr-x 3 root root       96 Feb 22 22:44 maven-archiver
drwxr-xr-x 3 root root       96 Feb 22 22:44 maven-status
drwxr-xr-x 4 root root      128 Feb 22 22:44 surefire-reports
drwxr-xr-x 3 root root       96 Feb 22 22:44 test-classes

You see what I mean? not an "easy restore" 🙃
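As a rough illustration of the layout rule behind the first listing, here is a minimal TypeScript sketch. The helper name `localRepoDir` is hypothetical and not part of the plugin; it only models the standard Maven Local layout, in which the dots of the groupId become directory separators. It does not produce the .pom, maven-metadata-local.xml or _remote.repositories files, which only mvn install generates.

```typescript
// Hypothetical helper (not part of @nxrocks/nx-spring-boot): models where
// `mvn install` places an artifact inside the Maven Local repository.
function localRepoDir(
  repoRoot: string,
  groupId: string,
  artifactId: string,
  version: string
): string {
  // Dots in the groupId become directory separators: com.example -> com/example
  const groupPath = groupId.split('.').join('/');
  // Drop trailing slashes from the root so the joined path has no "//".
  const root = repoRoot.replace(/\/+$/, '');
  return [root, groupPath, artifactId, version].join('/');
}

// Coordinates inferred from the first listing above.
console.log(
  localRepoDir('/root/.m2/repository', 'com.example', 'bootapp', '0.0.1-SNAPSHOT')
);
// prints "/root/.m2/repository/com/example/bootapp/0.0.1-SNAPSHOT"
```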

If we want to keep the install target cached, I think a valid alternative would be to let the CI/CD platform handle the caching of ~/.m2/repository. Most CI/CD platforms, like GitHub Actions, CircleCI, etc., allow you to do that (just like you can cache node_modules).

Here is how you could do it with GitHub Actions: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows

Is it something you could try?

SchwarzAmihay commented 5 months ago

@tinesoft, I am not sure if this is relevant to this bug or to another one, but I just noticed that the calculated cache hash does not take the OS into consideration, as I see references to my local development environment in my Jenkins CI pipeline:

aramfe commented 5 months ago

Hey @tinesoft,

thanks for your lengthy response!

I'm not sure if I would use something like caching in GitHub Actions, as there are always possible side effects when external dependencies are cached. I'm also not a big fan of caching node_modules, for example, as there is always some possible side effect that one won't be able to spot at first sight. I've already had my pain with this, as you might guess, haha.

I still got it running via DTE though, by simply relocating the .m2 directory into the Maven multi-module project AND specifically also caching the output of the install artifacts in the relocated .m2 directory:

    "install": {
      "executor": "@nxrocks/nx-spring-boot:install",
      "options": {
        "args": ["-Dmaven.repo.local=.m2"],
        "root": "{workspaceRoot}/backend/packageA",
        "runFromParentModule": true
      },
      "dependsOn": ["^install"],
      "outputs": ["{workspaceRoot}/backend/packageA/target", "{workspaceRoot}/backend/.m2/my/test/package/packageA"],
      "cache": true
    },

This way it caches and restores both the .m2 output and the target output of a package with the install target. I have not encountered any issues with that approach so far. This might be helpful for others who want to use DTE with your plugin (which is amazing, by the way!).

aramfe commented 5 months ago

@SchwarzAmihay

You can use the "inputs" field in your target like this (if Python is installed, for example):

"inputs": [{ "runtime": "python -c 'import platform; print(platform.system())'" }],

This way the Nx hasher also takes the output of this command (the current platform) into its hash calculation.

jbadeau commented 4 months ago

Caching of the Maven .m2 output could be implemented by resolving the path into the .m2 repository in the createNodes function. This should probably be done by resolving the CI-friendly variables. The path could be calculated by converting the following env vars/properties to a file path:

${options.mavenRepoLocal}/${project.groupId}/${project.artifactId}/${revision}

mavenRepoLocal can be passed as a plugin option
project.groupId is extracted from the configFilePath/pom.xml
project.artifactId is extracted from the configFilePath/pom.xml
revision/sha1/changelist can be env vars
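
The proposal above could be sketched roughly as follows. This is not the plugin's createNodes implementation: the names `resolveCiFriendly`, `m2OutputPath` and `PomCoordinates` are hypothetical, the coordinates are assumed to have already been read from pom.xml, and the sample revision values are invented for illustration. Note that the groupId cannot be used verbatim in the path, because Maven turns its dots into directory separators.

```typescript
// Hypothetical sketch, not the plugin's createNodes implementation.

// Replaces the Maven CI-friendly placeholders ${revision}, ${sha1} and
// ${changelist} with values from a property map (for example, env vars).
// Placeholders without a value are left untouched.
function resolveCiFriendly(
  version: string,
  props: Record<string, string | undefined>
): string {
  return version.replace(
    /\$\{(revision|sha1|changelist)\}/g,
    (match: string, name: string) => props[name] ?? match
  );
}

// Coordinates assumed to have been extracted from pom.xml beforehand.
interface PomCoordinates {
  groupId: string;
  artifactId: string;
  version: string; // may contain CI-friendly placeholders
}

// Builds the directory inside the local repository that the install
// target would write to, so that it can be declared as an Nx output.
function m2OutputPath(
  mavenRepoLocal: string,
  pom: PomCoordinates,
  props: Record<string, string | undefined>
): string {
  const groupPath = pom.groupId.split('.').join('/');
  const version = resolveCiFriendly(pom.version, props);
  return `${mavenRepoLocal}/${groupPath}/${pom.artifactId}/${version}`;
}

// Illustrative values only.
const out = m2OutputPath(
  '{workspaceRoot}/backend/.m2',
  { groupId: 'my.test.package', artifactId: 'packageA', version: '${revision}${changelist}' },
  { revision: '1.0.0', changelist: '-SNAPSHOT' }
);
console.log(out);
// prints "{workspaceRoot}/backend/.m2/my/test/package/packageA/1.0.0-SNAPSHOT"
```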

tinesoft / nxrocks