nrwl / nx

Smart Monorepos · Fast CI
https://nx.dev
MIT License

brittle builds with @nrwl/workspace:run-commands in 14.5.4 #11556

Closed dereekb closed 1 year ago

dereekb commented 2 years ago

I am currently migrating to Angular 14 and updating Nx from version 14.1.8 to 14.5.4. I've been fighting with the build order, and it seems there has been a regression in how it works between 14.1.8 and 14.5.4.

I realize that targetDefaults in nx.json was added, and I have updated my project to use it. However, it seems like there are still issues that I am running into.

A common factor seems to be the use of @nrwl/workspace:run-commands.

    "run-all-tests": {
      "executor": "@nrwl/workspace:run-commands",
      "options": {
        "commands": [
          {
            "description": "run util tests",
            "command": "npx nx test util"
          },
          {
            "description": "run util-test tests",
            "command": "npx nx test util-test"
          }
        ],
        "color": true,
        "parallel": false
      }
    },

For the above, it seems that running the "run util tests" command tries to trigger a dependency build again, and the cache gets tripped up.

Current Behavior

My builds are inconsistent: they sometimes succeed and sometimes fail at random.

Here it is failing when running npx nx run-many --target=build ...

https://app.circleci.com/pipelines/github/dereekb/dbx-components/1537/workflows/d39816ea-52c1-4a4e-aa96-5a8cd3e24b77/jobs/4226/parallel-runs/0/steps/0-110

>  NX   Successfully ran target build-base for project firebase-server-test

   Nx read the output from the cache instead of running the command for 7 out of 10 tasks.

 >  NX   ENOENT: no such file or directory, unlink '/home/circleci/code/node_modules/.cache/nx/latestOutputsHashes/dist-packages-firebase.hash'

   Pass --verbose to see the stacktrace.

 >  NX   ERROR: Something went wrong in run-commands - Command failed: npx nx run firebase-server-test:build-base

   Pass --verbose to see the stacktrace.

It happens in different places too. The following occurs when running the run-all-tests target for util.

https://app.circleci.com/pipelines/github/dereekb/dbx-components/1535/workflows/b7c1296b-51e8-4832-9c1a-94e38da43f49/jobs/4223/parallel-runs/0/steps/0-103

 >  NX   Successfully ran target test for project util

 >  NX   ENOENT: no such file or directory, scandir '/home/circleci/code/dist/packages/util/src/lib/assertion'

   Pass --verbose to see the stacktrace.

Expected Behavior

In 14.1.8 my build targets looked like this:

"build": {
      "executor": "@nrwl/workspace:run-commands",
      "options": {
        "commands": [
          {
            "command": "npx nx run util:build-base"
          },
          {
            "command": "npx nx run util-test:build-base"
          }
        ],
        "parallel": false
      }
    },
    "build-base": {
      "executor": "@nrwl/js:tsc",
      "outputs": ["{options.outputPath}"],
      "dependsOn": [
        {
          "target": "build",
          "projects": "dependencies"
        }
      ],
      "options": {
        "outputPath": "dist/packages/util",
        "tsConfig": "packages/util/tsconfig.lib.json",
        "packageJson": "packages/util/package.json",
        "main": "packages/util/src/index.ts",
        "assets": ["packages/util/*.md", "LICENSE"]
      }
    },

With the new changes I have updated them to take advantage of the dependsOn feature to run build-base directly:

"build": {
      "executor": "@nrwl/workspace:run-commands",
      "outputs": ["dist/packages/util"],
      "options": {
        "commands": [
          {
            "command": "npx nx run util-test:build-base"
          }
        ],
        "parallel": false
      }
    },
    "build-base": {
      "executor": "@nrwl/js:tsc",
      "outputs": ["{options.outputPath}"],
      "options": {
        "outputPath": "dist/packages/util",
        "tsConfig": "packages/util/tsconfig.lib.json",
        "packageJson": "packages/util/package.json",
        "main": "packages/util/src/index.ts",
        "assets": ["packages/util/*.md", "LICENSE"]
      }
    },

I updated my configuration to better mirror the incremental builds guide here, since that is what I ended up using the last time I had trouble:

https://github.com/leosvelperez/nx/blob/master/docs/shared/guides/setup-incremental-builds-angular.md

Note, I am not using incremental builds, just taking the structure used for build/build-base.

Here is my targetDefaults configuration for nx.json:

  "targetDefaults": {
    "build": {
      "dependsOn": ["build-base"]
    },
    "build-base": {
      "dependsOn": ["^build-base"]
    },
    "run-all-tests": {
      "dependsOn": ["build"]
    },
    "publish": {
      "dependsOn": ["build"]
    },
    "publish-npmjs": {
      "dependsOn": ["build"]
    },
    "test": {
      "dependsOn": ["build"]
    },
    "deploy": {
      "dependsOn": ["build"]
    },
    "ci-deploy": {
      "dependsOn": ["build"]
    }
  },
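
As a rough mental model of how these entries are read: a name without a caret refers to another target on the same project, while a `^` prefix refers to that target on the project's dependencies. The sketch below expands a task into its prerequisites under that reading. It is a simplified illustration, not Nx's scheduler; the `expandTasks` helper and the two-project graph (`util-test` depending on `util`) are assumptions made for the example.

```typescript
// Simplified model of "dependsOn" expansion; not Nx's actual implementation.
type TargetDefaults = Record<string, { dependsOn?: string[] }>;
type ProjectGraph = Record<string, string[]>; // project -> projects it depends on

// Hypothetical helper: returns tasks in an order where prerequisites come first.
function expandTasks(
  project: string,
  target: string,
  defaults: TargetDefaults,
  graph: ProjectGraph,
  seen: Set<string> = new Set(),
  order: string[] = []
): string[] {
  const id = `${project}:${target}`;
  if (seen.has(id)) return order; // already scheduled (also stops cycles)
  seen.add(id);
  for (const dep of defaults[target]?.dependsOn ?? []) {
    if (dep.startsWith("^")) {
      // "^name": run `name` on every project this one depends on.
      for (const upstream of graph[project] ?? []) {
        expandTasks(upstream, dep.slice(1), defaults, graph, seen, order);
      }
    } else {
      // "name": run `name` on the same project first.
      expandTasks(project, dep, defaults, graph, seen, order);
    }
  }
  order.push(id);
  return order;
}

// Assumed example graph: util-test depends on util.
const exampleGraph: ProjectGraph = { util: [], "util-test": ["util"] };
const exampleDefaults: TargetDefaults = {
  build: { dependsOn: ["build-base"] },
  "build-base": { dependsOn: ["^build-base"] },
  test: { dependsOn: ["build"] },
};

console.log(expandTasks("util-test", "test", exampleDefaults, exampleGraph));
```

Under this model, running test on util-test pulls in util:build-base but never util:build, so anything that only util:build produces is not guaranteed to exist when the dependent runs.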

In 14.1.8 I could run the run-commands targets in parallel without problems, but in 14.5.4 they sometimes fail with an error even when nothing runs in parallel.

So far I've been trying to reduce the number of places where it could trip up. I'm not sure if I'm missing a new piece of configuration or if it is a bug.

Previously my CI step was to just build them all at once:

      # build content
      - run: npx nx affected --base=$NX_BASE --target=build --parallel --max-parallel=2

But now I feel like I'm having to build them sequentially in an effort to not have the cache hit an error.

      # run build-base for each project
      - run: npx nx affected --base=$NX_BASE --target=build-base --parallel --max-parallel=2
      # run build for the buildable projects
      - run: npx nx run-many --target=build --projects=util,nestjs,dbx-core,dbx-analytics,dbx-web,dbx-form,firebase --parallel=false
      - run: npx nx run-many --target=build --projects=firebase-server,dbx-firebase --parallel=false

Steps to Reproduce

One build failure was reproduced on this new example workspace:

https://github.com/dereekb/nx-build-test

https://nx.app/runs/ev7nFAiqU93

npx nx affected --all --target=build --parallel --max-parallel=3 --skip-nx-cache

I was able to encounter it again after about 7 runs.

Failure Logs

Most of the build errors are visible here and are a result of the build issue described above:

https://app.circleci.com/pipelines/github/dereekb/dbx-components?branch=feat%2Fangular14

Environment

   Node : 16.15.1
   OS   : darwin arm64
   npm  : 8.11.0

   nx : 14.5.4
   @nrwl/angular : 14.5.4
   @nrwl/cypress : 14.5.4
   @nrwl/detox : Not Found
   @nrwl/devkit : 14.5.4
   @nrwl/eslint-plugin-nx : 14.5.4
   @nrwl/express : Not Found
   @nrwl/jest : 14.5.4
   @nrwl/js : 14.5.4
   @nrwl/linter : 14.5.4
   @nrwl/nest : 14.5.4
   @nrwl/next : Not Found
   @nrwl/node : 14.5.4
   @nrwl/nx-cloud : 14.3.0
   @nrwl/nx-plugin : Not Found
   @nrwl/react : Not Found
   @nrwl/react-native : Not Found
   @nrwl/schematics : Not Found
   @nrwl/storybook : 14.5.4
   @nrwl/web : 14.5.4
   @nrwl/workspace : 14.5.4
   typescript : 4.7.4
   ---------------------------------------
   Community plugins:
     @ngrx/component-store: 14.0.2
     @ngrx/data: 14.0.2
     @ngrx/effects: 14.0.2
     @ngrx/entity: 14.0.2
     @ngrx/store: 14.0.2
     @ngx-formly/schematics: 6.0.0-rc.2
     angular-calendar: 0.30.0
     @jscutlery/semver: 2.26.0
     @ngrx/store-devtools: 14.0.2

dereekb commented 2 years ago

Here's an example of it behaving inconsistently after running nx build-base 4 times in a row. I was running nx test dbx-form --skip-nx-cache in a separate tab while testing something else:

dereekb@dbMBP dbcomponents % nx build-base

   ✖    1/12 dependent project tasks failed (see below)
   ✔    11/12 dependent project tasks succeeded [11 read from cache]

 ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

> nx run dbx-firebase:build-base:production

Building Angular Package

------------------------------------------------------------------------------
Building entry point '@dereekb/dbx-firebase'
------------------------------------------------------------------------------
✖ Compiling with Angular sources in Ivy partial compilation mode.

 >  NX   Cannot resolve type entity i5.DbxCoreActionModule to symbol

   Pass --verbose to see the stacktrace.

 ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

 >  NX   Ran target build-base for project demo and 13 task(s) it depends on (3s)

    ✖    1/12 failed
    ✔    11/12 succeeded [11 read from cache]

   See Nx Cloud run details at https://nx.app/runs/CyPgyiKrVpH
dereekb@dbMBP dbcomponents % nx build-base

   ✖    1/12 dependent project tasks failed (see below)
   ✔    11/12 dependent project tasks succeeded [11 read from cache]

 ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

> nx run dbx-firebase:build-base:production

It looks like all of dbx-firebase's dependencies have not been built yet:
- dbx-web

You might be missing a "targetDefaults" configuration in your root nx.json (https://nx.dev/configuration/projectjson#target-defaults),
or "dependsOn" configured in dbx-firebase's project.json (https://nx.dev/configuration/projectjson#dependson)

 ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

 >  NX   Ran target build-base for project demo and 13 task(s) it depends on (850ms)

    ✖    1/12 failed
    ✔    11/12 succeeded [11 read from cache]

   See Nx Cloud run details at https://nx.app/runs/tZuAa5O0leV
dereekb@dbMBP dbcomponents % nx build-base

   ✔    13/13 dependent project tasks succeeded [11 read from cache]

   Hint: you can run the command with --verbose to see the full dependent project outputs

 ———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

> nx run demo:build-base:production

✔ Browser application bundle generation complete.
✔ Copying assets complete.
✔ Index html generation complete.

Initial Chunk Files           | Names                                      |  Raw Size | Estimated Transfer Size
main.70ecb184ac3420d7.js      | main                                       |   2.97 MB |               715.09 kB
styles.40cf6eacb7bfce8f.css   | styles                                     | 193.84 kB |                16.39 kB
polyfills.4b39cc712ccdab05.js | polyfills                                  |  33.07 kB |                10.66 kB
runtime.18172291aafd8ca2.js   | runtime                                    |   3.17 kB |                 1.53 kB

                              | Initial Total                              |   3.20 MB |               743.68 kB

Lazy Chunk Files              | Names                                      |  Raw Size | Estimated Transfer Size
593.aa6191625003639e.js       | modules-demo-demo-module                   |  75.39 kB |                 8.10 kB
14.22386079e283c50a.js        | modules-layout-doc-layout-module           |  36.97 kB |                 7.25 kB
761.c7d9c2115e2bd7cc.js       | modules-form-doc-form-module               |  32.64 kB |                 6.95 kB
683.7e350bdbcd648320.js       | modules-action-doc-action-module           |  32.38 kB |                 7.01 kB
991.291234c71597e01b.js       | modules-interaction-doc-interaction-module |  24.35 kB |                 4.98 kB
54.7bedc36146de3f87.js        | modules-landing-landing-module             |   9.29 kB |                 2.94 kB
278.44c94f6c9fc75aa7.js       | modules-router-doc-router-module           |   8.54 kB |                 2.18 kB
472.710bc900b9b516fa.js       | modules-doc-doc-module                     |   8.08 kB |                 2.29 kB
617.0807b5464f48281c.js       | modules-guestbook-guestbook-module         |   7.47 kB |                 2.35 kB
622.d7e6402776c13d52.js       | modules-extension-doc-extension-module     |   7.29 kB |                 2.34 kB
897.f14028ba996ea0bb.js       | modules-auth-doc-auth-module               |   7.18 kB |                 1.92 kB
common.a205fa573cdfdb2c.js    | common                                     |   6.10 kB |                 1.95 kB
300.c29364809900c38c.js       | modules-text-doc-text-module               |   5.07 kB |                 1.47 kB
142.1764436593e17cab.js       | modules-auth-demo-auth-module              |   4.19 kB |                 1.22 kB
690.d5b3ad8bf1183e1e.js       | modules-profile-profile-module             |   3.77 kB |                 1.34 kB
736.409c0e3c6ef86b0d.js       | modules-onboard-demo-onboard-module        |   2.48 kB |               988 bytes
631.bf03f4f399cfbdfb.js       | modules-app-demo-app-module                |   1.88 kB |               809 bytes

Build at: 2022-08-12T22:11:31.700Z - Hash: 63c090ea985cce43 - Time: 17569ms

Warning: /Users/dereekb/development/git/dbcomponents/apps/demo/src/app/modules/demo/demo.scss exceeded maximum budget. Budget 2.00 kB was not met by 69.34 kB with a total of 71.34 kB.

———————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

 >  NX   Successfully ran target build-base for project demo and 13 task(s) it depends on (38s)

   Nx read the output from the cache instead of running the command for 11 out of 14 tasks.

   See Nx Cloud run details at https://nx.app/runs/DpAS59v9wqK

dereekb@dbMBP dbcomponents % 

I also have it failing regularly with the following:

dereekb@dbMBP dbcomponents % npx nx affected --all --target=build --parallel --max-parallel=3          

 >  NX   Running affected:* commands with --all can result in very slow builds.

   --all is not meant to be used for any sizable project or to be used in CI.

   Learn more about checking only what is affected: https://nx.dev/cli/affected

    ✔  nx run util:build-base  [existing outputs match the cache, left as is]
    ✔  nx run rxjs:build-base  [existing outputs match the cache, left as is]
    ✔  nx run model:build-base  [existing outputs match the cache, left as is]
    ✔  nx run date:build-base  [existing outputs match the cache, left as is]
    ✔  nx run util-test:build-base  [local cache]
    ✔  nx run nestjs:build-base  [local cache]
    ✔  nx run browser:build-base  [existing outputs match the cache, left as is]
    ✔  nx run model:build (785ms)
    ✔  nx run rxjs:build (1s)
    ✔  nx run firebase:build-base  [existing outputs match the cache, left as is]
    ✔  nx run dbx-core:build-base:production  [existing outputs match the cache, left as is]
    ✔  nx run util-test:build (356ms)
    ✔  nx run date:build (766ms)
    ✔  nx run nestjs-stripe:build-base  [local cache]
    ✔  nx run browser:build (559ms)
    ✔  nx run util:build (3s)
    ✔  nx run firebase-test:build-base  [local cache]
    ✔  nx run firebase-server:build-base  [local cache]
    ✔  nx run demo-firebase:build-base  [existing outputs match the cache, left as is]
    ✔  nx run nestjs:build (5s)
    ✔  nx run dbx-web:build-base:production  [existing outputs match the cache, left as is]
    ✔  nx run dbx-analytics:build-base:production  [existing outputs match the cache, left as is]
    ✔  nx run nestjs-stripe:build (849ms)
    ✔  nx run dbx-core:build (2s)
    ✔  nx run firebase-test:build (730ms)
    ✔  nx run firebase-server-test:build-base  [local cache]
    ✔  nx run demo-firebase:build (518ms)
    ✔  nx run firebase:build (8s)
    ✔  nx run dbx-form:build-base:production  [existing outputs match the cache, left as is]

    ✖  nx run firebase-server:build
       >  NX   Running target build-base for project firebase-server-test and 9 task(s) it depends on

        >  NX   Successfully ran target build-base for project firebase-server-test

          Nx read the output from the cache instead of running the command for 7 out of 10 tasks.

        >  NX   ENOENT: no such file or directory, unlink '/Users/dereekb/development/git/dbcomponents/node_modules/.cache/nx/latestOutputsHashes/dist-packages-firebase.hash'

          Pass --verbose to see the stacktrace.

        >  NX   ERROR: Something went wrong in run-commands - Command failed: npx nx run firebase-server-test:build-base

          Pass --verbose to see the stacktrace.

    ✔  nx run dbx-analytics:build (1s)
    ✔  nx run demo-api:build-base  [local cache]
    ✔  nx run firebase-server-test:build (679ms)
    ✔  nx run dbx-firebase:build-base:production  [existing outputs match the cache, left as is]
    ✔  nx run demo-api:build (1s)
    ✔  nx run dbx-web:build (5s)
    ✔  nx run demo-components:build-base:production  [existing outputs match the cache, left as is]
    ✔  nx run dbx-form:build (3s)
    ✔  nx run demo:build-base:production  [local cache]
    ✔  nx run dbx-firebase:build (2s)
    ✔  nx run demo-components:build (945ms)
    ✔  nx run demo:build (1s)

 —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

 >  NX   Ran target build for 21 projects and 21 task(s) they depend on (16s)

    ✔    41/42 succeeded [21 read from cache]

    ✖    1/42 targets failed, including the following:
         - nx run firebase-server:build

   See Nx Cloud run details at https://nx.app/runs/iAKpDMegcx7

To get around this I need to run it by itself:

dereekb@dbMBP dbcomponents % nx run firebase-server:build

   ✔    6/6 dependent project tasks succeeded [6 read from cache]

   Hint: you can run the command with --verbose to see the full dependent project outputs

 —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

> nx run firebase-server:build-base  [local cache]

Compiling TypeScript files for project "firebase-server"...
Done compiling TypeScript files for project "firebase-server".

> nx run firebase-server:build

 >  NX   Running target build-base for project firebase-server-test and 9 task(s) it depends on

> nx run firebase-server-test:build-base  [local cache]

Compiling TypeScript files for project "firebase-server-test"...
Done compiling TypeScript files for project "firebase-server-test".

 >  NX   Successfully ran target build-base for project firebase-server-test

   Nx read the output from the cache instead of running the command for 10 out of 10 tasks.

 —————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————————

 >  NX   Successfully ran target build for project firebase-server and 7 task(s) it depends on (5s)

   Nx read the output from the cache instead of running the command for 7 out of 8 tasks.

   See Nx Cloud run details at https://nx.app/runs/0G5SDuPJGEe

I was able to reproduce this on a basic/new project as well: https://github.com/dereekb/nx-build-test

dereekb commented 2 years ago

After playing with it for a while I decided the best thing to do was change how things get built.

Running "build-base" in parallel seems to be fine, since everything stays in a single Nx context; that is, no @nrwl/workspace:run-commands target spawns a new nx run in a separate context that then tries to run "build-base" again, which is what happens when I call "build".

I think one piece of the puzzle is that the new version doesn't seem to handle parallel contexts or locking very well, whereas 14.1.8 didn't seem to care, and I could use @nrwl/workspace:run-commands with no regard for the cache or builds.

Changes:

  1. I've updated my project.json files to only include build if they have build steps, and left build-base as-is.
  2. I changed my targetDefaults to no longer have build depend on build of children. Since the result of build is really only used in npm this is fine.
  "targetDefaults": {
    "build": {
      "dependsOn": ["build-base"]
    },
    "build-base": {
      "dependsOn": ["^build-base"]
    },
  3. I added a build-all to workspace.json:
    "build-all": {
      "executor": "@nrwl/workspace:run-commands",
      "options": {
        "commands": [
          {
            "description": "build the base of all projects sequentially.",
            "command": "npx nx run-many --target=build-base --parallel --max-parallel=3"
          },
          {
            "description": "build each of the projects that need to be built in a sequential order",
            "command": "npx nx run-many --target=build --parallel=false"
          }
        ],
        "color": true,
        "parallel": false
      }
    },

So all of my build-base targets build in parallel and get cached, and then build runs sequentially.

This ALMOST worked; it failed because of what I think is some sort of cache error.

For reference, here's firebase-server's build target:

    "build": {
      "executor": "@nrwl/workspace:run-commands",
      "outputs": ["dist/packages/firebase-server"],
      "options": {
        "commands": [
          {
            "command": "npx nx run firebase-server-test:build-base"
          }
        ],
        "parallel": false
      }
    },

It seems that when npx nx run firebase-server-test:build-base is called, firebase's cached build isn't available, or Nx tries to build it again. You all might have more insight into what's going on behind the scenes. Another possibility is that right after firebase-server:build finishes, run-many kicks off firebase-server-test:build-base, and the two contexts conflict.

I added firebase as an implicit exclusion in firebase-server-test's project.json file, and now it seems to be fine.

Now I can run npx nx run-many --target=build --parallel=false --skip-nx-cache and my project builds every time without issue.

Edit:

I dug a bit more and realized something else is happening. Each time build is called, it calls build-base, which removes that project's dist folder. So if I call build for util and it performs its side effects, and then another dependency invokes build-base for util, the dist folder for util is removed and replaced with the result of build-base.
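
The sequence described above can be replayed with an in-memory stand-in for the dist folder. This is only a sketch of the ordering problem: the `buildBase` and `build` functions and the file names are hypothetical, and it assumes, as observed above, that build-base clears its output folder before writing.

```typescript
// In-memory stand-in for dist/packages/<project>; file names are made up.
type Dist = Map<string, Set<string>>;

// Assumption: build-base wipes the project's output folder, then writes fresh output.
function buildBase(dist: Dist, project: string): void {
  dist.set(project, new Set(["index.js", "package.json"]));
}

// Assumption: build adds a side-effect file on top of build-base's output.
function build(dist: Dist, project: string): void {
  if (!dist.has(project)) buildBase(dist, project);
  dist.get(project)!.add("side-effect.txt");
}

const dist: Dist = new Map();
build(dist, "util"); // util now has its side-effect file
const before = dist.get("util")!.has("side-effect.txt");
buildBase(dist, "util"); // a dependent re-triggers util:build-base
const after = dist.get("util")!.has("side-effect.txt");

// The side-effect file does not survive the second build-base.
console.log({ before, after });
```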

I'm no longer sure what the best way is to keep side effects intact when outputs get cached. I have a feeling I'm running in circles here.

My next thought is to have everything depend on build instead of build-base but I feel like I'm going to end up right back where I started.

dereekb commented 2 years ago

Ok, I figured it out. The above thought worked with some minor caveats that aren't an issue.

The trick was to make everything use build and work off that. I did run into an issue where some builds formed an infinite loop, but I was able to avoid that by adding "dependsOn": [] to prevent the target from calling the parent build, which called the child build-base, which required build, etc.
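
The loop can be made visible by treating each project:target pair as a node and following dependsOn edges until a task repeats. The sketch below is a generic depth-first cycle check, not anything Nx ships; the `findCycle` helper and the `parent`/`child` task names are hypothetical stand-ins for the projects involved, and which task actually received the empty dependsOn is an assumption.

```typescript
// Generic depth-first cycle check over a task graph; not part of Nx.
type TaskGraph = Record<string, string[]>; // task -> tasks it depends on

// Hypothetical helper: returns the first cycle found as a path, or null.
function findCycle(graph: TaskGraph): string[] | null {
  const done = new Set<string>();
  const visit = (task: string, path: string[]): string[] | null => {
    const at = path.indexOf(task);
    if (at !== -1) return [...path.slice(at), task]; // task repeats: cycle
    if (done.has(task)) return null; // fully explored earlier, no cycle through it
    for (const dep of graph[task] ?? []) {
      const cycle = visit(dep, [...path, task]);
      if (cycle) return cycle;
    }
    done.add(task);
    return null;
  };
  for (const task of Object.keys(graph)) {
    const cycle = visit(task, []);
    if (cycle) return cycle;
  }
  return null;
}

// Made-up task names mirroring the loop described above.
const looping: TaskGraph = {
  "child:build": ["parent:build"],
  "parent:build": ["child:build-base"],
  "child:build-base": ["child:build"],
};

// Same graph with one task given an empty dependency list, as with "dependsOn": [].
const fixed: TaskGraph = { ...looping, "child:build-base": [] };

console.log(findCycle(looping)); // a path that starts and ends at the same task
console.log(findCycle(fixed)); // null
```

Emptying the dependency list of any single task on the loop is enough to break it; the trade-off is that the task no longer waits for whatever it used to depend on.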

I'm going to leave this open. I feel the incremental-build guide I was following was misleading, and I wouldn't recommend setting things up that way. After my journey, it doesn't seem like using build-base in targetDefaults works well (if at all) with a larger project.

https://github.com/leosvelperez/nx/blob/master/docs/shared/guides/setup-incremental-builds-angular.md

I guess the repo it references hasn't even been updated in a long time and doesn't even use the new system.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it hasn't had any recent activity. It will be closed in 14 days if no further activity occurs. If we missed this issue please reply to keep it active. Thanks for being a part of the Nx community! 🙏

github-actions[bot] commented 1 year ago

This issue has been closed for more than 30 days. If this issue is still occurring, please open a new issue with more recent context.