broccolijs / broccoli

Browser compilation library – an asset pipeline for applications that run in the browser
https://broccoli.build
MIT License
3.33k stars 217 forks source link

ensure `builder.cleanup()` waits on pending work #393

Closed stefanpenner closed 5 years ago

stefanpenner commented 5 years ago

Prior to this, if cleanup was invoked while the build was cancelling, it would begin to do cleanup prior to the current step completing. Which would typically result in "missing directory" or "missing file" errors, which would then result in the build crashing and not completing it cleanup rather then exiting cleanly.

stefanpenner commented 5 years ago

@oligriffiths this is likely the solution to the problem we spoke about on Wednesday. I finally had some time to debug it...

oligriffiths commented 5 years ago

Looks sane. Fix tests and we’re good I think.

stefanpenner commented 5 years ago

@oligriffiths aggregateTime seems to OOM when we don't crash during mid-build cancellation. I'm investigating why, and see if I can fix...

stefanpenner commented 5 years ago

Although this is green, I haven't been able to fix the OOM issue (or add a test here that demonstrates it) So we should hold-off until i, or someone else, has a chance to look into that.

oligriffiths commented 5 years ago

@stefanpenner got a repo to demo the OOM?

oligriffiths commented 5 years ago

Ok so I reproduced the issue: (ember build, wait for 'building...' then cmd+c)

ember build
Environment: development
cleaning up...
cleaning......
Cannot stop: not the current node.

<--- Last few GCs --->

[68590:0x102801000]    13909 ms: Mark-sweep 998.5 (1045.1) -> 828.5 (875.1) MB, 450.2 / 0.0 ms  allocation failure GC in old space requested
[68590:0x102801000]    14610 ms: Mark-sweep 828.5 (875.1) -> 828.4 (838.6) MB, 701.4 / 0.0 ms  last resort GC in old space requested
[68590:0x102801000]    15083 ms: Mark-sweep 828.4 (838.6) -> 828.4 (838.1) MB, 472.2 / 0.0 ms  last resort GC in old space requested

<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x38b802e0427d]
Security context: 0x15b0e6c206a9 <JSObject>
    1: aggregateTime(aka aggregateTime) [/Users/goli/Projects/broccoli/lib/builder.js:~448] [pc=0x38b803389be9](this=0x15b0095022e1 <undefined>)
    2: buildHeimdallTree [/Users/goli/Projects/broccoli/lib/builder.js:400] [bytecode=0x15b0a04faca9 offset=21](this=0x15b0b9f13439 <EventEmitter map = 0x15b02a42ed51>,outputNodeWrapper=0x15b05fa22971 <NodeWrapper m...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

Note the Cannot stop: not the current node. line, it's important.

During cleaning, heimdall is being messed with as it is being started/stopped in the wrong order. What happens is a node starts being build, and heimdall.start() is called from beginNode fire in Broccoli. Then cleanup() is called and https://github.com/ember-cli/ember-cli/blob/master/lib/models/builder.js#L250 starts a new heimdall node, and then the endNode is fired for the currently building node, then the cleanup heimdall node is stopped.

So for now, I've added a _isCleaning flag to skip doing the heimdall doing the aggregate time.

➜  ember build
Environment: development
cleaning up...
cleaning......
Cannot stop: not the current node.
Built project successfully. Stored in "dist/".

So the same Cannot stop: not the current node. is output (issue still exists) but the OOM is gone.

oligriffiths commented 5 years ago

So, I added some debug statements to beginNode and endNode. Even when cancelling the ember-cli build, the broccoli build doesn't get cancelled. Example: ember-cli-build.js

'use strict';

const Plugin = require('broccoli-plugin');

class Sleep extends Plugin {
  build() {
    return new Promise(resolve => {
      console.log('Going to sleep');
      setTimeout(() => {
        console.log('Waking up');
        resolve();
      }, 2000);
    });
  }
}

module.exports = function(defaults) {
  return new Sleep([
    new Sleep([], { name: 'Sleep 1' })
  ], { name: 'Sleep 2' });
};
➜  ember-test git:(master) ✗ ember build
Environment: development
START Sleep 1
Going to sleep
cleaning up...
Waking up
DONE Sleep 1
Cannot stop: not the current node.
START Sleep 2
Going to sleep
Waking up
DONE Sleep 2
Built project successfully. Stored in "dist/".

CMD+C was initiated after the first Going to sleep, see cleaning up.... It completes after, but then the next node continues to build Sleep 2, this doesn't seem right...

stefanpenner commented 5 years ago

@oligriffiths ya, looks like something else is fishy...

oligriffiths commented 5 years ago

@stefanpenner yeah so broccoli-builder did call cancel() https://github.com/ember-cli/broccoli-builder/blob/0-18-x/lib/builder.js#L212

However Broccoli 2 does not, and if you do call cancel, you get an exception due to https://github.com/broccolijs/broccoli/blob/master/lib/builder.js#L109 and https://github.com/broccolijs/broccoli/blob/master/lib/builder.js#L119

I don't know what the intention behind those 2 lines is, can you shed some light? Seems like we need to do some refactoring.

Imo, the flow should be

stefanpenner commented 5 years ago

@oligriffiths I've fixed it

stefanpenner commented 5 years ago

@oligriffiths updated, mind re-reviewing?

oligriffiths commented 5 years ago

@stefanpenner something is still off. Trying with a new ember install, hitting CMD+C after building... is printed to console:

Environment: development
cleaning up...
Cannot stop: not the current node.
BUILD CANCELLED

Stack Trace and Error Report: /var/folders/v0/v1xr03j523d3py2sp5m4_gsm0000gn/T/error.dump.210e83fb537b3243555bd0c7f3062c40.log

// log
ERROR Summary:

  - broccoliBuilderErrorStack: [undefined]
  - code: [undefined]
  - codeFrame: [undefined]
  - errorMessage: BUILD CANCELLED
  - errorType: [undefined]
  - location:
    - column: [undefined]
    - file: [undefined]
    - line: [undefined]
  - message: BUILD CANCELLED
  - name: Cancelation
  - nodeAnnotation: [undefined]
  - nodeName: [undefined]
  - originalErrorMessage: [undefined]
  - stack: Cancelation: BUILD CANCELLED
    at CancelationRequest.throwIfRequested (/Users/goli/Projects/broccoli/lib/cancelation-request.js:17:13)
    at promiseFinally.then (/Users/goli/Projects/broccoli/lib/builder.js:115:48)
    at process._tickCallback (internal/process/next_tick.js:68:7)

Cannot stop: not the current node. means we have unbalanced heimdall nodes.

stefanpenner commented 5 years ago

@oligriffiths ya we explicitly fail the build, as the build was not successful so the "BUILD CANCELLED" is expected. That being said, i cannot replicate the Cannot stop: not the current node message myself.

oligriffiths commented 5 years ago

Adding a log to startNode and endNode you get this:

➜  ember-test git:(master) ember build
Environment: development
Start: EMPTY_MERGE_TREE
End: BroccoliMergeTrees
Start: ProcessedTemplateTree
End: Funnel
Start: /Users/goli/Projects/ember-test/app/styles
End: UnwatchedDir
Start: Funnel (styles)
End: Funnel
Start: /Users/goli/Projects/ember-test/tests
End: WatchedDir
Start: broccoli-persistent-filter:EslintValidationFilter
cleaning up...
End: broccoli-persistent-filter:EslintValidationFilter
Cannot stop: not the current node.
BUILD CANCELLED
stefanpenner commented 5 years ago

@oligriffiths we should likely make BUILD CANCELLED less screaming case.

stefanpenner commented 5 years ago

@oligriffiths are you sure the heimdall warning is from broccoli, and not something in ember-cli?

stefanpenner commented 5 years ago

kill logging: https://github.com/heimdalljs/heimdalljs-lib/commit/baac3fe53972afccab18aa1643f65dd56161ef22

* 43d1ac2 - (HEAD -> v0.2.x, tag: v0.2.6) release v0.2.6 🎉 (7 months ago) <Stefan Penner>
* 28cbde3 - (origin/v2-kill-bad-logging) remove non-actionable logging (7 months ago) <Stefan Penner>
stefanpenner commented 5 years ago

released as v2.2.0 🎉

oligriffiths commented 5 years ago

👏