Somewhat slow performance processing reports; 33% of time spent in findOriginalRange, etc

jakebailey commented 2 months ago

Collating coverage for TypeScript takes some 35 seconds on my machine. I know it's a lot of data so I'm not expecting instant, but I was curious and ran https://www.npmjs.com/package/pprof-it on c8 report to see if there was any low hanging fruit:

$ npm test -- --no-lint --coverage
$ NODE_V8_COVERAGE=$PWD/coverage/tmp pprof-it ./node_modules/c8/bin/c8.js report --experimental-monocart

This spits out a pprof file, which in a flame graph looks like this:

33% of the time is spent in findOriginalRange (rather alignText). If I add a cache in front of alignText, that portion of the runtime is cut in half (about 7s), bringing the overall run to 28 seconds.

I'm not super familiar with this codebase, but was hoping that you may be interested in using some perf tooling here to try and make this even faster!

jakebailey commented 2 months ago

FWIW regular c8 on the same info takes about 11 seconds, so it's not a small overhead (obviously monocart is doing more in a good way too).

cenfun commented 2 months ago

Thanks for the feedback. Indeed, 35s is a bit slow, seems that the performance needs to be optimized. Monocart takes more time than regular c8 to parse the AST of js files. I test it in my local PC, it takes about 15s-20s npx c8 report --experimental-monocart --logging=debug

[MCR] - parsed ast: built/local/run.js (12.2 MB) (2.1s)
[MCR] source filter (built/local/run.js): before 585 => after 585
[MCR] added source files: 585
[MCR] - unpacked dist: built/local/run.js (585 files) (3.5s)
[MCR] - generated data list (586 files) (2.4s)
[MCR] - saved coverage reports (5.4s)
[MCR] generated coverage reports: coverage/index.html (14.7s)

Windows 10
Nodejs 20.15
CPU 13th Gen Intel(R) Core(TM) i5-13600KF

I thought it mainly depends on the speed of the CPU (single thread). Let me figure out how to optimize it.

jakebailey commented 2 months ago

Indeed; when I debugged this, it just looked like the alignText case was interesting just because it repeated the same calls quite a bit.

cenfun commented 2 months ago

I used cache for alignText, but it didn't seem to save much time. see https://github.com/cenfun/monocart-coverage-reports/commit/741db12f262f35250fa5b9f2b9a4e5d553337aa8

[MCR] generated coverage reports: coverage/index.html (14.2s)

Please let me know if you have any new suggestions and finding.

jakebailey commented 2 months ago

I'm not sure that adding two strings together like that is a good cache key (but I'm not sure what they contain), since two differing inputs could concat together into the same string.

My change made a nested Map<string, Map<string, ...>> and then used the two individually as keys.

Hard for me to reason about what's going out without types or more descriptive variable names, unfortunately...

cenfun commented 2 months ago

The purpose of this is to correct the original code position when the sourcemaps is missing. for example: alignText(gt, ot)

gt: generatedText: '"false";\n',
ot: originalText: "'false';\r\n ",

if generatedPos is 7 (provide by v8), we need to find out the originalPos. it should be 7 too here, because the only changes is single quotes to double quotes. but it is not such simple for others, for example

    // generatedText: 'e),',
    // generatedPos: 2,
    // originalText: 'prop))'

    // generatedText: ' = false)',
    // generatedPos: 8,
    // originalText: '=false"'

Currently, Im using diff-sequences to help finding the right original position, It doesn't work for all cases, but it does for most.

cenfun commented 2 months ago

The cache already used Map<string, Map<string, ...>>, and I have tried to do some performance optimization in other ways, but still haven't good results yet. Just added some time logs in the branch performance-issue-47

I wrote a script to test it quickly:

// mcr.js

// node mcr.js
// npx c8 report --experimental-monocart --logging=debug
// npx pprof-it mcr.js

// git clone and check branch `performance-issue-47` 
// npm i && npm run build
const MCR = require("../../workspace/monocart-coverage-reports");

const main = async () => {
    const time_start = Date.now();

    const mcr = MCR({
        logging: "debug",

        // remove previous cache data
        cleanCache: true,

        name: 'My Coverage Report',
        outputDir: "coverage-reports",
        reports: ["lcovonly", "cobertura", "v8", "codecov"],

        entryFilter: {
            "**/node_modules/**": false,
            "**/built/local/**": true,
            "**/src/**": true
        },
    })

    // from raw v8 coverage data
    await mcr.addFromDir("./coverage/tmp/");
    await mcr.generate();

    const duration = (Date.now() - time_start) / 1000;
    console.log(`generated reports`, duration + "s");
}

main();

[MCR] added from dir: ./coverage/tmp/ (3.2s)
[MCR] ┌ [generate] prepared coverage data (714ms)
[MCR] ┌ [generate] merged v8 coverage data (302ms)
[MCR]   ┌ [convert] parsed ast: built/local/run.js (12.2 MB) (2.1s)
[MCR] source filter (built/local/run.js): before 585 => after 585
[MCR] added source files: 585
[MCR]   ┌ [convert] unpacked dist: built/local/run.js (585 files) (2.9s)
[MCR]   ┌ [convert] converted data list (586 files) (2.7s)
[MCR] ┌ [generate] converted coverage data (8.1s)
[MCR] ┌ [generate] saved report: lcovonly (1.7s)
[MCR] ┌ [generate] saved report: cobertura (1.9s)
[MCR] ┌ [generate] saved report: v8 (1.8s)
[MCR] ┌ [generate] saved report: codecov (40ms)
[MCR] generated coverage reports: coverage-reports/index.html (14.6s)
generated reports 17.759s

cenfun commented 2 months ago

Please try latest version monocart-coverage-reports@2.8.6, I thought there could be 10%-30% performance improvement.

using my local CPU (process.env.MCR_LOG_TIME = true)

[MCR] added from dir: ./coverage/tmp/ (3.1s)   
[MCR] ┌ [generate] prepared coverage data (4ms)
[MCR] ┌ [generate] merged v8 coverage data (303ms)
[MCR]   ┌ [convert] parsed ast: built/local/run.js (12.2 MB) (2.5s)
[MCR]   ┌ [convert] unpacked sourcemap: built/local/run.js (585 files) (2.5s)
[MCR]   ┌ [convert] converted data list (585 files) (1.1s)
[MCR] ┌ [generate] converted coverage data (6.1s)
[MCR] ┌ [generate] saved report: lcovonly (708ms)
[MCR] ┌ [generate] saved report: cobertura (825ms)
[MCR] ┌ [generate] saved report: v8 (694ms)
[MCR] ┌ [generate] saved report: codecov (34ms)
[MCR] generated coverage reports: coverage-reports/index.html (8.7s)
generated reports 11.807s - previous is 17s

using github action with default runner https://github.com/cenfun/mcr-test/actions/runs/9729404431/job/26851249849 CPU: AMD EPYC 7763 64-Core Processor 4 threads

[MCR] added from dir: /home/runner/work/mcr-test/mcr-test/coverage/tmp (1.3s)
[MCR] ┌ [generate] prepared coverage data (1ms)
[MCR] ┌ [generate] merged v8 coverage data (250ms)
[MCR]   ┌ [convert] parsed ast: built/local/run.js (12.2 MB) (6.9s)
[MCR]   ┌ [convert] unpacked sourcemap: built/local/run.js (585 files) (5.5s)
[MCR]   ┌ [convert] converted data list (585 files) (1.9s)
[MCR] ┌ [generate] converted coverage data (14.2s)
[MCR] ┌ [generate] saved report: lcovonly (1.1s)
[MCR] ┌ [generate] saved report: cobertura (1.3s)
[MCR] ┌ [generate] saved report: v8 (1s)
[MCR] ┌ [generate] saved report: codecov (30ms)
[MCR] generated coverage reports: coverage/index.html (17.9s)  - previous is 26s

jakebailey commented 2 months ago

Things do seem to be improved; down to the ~27 seconds I got before, nice!

The next area of interest seems to be the comment parsing:

Some 20% here. The top functions by self time:

pprof-heap-575234.pb.gz pprof-time-575234.pb.gz

(I will be out of town but do hope to look at some of this if you don't beat me to it)

cenfun commented 2 months ago

Yes, I created a separate utility tool monocart-locator which is including parsing blank lines and comment lines. but i don't think it has much potential for improving performance.

source files 585, duration 498 ms
┌─────────┬────────────────────────────────────────────────────┬──────────┬───────┐
│ size*   │ source                                             │ comments │ time  │
├─────────┼────────────────────────────────────────────────────┼──────────┼───────┤
│ 3031387 │ checker.ts                                         │ 6,388    │ 87 ms │
│ 536753  │ parser.ts                                          │ 1,685    │ 15 ms │
│ 517578  │ diagnosticInformationMap.generated.ts              │ 26       │ 7 ms  │
│ 481679  │ utilities.ts                                       │ 597      │ 8 ms  │
│ 467980  │ types.ts                                           │ 2,435    │ 16 ms │
│ 333866  │ nodeFactory.ts                                     │ 922      │ 9 ms  │
│ 286714  │ program.ts                                         │ 509      │ 8 ms  │
│ 275867  │ completions.ts                                     │ 636      │ 7 ms  │
│ 267577  │ emitter.ts                                         │ 545      │ 8 ms  │
│ 251797  │ fourslashImpl.ts                                   │ 207      │ 11 ms │
│ 247732  │ editorServices.ts                                  │ 646      │ 6 ms  │
│ 231305  │ es2015.ts                                          │ 979      │ 5 ms  │
│ 218243  │ scanner.ts                                         │ 430      │ 9 ms  │
│ 197298  │ session.ts                                         │ 318      │ 6 ms  │
│ 197277  │ binder.ts                                          │ 495      │ 6 ms  │

checker.ts is the biggest file(3M), it takes 87ms with 6k comments. Do you have any suggestions for monocart-locator?

cenfun commented 2 months ago

I do have 30% improving for performance

source files 585, duration 294 ms
┌─────────┬────────────────────────────────────────────────────┬──────────┬───────┐
│ size*   │ source                                             │ comments │ time  │
├─────────┼────────────────────────────────────────────────────┼──────────┼───────┤
│ 3031387 │ checker.ts                                         │ 6,388    │ 50 ms │
│ 536753  │ parser.ts                                          │ 1,685    │ 7 ms  │
│ 517578  │ diagnosticInformationMap.generated.ts              │ 26       │ 4 ms  │
│ 481679  │ utilities.ts                                       │ 597      │ 5 ms  │
│ 467980  │ types.ts                                           │ 2,435    │ 14 ms │
│ 333866  │ nodeFactory.ts                                     │ 922      │ 5 ms  │
│ 286714  │ program.ts                                         │ 509      │ 4 ms  │
│ 275867  │ completions.ts                                     │ 636      │ 3 ms  │
│ 267577  │ emitter.ts                                         │ 545      │ 4 ms  │
│ 251797  │ fourslashImpl.ts                                   │ 207      │ 9 ms  │
│ 247732  │ editorServices.ts                                  │ 646      │ 3 ms  │
│ 231305  │ es2015.ts                                          │ 979      │ 3 ms  │
│ 218243  │ scanner.ts                                         │ 430      │ 3 ms  │
│ 197298  │ session.ts                                         │ 318      │ 3 ms  │

see commit Do you have any other suggestions?

jakebailey commented 2 months ago

I haven't had the time to look deeply into this, though that commit does look like a good change, yeah. I will likely not be able to look at this for a week or so.

cenfun commented 2 months ago

No problem, Thanks. I will publish a new version of monocart-locator. If there are any new improvement suggestions, we can fix it later.

jakebailey commented 2 months ago

I had some time to do some checking. In findInRanges:

This code just filters the list twice, looking for any entry between start and end. This could just be:

return rangeList.find((it) => startPos >= it[startKey] && endPos <= it[endKey]);

However, this didn't seem to change runtime; likely this would be better off as a binary search, if the input were always guaranteed to be sorted.

For the comment parsing, my first instinct will be to drop use of classes (or at least, methods) for these extremely hot getToken calls; method calls are a whole lot slower than regular calls. Or even not do this abstraction at all, instead just writing out the small handful of tokens directly in the comment parser and switch case or something.

cenfun commented 2 months ago

Thanks for the suggestion about findInRanges, it seems that have almost 5% improvement before

[MCR] generated coverage reports: coverage/index.html (memory: 193.1MB) (18.6s) log
[MCR] generated coverage reports: coverage/index.html (memory: 196.6MB) (18.1s) log

after

[MCR] generated coverage reports: coverage/index.html (memory: 196.8MB) (17.3s) log

And also good suggestion about the comment parsing, I will try your approach later.

cenfun commented 2 months ago

seems it has 40% improvement https://github.com/cenfun/monocart-locator/commit/79721c760db6cfcb2198f9a740edf3d39ca55ad5

source files 585, duration 178 ms  <- previous is 294 ms
┌─────────┬───────────────────────────────────────┬──────────┬───────┐
│ size    │ source                                │ comments │ time  │
├─────────┼───────────────────────────────────────┼──────────┼───────┤
│ 3031387 │ checker.ts                            │ 6,388    │ 24 ms │
│ 536753  │ parser.ts                             │ 1,685    │ 5 ms  │
│ 517578  │ diagnosticInformationMap.generated.ts │ 26       │ 5 ms  │
│ 481679  │ utilities.ts                          │ 597      │ 5 ms  │
│ 467980  │ types.ts                              │ 2,435    │ 16 ms │
│ 333866  │ nodeFactory.ts                        │ 922      │ 3 ms  │
│ 286714  │ program.ts                            │ 509      │ 2 ms  │
│ 275867  │ completions.ts                        │ 636      │ 3 ms  │
│ 267577  │ emitter.ts                            │ 545      │ 2 ms  │
│ 251797  │ fourslashImpl.ts                      │ 207      │ 5 ms  │
│ 247732  │ editorServices.ts                     │ 646      │ 2 ms  │
│ 231305  │ es2015.ts                             │ 979      │ 2 ms  │
│ 218243  │ scanner.ts                            │ 430      │ 1 ms  │
│ 197298  │ session.ts                            │ 318      │ 1 ms  │

cenfun commented 2 months ago

A new version 2.9.0 released, it includes all above improvement. Please try it.

jakebailey commented 2 months ago

Another good improvement; down to 23ish seconds in my testing:

jakebailey commented 1 month ago

What I don't quite get is why the parse here takes 4 seconds; sticking a log into monocart:

        const before = performance.now();
        ast = acorn.parse(source, options);
        console.log(`parsed ${source.length} bytes in ${performance.now() - before} ms (${item.sourcePath})`)

Nets:

$ NODE_V8_COVERAGE=$PWD/coverage/tmp node ./node_modules/c8/bin/c8.js report --experimental-monocart
parsed 12759509 bytes in 3981.204510999999 ms (built/local/run.js)

But then I write this code:

const acorn = require("acorn");
const fs = require("fs");

const source = fs.readFileSync("./built/local/run.js", "utf8");

const before = performance.now();
acorn.parse(source, {
    ecmaVersion: "latest",
    allowReturnOutsideFunction: true,
    allowImportExportEverywhere: true,
    allowAwaitOutsideFunction: true,
    allowSuperOutsideMethod: true,
    allowHashBang: true,
});
console.log(`parsed ${source.length} bytes in ${performance.now() - before} ms`);

And it prints:

parsed 12759509 bytes in 660.767887 ms

No idea why parsing via the vendored code is 4x slower.

jakebailey commented 1 month ago

Marginally related, but it would be nice if monocart didn't vendor dependencies; vendoring them makes the dep count go down, sure, but prevents anyone from upgrading the transitive deps (e.g. new syntax comes out and is supported by acorn) or seeing into the extra bundle.

Unfortunately unvendoring acorn in this instance didn't make it faster! It's still >4x slower for some confusing reason.

cenfun commented 1 month ago

Indeed, the acorn from the vendor is slower, but in my tests, it's 20% slower. I am tring to move some modules out from the vendor. (except the ESM modules that cannot be loaded by CJS)

const { acorn } = require('./monocart-coverage-vendor.js');
const acornNew = require('acorn');

const fs = require("fs");
const source = fs.readFileSync("./built/local/run.js", "utf8");

const options = {
    ecmaVersion: "latest",
    allowReturnOutsideFunction: true,
    allowImportExportEverywhere: true,
    allowAwaitOutsideFunction: true,
    allowSuperOutsideMethod: true,
    allowHashBang: true,
}

const t1 = performance.now();
acorn.parse(source, options);
console.log(`parsed ${source.length} bytes in ${Math.round(performance.now() - t1)} ms (vendor)`);

const t2 = performance.now();
acornNew.parse(source, options);
console.log(`parsed ${source.length} bytes in ${Math.round(performance.now() - t2)} ms`);

node parse-test.js
parsed 12765445 bytes in 530 ms (vendor)
parsed 12765445 bytes in 473 ms

jakebailey commented 1 month ago

Yeah, it's just not clear why it's so much slower in the real program. Maybe some GC timing... Not sure. I don't know if there's a clear fix there so I don't mean for you to go really changing much! So far, we're down to about a 2x increase from plain c8 which is still good for the improved "everything".

cenfun commented 1 month ago

please try monocart-coverage-reports@2.9.1 I think it's more easy to debug if it's not compressed into the vendor.

cenfun / monocart-coverage-reports

Somewhat slow performance processing reports; 33% of time spent in findOriginalRange, etc #47