Discussion: Is it possible to add return value to heap-analysis plugins?

MoonShadowV commented 1 year ago

Hi team,

Thanks for a lot for let this amazing project open source!

I am from the Lark desktop Team of ByteDance, and we would like to use memlab to help us find memory leaks.

In our scenario, Lark is a desktop app, not a SPA webpage, we can't use memlab to start a puppeteer and run the e2e test, and the cost to rewrite all of out test cases using memlab is too high to afford.

Thus we want to use our existing e2e framwork to take snapshot, and pass it to a node server running memlab to analyse it.

Here is a minimal demo running memlab at a node server https://github.com/MoonShadowV/memlab-demo-server

It works fine for findLeaks as it dumps the result to the disk => /data/out/leaks.txt

import { findLeaks, BrowserInteractionResultReader } from '@memlab/api';
import { MEMLAB_WORKDIR } from './constant';
async function memlabFindLeak() {
  try {
    const reader = BrowserInteractionResultReader.from(MEMLAB_WORKDIR);
    const leaks = await findLeaks(reader);
    return Boolean(leaks);
  } catch (error) {
    throw new Error(`memlab analyse failed: ${error}`);
  }
}

export { memlabFindLeak };

But if we want to analyse single snapshot, the heap-analysis plugin return nothing, just print it to the termianl. We can't get the output.

eg: https://github.com/facebook/memlab/blob/main/packages/heap-analysis/src/plugins/DetachedDOMElementAnalysis.ts#L50 In this plugin, the process function just call

  async process(options: HeapAnalysisOptions): Promise<void> {
    const snapshot = await pluginUtils.loadHeapSnapshot(options);
    this.detachedElements = pluginUtils.filterOutLargestObjects(
      snapshot,
      utils.isDetachedDOMNode,
    );
    pluginUtils.printNodeListInTerminal(this.detachedElements); // print result to terminal, return nothing
  }

to print the detachedElements in the terminal.

we can not access it from outside. ↓

(async function () {
  const analyse = new DetachedDOMElementAnalysis();
  await analyse.analyzeSnapshotFromFile(`./s2.heapsnapshot`); // just return null
})();

Is it possible to add return value to heap-analysis plugins so that we can to access it's result in the code?

(async function () {
  const analyse = new DetachedDOMElementAnalysis();
  // get the detachedeElements
  const result = await analyse.analyzeSnapshotFromFile(`./s2.heapsnapshot`); 
})();

JacksonGL commented 1 year ago

Hi @MoonShadowV Thanks for sharing the context.

analyse.analyzeSnapshotFromFile returns void mainly because it's a method defined in BaseAnalysis extended by all heap analyzers, each of which may generate a different type of output.

The current solution is to provide analysis-specific API to query the result after calling analyse.analyzeSnapshotFromFile.

Here is an example: https://facebook.github.io/memlab/docs/api/classes/heap_analysis_src.ObjectSizeAnalysis/#analyzesnapshotfromfilefile

I am open to hear if there are other ideas. We can add a query API for DetachedDOMElementAnalysis if that works for your use case.

MoonShadowV commented 1 year ago

Hi @MoonShadowV Thanks for sharing the context.

analyse.analyzeSnapshotFromFile returns void mainly because it's a method defined in BaseAnalysis extended by all heap analyzers, each of which may generate a different type of output.

The current solution is to provide analysis-specific API to query the result after calling analyse.analyzeSnapshotFromFile.

Here is an example: https://facebook.github.io/memlab/docs/api/classes/heap_analysis_src.ObjectSizeAnalysis/#analyzesnapshotfromfilefile

I am open to hear if there are other ideas. We can add a query API for DetachedDOMElementAnalysis if that works for your use case.

Hi @JacksonGL Thanks for your reply!

Add a query API for heap-analysis will help a lot! But it will need to add query API to all of the heap-analysis plugins, which seems will cost a lot of time?

We want to integrate memlab into our CI/CD pipeline. In our scenario, the least intrusive way is to expose the capabilities of memlab through the interface of the node server.

Backgroud

The fundamental reason why we need to get the result is that now the result is printed on the terminal, and we have no way to get the messages(The heap-analysis result) printed to the terminal.

We need a way to access the memlab's result and pass it to the next CI pipeline which will generate a report to our QA&RD.

(we can modify the cosnole.log method by monkey patch it,change it's behavior to dump the log to the disk. But the log of memlab and the log of the server will be mixed together).

In a nutshell, we just want get the messages that memlab print on the terminal.

Possible solution?

I think another way with a lower cost is to modify the logger method of memlab, let it not only print on the terminal, but also dump the log to the disk. https://github.com/facebook/memlab/blob/7bd664bbeefabcb9cca6df1468f4f747b49392d8/packages/core/src/lib/Console.ts#L246

But it seems hacky.

So there are two ways to solve the problem, or is there another better solution?

add query API to all of the heap-analysis plugins
modify the logger method of memlab

I can contribute code to this, but it's up to you to figure out the solution.

JacksonGL commented 1 year ago

@MoonShadowV I can enable memlab to dump all memlab's console output to

$(memlab get-default-work-dir)/data/cur/console-log.txt

This logging-to-disk feature is currently only enabled for MemLab CLI, we can also enable it for MemLab JS API.

MoonShadowV commented 1 year ago

@MoonShadowV I can enable memlab to dump all memlab's console output to
$(memlab get-default-work-dir)/data/cur/console-log.txt
This logging-to-disk feature is currently only enabled for MemLab CLI, we can also enable it for MemLab JS API.

Awesome!

If memlab can enable it for JS API，for example

(async function () {
  const analyse = new DetachedDOMElementAnalysis();
  analyse.setWorkDir('./tmp');// just for demonstration, doesn't represent the final implementation
  await analyse.analyzeSnapshotFromFile(`./s2.heapsnapshot`); // dump the log to "./tmp/data/cur/console-log.txt"
})();

In this way, we can easily read and parse the output using Node JS, which helps a lot.

JacksonGL commented 1 year ago

@MoonShadowV I just landed a few changes:

memlab will dump all console output to the console log file (when called through API):
```
$(memlab get-default-work-dir)/data/cur/console-log.txt
```
- Each await analysis.analyzeSnapshotFromFile(snapshotFile) call or analyzeSnapshotsInDirectory(snapshotDirectory) call will return a result object containing a analysisOutputFile property that points to a file containing the console output for that specific heap analysis call. E.g.,
```
// analysis console output is saved in result.analysisOutputFile
const result = await analysis.analyzeSnapshotFromFile(snapshotFile);
```

// console output during the heap analysis is dumped in the following file location result.analysisOutputFile



Let me know if that suits your needs or if anything doesn't work as expected.

MoonShadowV commented 1 year ago

@MoonShadowV I just landed a few changes:

memlab will dump all console output to the console log file (when called through API):
$(memlab get-default-work-dir)/data/cur/console-log.txt
Each await analysis.analyzeSnapshotFromFile(snapshotFile) call or analyzeSnapshotsInDirectory(snapshotDirectory) call will return a result object containing a analysisOutputFile property that points to a file containing the console output for that specific heap analysis call. E.g.,
// analysis console output is saved in result.analysisOutputFile
const result = await analysis.analyzeSnapshotFromFile(snapshotFile);

// console output during the heap analysis is dumped in the following file location
result.analysisOutputFile
Let me know if that suits your needs or if anything doesn't work as expected.

@JacksonGL

This works for me~

But I think we can make the api more flexible by adding optional parameters to analyzeSnapshotFromFile and analyzeSnapshotsInDirectory?

So we can pass the FileOption param to reset work-dir, which makes the JSAPI behave the same as the CLI API.

  public async analyzeSnapshotFromFile(
    file: string,
    options?: FileOption // allow us override the default work-dir
  ): Promise<AnalyzeSnapshotResult> {
    const analysisOutputFile = fileManager.initNewHeapAnalysisLogFile(options);
    info.registerLogFile(analysisOutputFile);
    await this.process({
      args: {
        _: [],
        snapshot: file,
        'snapshot-dir': '<MUST_PROVIDE_SNAPSHOT_DIR>',
      },
    });
    info.unregisterLogFile(analysisOutputFile);
    return {analysisOutputFile};
  }

For examle, We can reset the work-dir using cli, but currently we can not reset work-dir using JS API.

memlab analyze object-size --work-dir='./heap' --snapshot='./s2.heapsnapshot'

The work-dir will be $(memlab get-default-work-dir), and we can't change it.

(async function () {
  const analyse = new ObjectSizeAnalysis();
  const res = await analyse.analyzeSnapshotFromFile(`./s2.heapsnapshot`);
})();

JacksonGL commented 1 year ago

That makes sense. We can add the option that supports setting an optional working directory for heap analysis calls

JacksonGL commented 1 year ago

@MoonShadowV I added support for setting optional working directory. Let me know if the new API is flexible for your use case.

facebook / memlab

Discussion: Is it possible to add return value to heap-analysis plugins? #52

Backgroud

Possible solution?