OSD Optimizer: Improve strategy for detecting stale compilation targets

Swiddis commented 1 month ago

Is your feature request related to a problem? Please describe.

Coming from https://github.com/opensearch-project/dashboards-observability/issues/2188, I'm hoping to implement caching for the compilation step of OSD, because this takes more than 10 minutes per CI run. Across all of our integ test workflows, that's multiple hours of compute time per code change.

I found that osd-optimizer determines whether to build files based on the last-modified time (mtime), instead of being based on the actual content of the file. This means that the optimizer can never work in CI, because pulling code will always update the mtime.

Describe the solution you'd like

In .osd-optimizer-cache, instead of the cacheKey depending on mtimes

{
  // ...
  "cacheKey": {
    // ...
    "mtimes": {
      ".../node_modules/@elastic/apm-rum-core/node_modules/error-stack-parser/package.json": 1689369687625,
      ".../node_modules/@elastic/apm-rum-core/node_modules/stackframe/package.json": 1689369687854,
      ".../node_modules/@elastic/apm-rum-core/package.json": 1689369687355,
      ".../node_modules/@elastic/apm-rum/package.json": 1689369687118,
      ".../node_modules/@elastic/charts/dist/theme.scss": 1727807816498.0383,
      ".../node_modules/@elastic/eui/src/global_styling/functions/_colors.scss": 1725572246345.999,
      ".../node_modules/@elastic/eui/src/global_styling/functions/_index.scss": 1725572246345.999,
      ".../node_modules/@elastic/eui/src/global_styling/functions/_math.scss": 1725572246345.999,
      // ...
    }
  },
  // ...
}

Base it on a hash of these files' content (e.g. sha256, blake3, xxhash, siphash...)

{
  // ...
  "cacheKey": {
    // ...
    "file_hashes": {
      ".../node_modules/@elastic/apm-rum-core/node_modules/error-stack-parser/package.json": "f90bcab5e738adcc6cbfe48280633df341a36bfb7a0344903e9122023c86f9b2",
      ".../node_modules/@elastic/apm-rum-core/node_modules/stackframe/package.json": "fd31383a3ec16fbc09be017856b4438c9b8c33f134cf6be25979953d56f5ec73",
      ".../node_modules/@elastic/apm-rum-core/package.json": "6b1cd6ad20a0b7f351dd838b6365b2fcd9cfa8c4185c9e61fd4dccfef2034ae6",
      ".../node_modules/@elastic/apm-rum/package.json": "3135c78159ac6dcf0c00f7b0c21a6c85b4b4e67805f5cf7f766bff10b63f9eb7",
      ".../node_modules/@elastic/charts/dist/theme.scss": "116e43edbab9d1b81e47ad1391b7b159ba7e9546e4d6462a606fdd8057fb6422",
      ".../node_modules/@elastic/eui/src/global_styling/functions/_colors.scss": "22739aa86cde21eac0c9685e61b4c94d986fcb69193c6f8f28d1ebafcae975f1",
      ".../node_modules/@elastic/eui/src/global_styling/functions/_index.scss": "e825b6dccb6a688ca0109c52a632beb90e365bf4463fef1a02a6cfedd438078f",
      ".../node_modules/@elastic/eui/src/global_styling/functions/_math.scss": "866ad28b44e11c7d3c08e1ef65cf5ce0bc3dc35a8bbd3410e9a4baa6b79ea905",
      // ...
    }
  },
  // ...
}

This makes it possible to cache build artifacts even when moving files between machines. Instead of actual_mtime > cache_mtime, check actual_hash != cache_hash.

Describe alternatives you've considered

N/A

Additional context

N/A

Swiddis commented 1 month ago

Experimented with this more on the downstream issue, see https://github.com/opensearch-project/dashboards-observability/issues/2188#issuecomment-2389541496. I experimentally verified that there's no build time savings at all when using the current cache system.

osd_build

Swiddis commented 1 month ago

Proof of concept that works locally, I replaced the getMtimes method in-place to compute hashes instead of the actual mtimes: https://github.com/Swiddis/OpenSearch-Dashboards/commit/1e2529c8e26fc220a38cde2b3697e57a1be0016d.

Only issue is that it'd never get through PR review to keep the old naming of everything. Also need to figure out how to test it in the real build.

Swiddis commented 1 month ago

Set up some job runs connected to the POC:

It works as expected. (That's not a vertical line -- it takes 0.4 seconds to hash all of the source files) osd_build_2

Swiddis commented 1 month ago

To avoid all the renaming work, we can just say MTIME stands for "Modern Tracking of Incremental Modification Events"

opensearch-project / OpenSearch-Dashboards

OSD Optimizer: Improve strategy for detecting stale compilation targets #8428