kubeshop / vscode-monokle

An extension for Visual Studio Code to validate your Kubernetes configuration
https://marketplace.visualstudio.com/items?itemName=kubeshop.monokle
MIT License

Manifests scanning performance improvements #68

Closed f1ames closed 8 months ago

f1ames commented 8 months ago

This PR fixes https://github.com/kubeshop/monokle-saas/issues/2258.

TL;DR: There are not many changes here, since the main issue was solved by a recent PR. See Analysis below.

Changes

Fixes

Analysis

A bit on files finding

What is used now for scanning for manifests is the native workspace.findFiles. It takes into account files ignored in VSC via files.exclude but does not look at .gitignore.

Measuring times for the monokle-saas repo, it seems that parsing files, not finding them, takes most of the time:

image

Interestingly, we got 50 resources from 837 files, which means most files are not K8s resources (I quickly checked, and just by ignoring node_modules we get down to 137 files and less than 5 seconds).

Still, the initial issue is not about time, but resources. Finding yamls is the only place where we do any find with globbing, so this is a solid candidate for triggering resource-intensive rg calls (and the first thing to check is whether excluding large dirs, like node_modules, makes it less CPU demanding).
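To make the node_modules point concrete, here is a minimal sketch of filtering found paths before any parsing happens. The helper name `excludeIgnoredDirs` and the sample paths are made up for illustration and are not part of the extension; in the real extension the same effect could likely be achieved by passing an exclude glob as the second argument of `workspace.findFiles`.

```typescript
// Hypothetical helper (not from the extension): drop every path that sits
// under one of the ignored directories, so it never reaches the YAML parser.
function excludeIgnoredDirs(paths: string[], ignoredDirs: string[]): string[] {
  const ignored = new Set(ignoredDirs);
  return paths.filter((path) => {
    // Look at directory segments only, i.e. everything except the file name.
    const dirSegments = path.split("/").slice(0, -1);
    return !dirSegments.some((segment) => ignored.has(segment));
  });
}

// Made-up sample of what a workspace-wide YAML search might return.
const found: string[] = [
  "k8s/deployment.yaml",
  "node_modules/some-pkg/.travis.yml",
  "packages/app/node_modules/dep/config.yaml",
  "charts/values.yaml",
];

const candidates = excludeIgnoredDirs(found, ["node_modules"]);
console.log(candidates);
```

Only the two paths outside node_modules remain, including when node_modules is nested deeper in the tree.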

On rg

I tested how rg behaves when switching branches in the monokle-saas repo (from some old feature branch to main). What I noticed is that the heavy rg activity mostly happens when there are lots of changes to pull, i.e. when switching to a new branch for the first time. So the procedure for me was:

  1. Install specific extension version (from marketplace or local build).
  2. Open local test repo in VSCode.
  3. Switch to main branch (with latest changes) in local test repo.
  4. Restart VSCode.
  5. Switch to new branch/tag (old one, which is not checked out locally).
  6. Run Monokle: Validate.

0.6.5

First, the current 0.6.5 version. The results are similar to those mentioned in the initial issue: it goes a bit wild (output from atop with a 1s interval; each screenshot is the next snapshot):

image image image image

With file watcher changes

There were significant changes to how files are processed in https://github.com/kubeshop/vscode-monokle/pull/62, so I also tested with those changes (not released yet). In particular, we got rid of the inefficient file finding logic. Notice lines 122 and 142 below:

https://github.com/kubeshop/vscode-monokle/blob/63a09846c850fa6e12158b15ebeef1baad616a17/src/utils/workspace.ts#L121-L142

☝️ So for every workspace there was a watcher using globbing (L122). One thing that could happen is that switching branches triggered it for multiple files, and for each file it would get resource ids (L142).

https://github.com/kubeshop/vscode-monokle/blob/63a09846c850fa6e12158b15ebeef1baad616a17/src/utils/workspace.ts#L216-L222

☝️ Now, getting a resource id calls findYamlFiles for each file.

https://github.com/kubeshop/vscode-monokle/blob/63a09846c850fa6e12158b15ebeef1baad616a17/src/utils/workspace.ts#L171-L184

☝️ And findYamlFiles scans the entire workspace again 😓 🙈 Sounds like a good party...
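The cost of that pattern can be shown with a small self-contained simulation. Everything here is hypothetical: `scanWorkspace` is only a stand-in for a real workspace scan, and the numeric "id" is just a position in the YAML list so there is something to look up. The scan-once variant is one possible way to avoid the rescans, not necessarily what the reworked logic does.

```typescript
type ScanStats = { scans: number };

// Stand-in for a full workspace scan; counts how often it gets called.
function scanWorkspace(files: string[], stats: ScanStats): string[] {
  stats.scans += 1;
  return files.filter((f) => f.endsWith(".yaml") || f.endsWith(".yml"));
}

// Shape of the old behaviour: each changed file triggers its own full scan.
function resolveIdsRescanning(changed: string[], files: string[], stats: ScanStats): number[] {
  return changed.map((file) => scanWorkspace(files, stats).indexOf(file));
}

// Alternative: scan once, build an index, then look every file up in it.
function resolveIdsWithIndex(changed: string[], files: string[], stats: ScanStats): number[] {
  const index = new Map<string, number>();
  scanWorkspace(files, stats).forEach((file, i) => index.set(file, i));
  return changed.map((file) => index.get(file) ?? -1);
}

// Made-up workspace and a branch switch touching three files.
const workspaceFiles = ["a.yaml", "b.yml", "README.md", "c.yaml"];
const changedFiles = ["a.yaml", "c.yaml", "b.yml"];

const rescanStats: ScanStats = { scans: 0 };
const indexStats: ScanStats = { scans: 0 };
const idsRescanning = resolveIdsRescanning(changedFiles, workspaceFiles, rescanStats);
const idsWithIndex = resolveIdsWithIndex(changedFiles, workspaceFiles, indexStats);

console.log(idsRescanning, rescanStats.scans);
console.log(idsWithIndex, indexStats.scans);
```

Both variants resolve the same ids, but the first performs one full scan per changed file, so a branch switch touching hundreds of files means hundreds of workspace-wide scans.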

Anyways, as mentioned this logic was reworked entirely. The results with new logic:

First run:

image

Second run:

image

This looks really good. Unfortunately, I found a related regression too - #70. And even though Monokle: Validate is run as part of the test procedure, given the regression I also checked for rg processes during Monokle extension initialization, to see how the first repo scan behaves:

image image image image

Still no rg party which is good 👍

Checklist

WitoDelnat commented 8 months ago

Thank you for this thorough write-up. Great to see that our previous refactor solved the performance problem as well. Let's merge this deferral of validation during initialisation; it's a neat little improvement and these things add up over time.