aquaproj / aqua

Declarative CLI Version manager written in Go. Support Lazy Install, Registry, and continuous update with Renovate. CLI version is switched seamlessly
https://aquaproj.github.io

Convert github_content registries including Standard Registry to SQLite for performance when installing them #2520

Open suzuki-shunsuke opened 10 months ago

suzuki-shunsuke commented 10 months ago

Feature Overview

Convert github_content registries including Standard Registry to SQLite when installing them.

Why is the feature needed?

Similar to https://github.com/aquaproj/aqua/issues/2517.

To improve the performance of reading the Standard Registry. The Standard Registry is a huge YAML file of over 30,000 lines, and aqua needs to read the entire file, so reading it adds a little overhead. By converting the YAML to SQLite, aqua doesn't need to read all of it. Registry maintainers don't need to do the conversion themselves because aqua converts the registry internally.

How to reproduce the issue

No response

Workaround

No response

Example Code

aqua converts registry.yaml to registry.yaml.sqlite3 when it installs registries. When aqua reads a registry, it tries to read registry.yaml.sqlite3 first. If registry.yaml.sqlite3 isn't found, aqua looks for registry.yaml. If registry.yaml is found, aqua creates registry.yaml.sqlite3 from it. If registry.yaml isn't found, aqua installs it.

Reference

suzuki-shunsuke commented 10 months ago

I'm not sure aqua would really get faster with SQLite3. I'm not familiar with SQLite3, but an RDB itself has overhead. The Standard Registry is a huge YAML file, so SQLite3 may be useful there, but almost all local and github_content registries are small, so SQLite3 might make aqua slower for them. So aqua may need to support both SQLite3 and YAML. SQLite3 support would make aqua more complicated. Unlike the JSON conversion, I guess we would need to change a lot of code for SQLite3.

Anyway, when it comes to performance, we should measure, not guess.

sheldonhull commented 9 months ago

I recall we chatted about this in a past discussion. Are you beginning to see performance impact?

Also, if the size of a single file is the problem, I'm curious whether you've thought about keeping the registry as the actual split YAML files, without merging them into a single file, and just having Go load all of them from the directory.

Just curious, so no rush on a response. There are a few cool local storage packages, and I'm interested to see how this works for you.

suzuki-shunsuke commented 9 months ago

Are you beginning to see performance impact?

I don't think so. I saw a small complaint about aqua's performance on X (formerly Twitter), but I think aqua is fast enough. So I don't have any motivation to change the code drastically for performance. But if we can improve performance with small changes, that would be great.

Also, if the size of a single file is the problem, I'm curious whether you've thought about keeping the registry as the actual split YAML files, without merging them into a single file, and just having Go load all of them from the directory.

I hadn't thought of that. Indeed, the Standard Registry is split by package name (e.g. cli/cli => pkgs/cli/cli/registry.yaml), so aqua could read only the necessary files. It's interesting. One concern is that when aliases are used, aqua can't find the registry.yaml, but that is an edge case.