ohwgiles / laminar

Fast and lightweight Continuous Integration
https://laminar.ohwg.net
GNU General Public License v3.0

Storing list of artifacts in database #195

Open KAction opened 1 year ago

KAction commented 1 year ago

Currently, Laminar lists the $ARCHIVE directory of a particular run every time it displays the artifact links at the top of the page:

    KJ_IF_MAYBE(dir, fsHome->tryOpenSubdir("archive"/runArchive)) {
        for(kj::StringPtr file : (*dir)->listNames()) {
            kj::FsNode::Metadata meta = (*dir)->lstat(kj::Path{file});

which means I can't move files in $ARCHIVE elsewhere without losing these links. And I want to move these files to cheaper storage.

So I suggest that Laminar save the list of artifacts right after a job finishes, leaving it up to the reverse proxy to figure out where to find laminar.example.com/archive/foo-job/10/debug.txt. What do you think?

ohwgiles commented 1 year ago

The original design supports moving archived artefacts to cheaper storage, but assumed this would be achieved by mounting or symlinking the archive directory appropriately. It's simpler to just dynamically iterate the folder, but iterating is more expensive, especially on slow storage or if there are many artefacts. I'm not opposed to your suggestion; I just wanted to check why mounting/symlinking would not work for you, since this proposal has some (admittedly low) added complexity versus the current situation.

KAction commented 1 year ago

If I want to archive artifacts on S3, mounting them so Laminar finds them means FUSE -- already extra complexity. Furthermore, scanning S3 to render a job page (the list of artifacts at the top) is both slow and costly.

Technically, I can keep empty files in /laminar/archive to inform Laminar about what artifacts are associated with the job, yet configure the reverse proxy to go to S3 instead, but that means using the filesystem as a database. Huge pain to back up. readdir(3) won't be happy.

ohwgiles commented 1 year ago

Fair enough

ohwgiles commented 1 year ago

@mitya57 are you still offering a PR for this? I'm happy with the justification

KAction commented 9 months ago

Sorry for the late response.

@mitya57 ended up with a Postgres-only fork: https://github.com/mitya57/laminar/tree/wip/postgres

That was necessary to speed things up by taking advantage of Postgres materialized views and other nice features. The patch to keep the artifact list in the database ended up tightly coupled to other changes, so I guess we can close the issue.