Open soni801 opened 4 months ago
@PancakeTAS would you be willing to look at this?
What would a healthcheck do other than test if the epochtal process is running? Epochtal doesn't even have a state machine at its core or anything of that sort so how can it not be running?
Well, I don't know the specifics of all the parts of epochtal that need to be working at the same time, but I'd imagine running the following checks:
It also occurred to me now that this could be integrated with #54, but I really am not going to fuck with that. That's gonna be a huge MAYBE sometime in the future.
I fully support the idea of a health check, I've wanted to tackle that for a while. The way I see it, you could have a script kind of "simulate" running through the concludeWeek and releaseMap routines to check if there are any potential issues to be encountered. Same could go for run submission - just a script that checks if everything required for submission is working.
A quick and dirty (but accurate!) implementation of this could be running said routines on temporary contexts which mimic the currently active epochtal context and seeing if that runs into any issues.
I like that approach! However, I'm a bit worried that it'll be more resource intensive than it's worth - on a project that gets as much traffic throughout the entire week as epochtal does, I'd recommend running a health check at least every 10-30 minutes. This way we can get a somewhat immediate notification if anything goes wrong.
For your idea, maybe adding an optional dry-run parameter to some/all utils is a good idea. If this parameter is true, the util doesn't actually modify anything but still reports whether it succeeded or not. I think that'd be a really clean approach - then we should also be able to just call the routine directly without messing around with new contexts(?)
You can already do a dry run! Most if not all utils just won't write or read files if you don't have the respective context.file. entry. They'll just write to the object and report success.
This was done for this very reason: so that temporary contexts can be created and handled with the standard utils.
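To make the temporary-context idea concrete, here's a rough sketch of what a dry run could look like. The Context shape, makeDryRunContext and dryRun are all made up for illustration and epochtal's real context layout may well differ; the only thing taken from this thread is that utils skip file I/O when the matching context.file. entry is missing.

```typescript
// Hypothetical context shape, for illustration only.
interface Context {
  // Paths to backing files. Assumption from this thread: utils skip
  // reading/writing when the entry they need is absent.
  file: Record<string, string | undefined>;
  // In-memory state that the utils operate on.
  data: Record<string, unknown>;
}

// Copy the live data into a throwaway context that has no file entries,
// so running a routine on it cannot touch anything on disk.
// The JSON round-trip assumes the data is plain JSON-serializable.
function makeDryRunContext(live: Context): Context {
  return {
    file: {},
    data: JSON.parse(JSON.stringify(live.data)) as Record<string, unknown>,
  };
}

type Routine = (context: Context) => Promise<unknown>;

interface DryRunResult {
  name: string;
  ok: boolean;
  error?: string;
}

// Run one routine against a temporary copy of the live context and
// report whether it threw, without modifying the live context.
async function dryRun(name: string, routine: Routine, live: Context): Promise<DryRunResult> {
  try {
    await routine(makeDryRunContext(live));
    return { name, ok: true };
  } catch (err) {
    return { name, ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A health check could then call dryRun once per routine (concludeWeek, releaseMap, and so on) with the active context and collect the results. How those routines are actually invoked is not shown here, since that depends on epochtal's internals.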
Having a healthcheck is not strictly necessary, but it can be a nice thing to have for uptime monitoring and similar stuff. The idea is that we can call a command, and it reports whether the running epochtal instance is in a working state or not.
I'd really like to have this for the docker container (#43), which means we need to write it as a command that can be run from the terminal (for example with bun run), and reports the health state of the container through the process exit code:
Here's the docker reference to this. Don't worry about the docker implementation, I'll do that, but it'd be nice if someone could "quickly" throw together a small script that reports the health as specified.
Edit: I'll do this too if no one else wants to, but I'm notoriously slow at figuring out the best way to implement stuff like this.
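For reference, a minimal sketch of the exit-code side of this, assuming nothing about which checks epochtal actually needs. Docker's HEALTHCHECK treats exit code 0 as healthy and 1 as unhealthy (2 is reserved), so the script only has to map "did any check throw" onto those two values. Check, runChecks and the placeholder check are hypothetical names, not existing epochtal code.

```typescript
// A single named check. It passes if run() resolves and fails if it throws.
interface Check {
  name: string;
  run: () => Promise<void>;
}

// Run every check in order, log the outcome of each one, and return the
// exit code Docker's HEALTHCHECK expects: 0 = healthy, 1 = unhealthy.
async function runChecks(checks: Check[]): Promise<number> {
  let failed = 0;
  for (const check of checks) {
    try {
      await check.run();
      console.log(`ok   ${check.name}`);
    } catch (err) {
      failed++;
      const reason = err instanceof Error ? err.message : String(err);
      console.error(`FAIL ${check.name}: ${reason}`);
    }
  }
  return failed === 0 ? 0 : 1;
}

// Placeholder list. The real checks (dry runs of routines, submission
// prerequisites, ...) would go here.
const placeholderChecks: Check[] = [
  { name: "placeholder", run: async () => {} },
];

// Entry point when used as a standalone script (left as a comment so the
// sketch has no side effects when imported):
// runChecks(placeholderChecks).then((code) => process.exit(code));
```

With the entry point enabled and the file saved under some name (say healthcheck.ts, also an assumption), it could be invoked with bun run, and the container's health check would only need to look at the exit status.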