Open Malyuk-A opened 1 month ago
Absolutely agree that addressing this issue is critical for further feature development
I think a starting point can be defining a standard for:
{major}.{minor}.{patch} scheme can be a burden, depending on how often we change the APIs
/api/calculate/deploy/
failure reported by both cluster scheduler and system-manager components that manage the requests)
I think a way to proceed can be:
On my side, I can start reviewing the root components and apply the described proposal. Of course, this will require a bit of coordination with whoever does the same on the cluster-level side.
Any ideas, standardization proposals, or ways to approach the issue are more than welcome! 👍
Edit: testing
That is a fantastic comment, @TheDarkPyotr! To be fair, I think Giovanni & I were only talking about API versioning. Your comment covers a suite of necessary steps that include and go beyond versioning. I really appreciate your effort!
I totally agree with your assessment. I would use your comment as a list to generate further tickets instead of having everything in one place.
To add some more ideas to your list:
Python typing/type-hints would be massively helpful IMO for developers to figure out what is going on. (It is not great to read code where you do not know what exactly the input params are and what gets returned)
Regarding 2.i), I highly recommend using this instead of passing around the HTTP response codes as strings or ints. Combined with type hints, one can then instantly know what this "status" object is. (I am currently looking into service statuses, and oftentimes I do not know if this status object is an HTTP status, a service status, or some other status.)
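To make the two points above concrete, here is a minimal sketch. It assumes the recommendation refers to something like Python's standard-library `http.HTTPStatus` enum; `ServiceStatus` and `deploy_service` are hypothetical names for illustration and are not taken from the Oakestra codebase.

```python
from enum import Enum
from http import HTTPStatus
from typing import Tuple


class ServiceStatus(Enum):
    """Hypothetical lifecycle states of a service (illustrative only)."""

    REQUESTED = "requested"
    RUNNING = "running"
    FAILED = "failed"


def deploy_service(service_id: str) -> Tuple[HTTPStatus, ServiceStatus]:
    """Pretend to deploy a service and report both kinds of status.

    The return annotation spells out which "status" is which, so a reader
    does not have to guess whether a bare int or str is an HTTP code.
    """
    if not service_id:
        return HTTPStatus.BAD_REQUEST, ServiceStatus.FAILED
    return HTTPStatus.OK, ServiceStatus.RUNNING


http_status, service_status = deploy_service("my-service")
print(http_status.value, http_status.phrase)  # numeric code and reason phrase
print(service_status.name)
```

Because `HTTPStatus` members are also integers, existing comparisons against plain status codes should keep working while the code migrates.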
There are a lot of ideas and possible improvements here. The trick is not to get overwhelmed and do small steps of improvement one at a time. (Especially because most of the Oakestra devs are thesis students who can only put as much time into general codebase improvements as their circumstances allow.)
Thank you @Malyuk-A! 🙂 Totally agree that the proposal goes a bit too far beyond API versioning. I think that establishing a broader "approach" (one that is hardly ever implemented in its entirety right from conception) can clarify the path to follow for new future components (in the long run, obtaining a small-scale document similar to this).
Yours are really great suggestions! 🔥
Absolutely agree that time and resources are limited, so it's important to make these improvements manageable. Focusing on action-oriented points and starting simple, maybe we can consider using:
Yeah, I have become quite a big fan of Pydantic. I am using it in my FlOps extension, and it makes life a lot easier and clearer. E.g. I do not have a convoluted SLA parser setup or DB initialization process, etc. I simply take the received user data and try to instantiate the pydantic object I need (e.g. FLOpsProject.model_validate(request_data)), and later I automatically create my nested tables in MongoDB based on it. Everything works out of the box, and arbitrary/custom further checks can easily be added to the class via pydantic. So I truly understand what you are talking about.
Just migrating Oakestra to pydantic would already be a behemoth of a task. Let's mention this in today's (29.05.24) maintainers' meeting and decide with the team what action points we want to work on.
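For readers who have not used Pydantic, the pattern described above might look roughly like the sketch below. It assumes Pydantic v2 (where `model_validate` and `model_dump` exist); `ProjectRequest`, `TrainingConfig`, and their fields are hypothetical stand-ins, not the actual FLOpsProject model.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError, field_validator


class TrainingConfig(BaseModel):
    """Hypothetical nested section of the request."""

    rounds: int = Field(gt=0)
    min_clients: int = Field(default=1, gt=0)


class ProjectRequest(BaseModel):
    """Hypothetical stand-in for a model such as FLOpsProject."""

    name: str
    tags: List[str] = []
    training: TrainingConfig

    @field_validator("name")
    @classmethod
    def name_must_not_be_blank(cls, value: str) -> str:
        # Example of a custom check added directly to the model class.
        if not value.strip():
            raise ValueError("name must not be blank")
        return value


# Raw user data as it might arrive in a request body.
request_data = {"name": "demo", "training": {"rounds": "3"}}

# Validation and type coercion happen in one step, nested models included.
project = ProjectRequest.model_validate(request_data)

# Plain nested dict, e.g. as a starting point for a MongoDB document.
document = project.model_dump()

try:
    ProjectRequest.model_validate({"name": " ", "training": {"rounds": 0}})
except ValidationError as error:
    print(error.error_count(), "validation errors")
```

The appeal is that parsing, validation, and the data shape live in a single class definition instead of a separate parser.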
Short
The Oakestra APIs should have strong versioning to avoid unexpected breaking changes and to make them easier to extend and modify.
Proposal
Rather self-explanatory. We need to add/make sure to have comprehensive & documented rules for using versioning and enforce their usage in our codebases.
Context: @giobart and I discussed the current state of our codebase and he mentioned that this is a major thing we should work on. @melkodary @TheDarkPyotr
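As one possible starting point for such rules, here is a minimal sketch that assumes a `{major}.{minor}.{patch}` scheme as mentioned in the discussion. The compatibility rule (same major version, server minor at least the client's minor) is an assumption for illustration, and `ApiVersion` and `is_compatible` are hypothetical names, not existing Oakestra code.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ApiVersion:
    """A version in the form major.minor.patch, comparable field by field."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "ApiVersion":
        """Parse a string such as "1.4.2"; reject anything else."""
        parts = text.strip().split(".")
        if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
            raise ValueError(f"expected 'major.minor.patch', got {text!r}")
        major, minor, patch = (int(p) for p in parts)
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def is_compatible(server: ApiVersion, client: ApiVersion) -> bool:
    """Assumed rule: a major bump breaks clients, a minor bump only adds features."""
    return server.major == client.major and server.minor >= client.minor


server = ApiVersion.parse("1.4.2")
print(is_compatible(server, ApiVersion.parse("1.3.0")))
print(is_compatible(server, ApiVersion.parse("2.0.0")))
```

Whatever rule the team settles on, encoding it once in a small shared helper like this would fit the "figure out once, apply everywhere" idea.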
Impact
Every component that exposes APIs, Documentation
Development time
Depends on how we split the effort. Including documentation, initial discussions on best practices to follow, etc., this can take a couple of weeks. The implementation itself is rather quick. (Figure out once, apply everywhere)
Status
Looking for discussion/feedback, best practices, and how to split the workload.
Checklist