Open luator opened 11 months ago
The cli-part is also covered by #32.
By Felix Widmaier on 2023-11-27T11:16:23 (imported from GitLab)
Also somewhat related to this: It always bothered me a bit that the repo is called "cluster_utils" (and we usually refer to it by that name) but the package is just named "cluster". Would it be okay to rename the package to "cluster_utils" (but maybe keep the command name as cluster
in #32)? This would be a breaking change but one that should be relatively easy to handle (imports would need to be renamed).
By Felix Widmaier on 2023-11-28T10:37:50 (imported from GitLab)
My suggestions for sub-packages, if we do a larger refactor:
core
(maybe also scheduling
): core functionalities regarding job scheduling. job
, job_manager
, potentially optimizers
client
(maybe a different name): the API that jobs are using. settings
(which should be renamed as it is very misguiding), submission_state
.reporting
: everything regarding reports. report
, latex_utils
, data_analysis
utils
: the rest. git_utils
, user_interaction
, progress_bars
, utils
constants
: don't know where to put this. Maybe it can stay top level, maybe it can be removed.It may not be necessary to have that many sub-packages, but generally I like the additional organization that comes with it. Also, maybe this is not the best way of organizing it in terms of the dependency structure (e.g. core
probably should not depend on too many other packages, but currently it would). This might also give us further hints towards where the current module structure is not optimal, and how we could better encapsulate things. What do you think?
Besides, I think parallel_executor
can be removed, as it is never used. But we should ask first.
By Maximilian Seitzer on 2023-11-28T11:32:10 (imported from GitLab)
Yeah, I also always found this choice strange. Maybe we can rename it and keep an additional package cluster
which forwards the imports appropriately for a deprecation period.
By Maximilian Seitzer on 2023-11-28T10:37:50 (imported from GitLab)
Sounds good
By Felix Widmaier on 2023-11-28T11:21:38 (imported from GitLab)
I don't really have a complete enough overview of the whole package yet to fully judge but it looks reasonable to me. I'd suggest that we take it as a guideline and adjust if we see that some parts don't work out like that.
Regarding constants
: I think it can stay but should only contain constants that are actually used in multiple places. For example the ones for the condor submission scripts could rather be moved to the condor module (as that's the only place where it's used).
By Felix Widmaier on 2023-11-28T11:32:10 (imported from GitLab)
With #83 in mind, I would suggest the following base structure:
client
: API used in the job scripts (already done).server
: Everything only needed for running the server process.base
(or maybe a better name): Everything that is used by both client and server.Inside those, we can potentially have further sub-packages based on the discussion above.
The standard dependencies would then cover only packages needed by base
and client
, all additional dependencies of server
would go to an optional dependency group.
Where would the CLI entry points for the user go? Into scripts/
, cli/
, server/cli
? I think I would vote for cli/
.
I think they should better go to server/cli
, as they are part of the server stuff. However, I would add proper entry points, so that they can be called directly (something like cluster_utils_grid_search
/cluster_utils_hp_optimization
). Or go a step further and implement the suggestion of #32.
Add some structure to the source folder of the package:
cluster_system
scripts
tocli
and move all executable files there