artus-analysis / grid-control

github mirror for the grid-control job submission tool
https://ekptrac.physik.uni-karlsruhe.de/trac/grid-control
0 stars 3 forks source link

New developments in grid-control #1

Closed rfriese closed 8 years ago

rfriese commented 8 years ago

Dear all,

there have been two rather large commits in grid-control:

a543b12 29d5c4e

I tried merging them into our master, but I got merge conflicts. Can maybe someone try to pick them up? On the other side, if our developments are useful for all grid-control users it might be good to do a PR?

Cheers, Raphael

thomas-mueller commented 8 years ago

Hallo Raphael, Georg,

@claria: Schoen, dass du weisst, wer da zustaendig ist. Willst du nicht Raphaels Merge-Konflikte loesen, wenn er schon so freundlich darum bittet? ;-)

@rfriese: What is the reason, why we need this?

In most cases we have problems with updates in GC, because Fred does fundamental changes in rather short intervals. For Artus and for our group we aim for a stable version that supports all our needs. (And I do not know of any use case, that does not work with the current HEAD.) Additionally, we needed some developments in GC, that Fred did not want to see in "his" official version. This usually appears, when you ask Fred for a solution and he says, that he plans a major update in the next 10 years, where everything will work much more elegantly. Then he proposes a "short term" solution, that we implement in our fork.

Cheers, Thomas

thomas-mueller commented 8 years ago

This is such an example: https://github.com/artus-analysis/grid-control/commit/9c0e71a7e4fa5f56790b8bcbe4107a7276ae7fd7.

claria commented 8 years ago

Hi Thomas,

this is/will be a general problem, if we start forking projects and adding our quick&dirty hacks for problems we find instead of fixing them in upstream. I understand that it is hard to get 'your' quick solution merged into the upstream grid-control, but it will get more and more tedious to keep up to date when we add more developments in 'our' fork instead of contributing to the official grid-control repo. And for sure we want to profit from the latest bugfixes/features in grid-control, so just not-merging seems to be a bad move.

thomas-mueller commented 8 years ago

I agree, but this would also require a change of the attitute, GC is treated. I do not ask every month for accepting my pull requests or whether the future has already begun and we can work on an elegant solution. If GC ist kept so closed as it is now (In contrast to for example Artus), I see no other way than forks.

And for an update there must be good reasons and is has to be proven first, that our applications all still work.

ArturAkh commented 8 years ago

The last time I merged master into this fork, the only difference between the two repositories was exactly the commit Thomas has linked two messages before. On my understanding, this commit is only needed for artusMergeOutputsWithGC.py to work properly. Are there any other use-cases for this commit?

By the way, the merge conflict is only two lines (git pull ---> fail; look in gitk):

index 86b9956,9ad2fa2..0000000 @@@ -24,8 -26,7 +26,12 @@@ from grid_control.utils.data_structure from grid_control.utils.file_objects import SafeFile, VirtualFile from grid_control.utils.gc_itertools import ichain, lchain from hpfwk import AbstractError, NestedException ++<<<<<<< HEAD +from python_compat import imap, izip, lmap, set, sorted +import shlex ++======= +from python_compat import identity, imap, izip, lmap, set, sorted ++>>>>>>> 29d5c4ed68822351868766e884fd0d55c7644add

class BackendError(NestedException): pass

Since this is the only difference, I guess there will be no problems, if the official version is tested in all use-cases and works there. Currently, I'm not using grid-control, so I would ask the people using it to test their jobs with the official version and give an ok here. Then the "merge conflict" can be solved very fast.

After the merge conflict is solved, one has only to test, whether artusMergeOutputsWithGC.py works properly.

Cheers,

Artur

FredStober commented 8 years ago

Hi, some points:

Concerning closed development:

About these commits - they are indeed quite large.

The first commit is yet another part of the long-announced refactoring of the backends, this is mostly limited to the backends, but also touches some part of the job db and job management, since the backend API changed slightly. Since the API changed in a quite radical way, there was no way to split the commit into smaller parts than it is - however there WILL be a backend refactoring part III (and maybe IV), since the submission and output retrieval is yet untouched.

I tried to disentangle the 2nd into ~20 commits - but it was very hard to keep the different changes separate, while keeping the whole stuff working, so I collapsed them, since these partial commits couldn't be used anyway and it just was too much work. They mostly contain changes (even refactoring and rewrites) with the goal of pushing the testsuite coverage for all files above 95% (which can be a challenge - see https://codecov.io/gh/grid-control/grid-control/tree/458b6fa9661dfb97b84949cf78e6747367042718). On the way many small issues have appeared and got fixed.

The reason these two commits are needed by Raphael is because they fix some problem with report.py that was introduced by some recent changes. Since report.py was not yet in the .travis.yml file this didn't get flagged by Travis CI or landscape.io (for some reason). So it would probably help if you see something missing in .travis.yml (or in the stress test config files docs/examples/ExampleS*.conf) to add a line there.

rfriese commented 8 years ago

artusMergeOutputs.py now has a -b option. I did only test it on the NAF.

With that, every Artus User is free to switch to the latest GC version.

rfriese commented 8 years ago

Sorry, one comment: This only works with this patch:

diff --git a/packages/grid_control/backends/wms_gridengine.py b/packages/grid_control/backends/wms_gridengine.py
index afaa6b5..880f616 100644
--- a/packages/grid_control/backends/wms_gridengine.py
+++ b/packages/grid_control/backends/wms_gridengine.py
@@ -136,7 +136,9 @@ class GridEngine(PBSGECommon):
            fmt = lambda wmsIDs: [str.join(',', wmsIDs)], unknownID = ['Unknown Job Id'])
        PBSGECommon.__init__(self, config, name,
            cancelExecutor = cancelExecutor,
-           checkExecutor = CheckJobsMissingState(config, GridEngine_CheckJobs(config)))
+           checkExecutor = CheckJobsMissingState(config, GridEngine_CheckJobs(config)),
+           nodesFinder = GridEngine_Discover_Nodes(config),
+           queuesFinder = GridEngine_Discover_Queues(config))
        self._project = config.get('project name', '', onChange = None)
        self._configExec = utils.resolveInstallPath('qconf')

This will come soon in grid-control

FredStober commented 8 years ago

It is fixed now in 69c7a898cba9e9773de1b9051d1f9cba90d30dee (r1937).