R3BRootGroup / R3BRoot

Framework for Simulations and Data Analysis of R3B Experiment
https://github.com/R3BRootGroup/R3BRoot/wiki
GNU General Public License v3.0
18 stars 104 forks source link

Policy discussion: on pull requests, merging and reviews #891

Open klenze opened 1 year ago

klenze commented 1 year ago

There were some discussions in the phase zero meeting today regarding the feasibility of importing features from PRs currently awaiting acceptance.

There was an almost consensus (with @YanzhaoW vocally dissenting) that expecting the users to import stuff from multiple feature branches, then implement their own feature on top of that, and rebase to dev when these features are merged is not feasible for most users.

What I would propose instead is a three step process.

First comes a speedy review which focuses on the following points:

If the PR passes these questions, it is tentatively merged (with the source files clearly marked as preliminary) and a full review is scheduled. Anyone can use the new features, but should be aware that the API might still change.

Then comes the second in depth review.

The original author will work with the reviewers to bring the implementation up to our standard. If the author can't or won't make changes, it is up to other people interested in the feature to fix things. If authors and reviewers can't agree, maintainers will decide.

If change requests are not addressed by anyone, the feature is removed from dev.

All preexisting code should also be scheduled for review. I am sure @YanzhaoW would have some comments on the task "template" files, for example.

--

Next proposal:

Furthermore, each class should have at least one sponsor. That is someone who needs the feature, understands the code (or at least can spend the time to understand it when required), will review relevant PRs and fix stuff when APIs change.

If there is no sponsor for a class it gets removed. If someone needs that feature again ten years down the line, let them copy it from the version history.

YanzhaoW commented 1 year ago

Very nice. I also intended to open an issue about what we have discussed in the meeting. But now you have done that for me. Thanks.

But let me first list few facts everybody agrees upon:

I generally agree with your proposals. But I have some questions about the detail:

First comes a speedy review which focuses on the following points:

Here is my quesitons about this:

  1. How long can be called "speedy"?
  2. What if the reviewer is not available at that moment?
  3. What if the person who made the PR is not available after?
  4. What if the time needed to give or answer those points is considered "too long"?

There could many factors that drag this process "too long". If this happens, some people begin to ask "The code is woking now and why has it not already been merged to dev branch?" Then everything boils down into this single question: Should we merge a PR into our dev branch if the code in the PR is not a clear improvement, the class design is not reasonable, it has obvious bug inside or it has a unreasonable interface design but the PR has been made long time ago and the code is working at the moment? For me personally, the answer is clearly no.

If change requests are not addressed by anyone, the feature is removed from dev.

I guess you meant "revert" the feature instead of "removed". But what if there are other features already being merged that are depending on the feature to be reverted? Let's say if we made a commit A, and thanks to our speedy merging policy, commits B and C, which are depending on commit A are also merged to the dev branch. Now someone found an issue about commit A and nobody takes care about that. In this case, apart from reverting A, do we need to revert B and C as well? Sometimes, the dependencies among commits are not trivial and this could be a huge task for our code maintainer.

klenze commented 1 year ago

How long can be called "speedy"?

Hm... a week perhaps?

What if the reviewer is not available at that moment?

I don't think that for the speedy review the reviewer needs deep understanding of the detector system. Any experienced software designer should be able to check these points. (Also, one could alert all of the sponsors of the code in question.)

What if the person who made the PR is not available after?

Then it falls to the people who need that feature to maintain it. If nobody wants to do that, then by revealed preference nobody needs the feature.

What if the time needed to give or answer those points is considered "too long"?

Ideally, we would have more than one paid reviewer. In that case a review within a week might be doable. (If all reviewers are on vacation or there is a beam time on it might take longer.)

I guess you meant "revert" the feature instead of "removed".

Yes.

If reverting a feature A break feature B, then the sponsors of B can either adopt A or remove B's dependence on A. If they do neither, then B gets reverted for lack of sponsors.

The reason for my "unsponsored code gets removed" proposal is that with our present policy of "never deleting anything", we accumulate lots of dead code. This is a burden, especially if one tries to fix clang-tidy issues.

Meta: As long as only @YanzhaoW and me are discussing here, we could as well argue about Latin American fiscal policy for all the impact we will have. At the moment, PRs will get merged when they get merged, and code reviews are a strictly optional thing which happen when someone volunteers to review code. Once code is in, the maintainers will make sure that it stays compilable (e.g. by changing the case of the emum argument to LOG).

hapol commented 1 year ago

Latin American fiscal policy is quite boring, I can say. Latin American politics could be more interesting. No strong opinion on my side on the topic under discussion. I would say that we have already opposite views and the discussion is probably going to thermalise in a reasonable suggestion for a policy. To my understanding, once agreement was set (corrections done, reasonable design, clear improvement, ...), the final merge on dev was done on a reasonable time by the release manager (Dmytro and me formerly, Jose Luis now), but a recommendation on the whole review procedure is very welcome.

YanzhaoW commented 1 year ago

Hm... a week perhaps?

Minimally a week is fine. But should we have different time requirements for different types of PR? Some PRs are just bug fixes, which only have few lines of changes. Some PRs are new features, which could have new classes or old classes rewritten. Some PRs could have a huge update, which contains thousands of lines of changes. Even for the PRs relating to new features. There could be different level of significances. Some new features are added to the detector specific directory, which doesn't impact on the people using other detector directories. Some new features are added to the common directories like r3bbase, which is shared by everyone. Should we impose one week on all those PRs?

I don't think that for the speedy review the reviewer needs deep understanding of the detector system. Any experienced software designer should be able to check these points.

Reviewing won't take much time. But if the submitter doesn't have the experience/knowledge to change code to the way required. Then it would be necessary for the reviewer to help submitter to do that. In this case, that could be time consuming for the reviewer.

Then it falls to the people who need that feature to maintain it. If nobody wants to do that, then by revealed preference nobody needs the feature.

In this case, we could agree the PR stays unmerged until the necessary requirements are attended by the submitter.

If reverting a feature A break feature B, then the sponsors of B can either adopt A or remove B's dependence on A. If they do neither, then B gets reverted for lack of sponsors.

What if feature B and feature C both relies on feature A. The sponsor of feature B doesn't like feature A (e.g. found a problem) and he wishes to revert feature A and B. But on the other hand, the sponsor of feature C, which is also depending on feature A, doesn't want to be reverted. In such case, should we prefer the reversion or not?

But generally I think the idea of sponsor is kind of nice. The main problem, of course, is still lack of manpower and required skills. Of course, skills can be learned and trained. But I'm afraid we are also lack of people who are even willing to learn what is good and what is bad in software design.

klenze commented 1 year ago

@hapol: okay, I will try to let my takes approach room temperature :-)

But should we have different time requirements for different types of PR? Some PRs are just bug fixes, which only have few lines of changes.

Sorry, I was talking specifically about larger PRs. A simple bug fix (initialize a previously uninitialized variable, add an if (p==nullptr) LOG(fatal) ... and the like) or a revert can be merged as soon as CI has passed (but should still ideally be reviewed post-merge).

In theory, a bug fix can also be complicated (e.g. if the bug is inherent in the design), but this should be rare.

I don't directly follow your story. Is it like this:

B depends on A. C depends on A. A gets updated to a version A'. Sponsor of B prefers A, sponsor of C prefers A'.

<strikethrough> Easy. Both sponsors shall meet at the first Monday at least three days after their disagreement at noon just south of the Targethalle, pick up A4 printouts of R3BRoot source code and try to physically defeat their opponent with them. The winner gets to decide if the code is reverted or kept. </strikethrough>

Seriously, I don't think there is a good way to codify each eventuality. I think it is fine to say "the maintainers will decide what happens", which is what will basically happen in any case.

More abstractly, I think that there are different possible failure modes, e.g.

Nobody would suggest leaving the distribution of 230V AC power in our cave to some PhD students with wires and soldering irons. While the stakes are thankfully lower for our kind of software, the fact remains that writing good C++ is not something one can pick up in a week or two.

(Personally, I think PyRoot is probably the least painful way to interact with ROOT. In terms of safety, Python is more like 12V DC where C++ is like mains power. While it makes sense to keep a core of C++ classes to do the heavy lifting, I would much prefer to move e.g. the calibration (as in "finding the correct parameters") and individual analyses to PyRoot so that the average student can get by with writing ok python rather than bad C++.)

YanzhaoW commented 1 year ago

Seriously, I don't think there is a good way to codify each eventuality. I think it is fine to say "the maintainers will decide what happens", which is what will basically happen in any case.

I agree.

(Personally, I think PyRoot is probably the least painful way to interact with ROOT. In terms of safety, Python is more like 12V DC where C++ is like mains power. While it makes sense to keep a core of C++ classes to do the heavy lifting, I would much prefer to move e.g. the calibration (as in "finding the correct parameters") and individual analyses to PyRoot so that the average student can get by with writing ok python rather than bad C++.)

Yeah, this is also another serious issue. Normal users shouldn't interact with C++ directly to use R3BRoot. Rather we need a scripting language that doesn't have "complicated things" like pointers or references. It should be like "click this button" style. That being said, I don't think PyRoot is a good option. ROOT team is notorious to push breaking changes to each new version. Once we upgrade our tool chains, everything breaks! Jan Mayer did use pyroot for the NeuLAND event reconstruction years ago. And guess what? It doesn't work with 6.26 version because ROOT changed library name from "libPyRoot" to "libROOTPython". A better option could be just using C++ bindings in python (either pybind or boost.python). Or we could also just stay in C++ and let normal users just edit YAML configuration file. Or we could do similar things like in game industry, which uses lua as the configuration langauge.

klenze commented 1 year ago

I think there are different usage cases outside the numbercrunching core:

My biggest pet peeve with pyroot is actually that it deletes referenced objects when all references go out of scope, which is a fine thing to do in python and a terrible thing in ROOT, where deleting e.g. TFiles, TPads, or TH1s will have observable side effects in the shadowy structures maintained by ROOT. (Also, pyroot takes advantage of the TObject reflection system. In a post-pyroot world, we could get rid of the ClassDefOverride and ClassImp statements, and generally stop inheriting from TObject for anything except objects which we want to serialize to *.root. Also not a high priority for me, though.)

I have used ctypes (which seems horribly unsuitable), but not pybind. If one can just tell pybind: "Process these headers. Now process these shared objects and match the symbols." then this might be doable, but will probably take some time.

jose-luis-rs commented 1 year ago

Hm... a week perhaps?

Minimally a week is fine. But should we have different time requirements for different types of PR? Some PRs are just bug fixes, which only have few lines of changes. Some PRs are new features, which could have new classes or old classes rewritten. Some PRs could have a huge update, which contains thousands of lines of changes. Even for the PRs relating to new features. There could be different level of significances. Some new features are added to the detector specific directory, which doesn't impact on the people using other detector directories. Some new features are added to the common directories like r3bbase, which is shared by everyone. Should we impose one week on all those PRs?

1 week is fine for now, but the beam time is approaching and the code that is needed for the beam time has to be available as soon as possible.

I don't think that for the speedy review the reviewer needs deep understanding of the detector system. Any experienced software designer should be able to check these points.

Reviewing won't take much time. But if the submitter doesn't have the experience/knowledge to change code to the way required. Then it would be necessary for the reviewer to help submitter to do that. In this case, that could be time consuming for the reviewer.

Then it falls to the people who need that feature to maintain it. If nobody wants to do that, then by revealed preference nobody needs the feature.

In this case, we could agree the PR stays unmerged until the necessary requirements are attended by the submitter.

If reverting a feature A break feature B, then the sponsors of B can either adopt A or remove B's dependence on A. If they do neither, then B gets reverted for lack of sponsors.

What if feature B and feature C both relies on feature A. The sponsor of feature B doesn't like feature A (e.g. found a problem) and he wishes to revert feature A and B. But on the other hand, the sponsor of feature C, which is also depending on feature A, doesn't want to be reverted. In such case, should we prefer the reversion or not?

But generally I think the idea of sponsor is kind of nice. The main problem, of course, is still lack of manpower and required skills. Of course, skills can be learned and trained. But I'm afraid we are also lack of people who are even willing to learn what is good and what is bad in software design.

akelic commented 1 year ago

Dear @YanzhaoW , with all due respect, and I hope you know that I have really great respect for all your knowledge and your work, but I am really unhappy with something you wrote.

I am citing you here: " But I'm afraid we are also lack of people who are even willing to learn what is good and what is bad in software design."

Well, I can easily turn your words around and say that I am afraid that we lack programming experts who are willing to sit down and learn how each and every detector, each and every electronics readout is really working. But of course, programming experts will never ever do it. What you forget is that R3BRoot is existing thanks to all us non-expert developper, who according to you are not willing to learn. Without us, R3BRoot would be just some analysis tool to analyse Califa and NeuLand data and nothing more.

I didn't want to get involved into this discussion. According to German law, I have some 15 years more to work until my retirement. And if at some point during this time I finally get totally fed up with the road the R3BRoot is presently on, well I will just stop with analysis and do something else. But, what you wrote made me really angry. People are developping detectors, testing them, working on electronics, adjusting or creating new analysis tasks, performing experiments, analysing data. And one should not forget, that many of these people are students with limited time. So, we - non-expert developpers, are doing our best to bring the R3B project forward, but because we are not on higher than high progarmming level you judge proper, we get critized???? If you are not happy with the state of the code, sit down, learn how every component of the setup is working, and make a code which is not only fulfilling all your programming design wishes, but will also (and more importantly) be physically meaningfull and correct. And then we, other non-willing to learn developpers, will become users of the code, and not insult you experts any more with our lack of knowledge of high-level programming.

I like to stay optimistic, so I will hope that what you wrote was just an unfortunate and wrong formulation of what you really wanted to say. Otherwise, I would be extremly disappointed.

jose-luis-rs commented 1 year ago

Dear @akelic

I have updated your message with the right user name at the beginning because the sentence was written by @YanzhaoW.

Best wishes

akelic commented 1 year ago

Dear @jose-luis-rs ,

thank you for the update! Sorry that I misinterpreted the message as coming from you!

YanzhaoW commented 1 year ago

Dear @akelic,

I'm very sorry that my words were offending to you. The sentence I wrote:

" But I'm afraid we are also lack of people who are even willing to learn what is good and what is bad in software design."

certainly doesn't mean I believe this is the real case. Of course, there are a lot of great people working in our collaboration. What I actually want to say is I'm not sure whether we have enough of people who, in their limited time, could see the importance and necessity of a good software design, are willing to learn about it and try to apply it in the real production.

The reason why I have this kind of doubt is because during a discussion in our meeting, there was an argument saying "We are in a physics collaboration and our primary is to generate physics." I don't know how many people have similar idea. But if that's what most of people actually believe (at least nobody objected during the meeting), then obviously, in case of the limited time, we don't need to care about the quality of software and what is good or bad because it's secondary. I guess same goes for hardware. I actually saw it in a physics institute where there aren't any people who are specifically working for the DAQ system. Someone implemented a data filtering in the firmware level which can substantially reduce the junk data. But nobody uses it. They either don't know or just don't care because they are physics students who have limited time and they think it's totally fine if they don't know it.

The second reason is because most of the issues about the time of PR can be solved by Git. I said during the meeting "Please don't urge the maintainer or submitter to merge the PR. If you want to use the new feature, please don't wait, use git fetch and git rebase and you can apply it now!" The response is "no, we are physicist and using git is not our expertise." (And learning git, I assume, is also secondary). These are just two or three git commands. If people aren't even willing to learn this, I seriously doubt they are willing to learn more complicated things, like software design and what is the best practice.

I am more into the bottom-up strategy. For me, everything along the toolchains is equally important, not just the result in the end. So to answer that question, I'm absolutely willing to apply the best practice if I'm dealing with hardware or detectors. Tell me what is the best way to connect a wire, to manage the cable or mount a card, etc, I will try my best to do it without any second thought. If I don't have time to do that, that's my problem.

But, what you wrote made me really angry. People are developping detectors, testing them, working on electronics, adjusting or creating new analysis tasks, performing experiments, analysing data. And one should not forget, that many of these people are students with limited time. So, we - non-expert developpers, are doing our best to bring the R3B project forward, but because we are not on higher than high progarmming level you judge proper, we get critized????

Please don't get angry because there was absolutely no criticism. I was just expressing my doubts and little frustration. What made me a little bit frustrated is that I explicitly said in the meeting"Hey, if you don't know how to solve this, let's have a meeting again and I can help you fix that". The answer is "No, it's secondary and just let it pass."

So this is my lengthy response, which I hope won't offend some people further. If so, please forgive me since that's certainly not my attention. In the end, I need to clarify that I'm NOT criticising the people with the idea "We are in a physics collaboration and our primary is to generate physics". This is just another way, different than what I would like. Maybe this is better or worse, which is open to another discussion other time.