R3BRootGroup / R3BRoot

Framework for Simulations and Data Analysis of R3B Experiment
https://github.com/R3BRootGroup/R3BRoot/wiki
GNU General Public License v3.0

Introducing R3Brc #800

Open klenze opened 1 year ago

klenze commented 1 year ago

Amore et studio elucidandae veritatis haec subscripta disputabuntur

When in the Course of human events, it becomes necessary for one

"Evaporative Cooling of Group Beliefs" describes how group consensus is shifted and reinforced when dissenting members leave

"Philipp is forking R3BRoot." -- "Who's Philipp?"

The backstory:

On 2023-02-10, @jose-luis-rs merged #771 (previously #676, #679) which is a rewrite of the CALIFA Clustering, claiming "there are many collaborators waiting for this new version".

Previously, two people had commented on that PR on github.

@ryotani, who in the end concluded

Basically, I have no objection to this PR.

@klenze (that is me), who wrote

Instead, your code is a complete rewrite which adds significant complexity (e.g. regarding loop invariants) without providing a comparable benefit in performance (beyond the start-cluster threshold, which could be added to my code in two lines). In fact, I suspect it to be slower (after adding the threshold break to mine, again). I have not yet heard a convincing argument why we should think about selecting one energy measurement out of the possible four (two gains, amplitude vs ToT) at the same time we are in the process of clustering, instead of picking the best energy for each crystal beforehand. [...] I fail to see the improvement of your version of Exec over mine.

Seeing how the original code was still not quite optimal from a readability point of view, I even wrote a counterproposal. (It should be noted that the interfaces of all three versions are mutually incompatible to some degree, so there was an incentive to minimize the number of version switches to avoid annoying users.)

So it would appear that the conclusions of the reviews are a lukewarm endorsement and one rejection.

If a maintainer decides to merge in that situation, they had better be sure it is the right decision.

Does the new clustering work?

Out of spite, I have since done some testing of the newly blessed official clustering.

Contrary to my expectations, the performance is not terrible. Goes to show how hard it is to judge real life performance without measurement. Might be different if we ever have really high multiplicities.

Yesterday I did some unit testing on the clustering method, trying to find subtle biases. Initially I failed because another bug caused the hits from both ranges to be added for some crystals. After adding more epicycles to fix that, I can confirm that the subtly biased behavior is present (although not exactly where I searched for it). So the code smell I picked up initially was correct, and it turns out that long, complex code can hide bugs. I am Jack's complete lack of surprise. (Don't worry, I am sure they can be fixed; to paraphrase RFC 1925, you can make an anvil fly if you add a sufficient amount of helium balloons.)

Introducing R3Brc

I am not really angry at @gabrigarjim for making the PR. This is why we have the PR mechanism in place, after all.

And if the balance of reviews was different, if say @bl0x, @inkdot7 or @YanzhaoW had enthusiastically endorsed the PR, I would not blame our maintainers for merging it either. Unfortunately, that is not the case. I am not privy to the discussions of the maintainers and can not say if this was the decision of @jose-luis-rs alone or not, and I don't really care either way. The gist is that I have no confidence in the technical judgement of our maintainers.

I have spent a lot of time arguing against #771. In the past months, I have also spent a lot of time battling the forces of entropy in other areas of R3BRoot, lobbying for dynamic_cast (with success -- which resulted in two instances of invalid casts getting fixed) and trying to teach people to avoid undefined behavior. I spent last weekend writing my counterproposal for R3BCalifaCrystalCal2Cluster. I spent half of this weekend testing the new implementation. Why should I go on reviewing PRs if there is no correlation between reviews and merges?

R3Brc is a hard fork of R3BRoot, as permitted by Section 5 of the GNU GPL 3. For most detectors, I will aim to track changes in R3BRoot. For detectors I care about (mostly CALIFA), I plan to successively rewrite the parts I find most distasteful. This includes calibration parameter storage, histogramming and calibration generation. But first I will have to finish filing for unemployment benefits and write up my thesis, so that may take a while.

You can add support for my repository by running

  cd R3BRoot # source dir
  git remote add R3Brc https://github.com/R3BRCDevs/R3Brc
  git fetch R3Brc
  git checkout -b r3brc R3Brc/r3brc

In the future, I will not offer advice on working with mainline R3BRoot. If you want me to review your PR, make a PR at R3Brc. It seems that my common-cluster-compatible wrapper was recently transplanted into dev. I have no problem with that; after all, rms teaches us not to restrict the freedom to run code even if it is for a purpose we do not approve of, such as running a broken clustering algorithm. (I would like to point out the existence of git cherry-pick, which can be used to do such transplantations while retaining the original authorship of the commit, instead of having to move the attribution to the commit message.) Feel free to contact me with any troubles or feature requests after running the above three lines.
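For reference, a minimal sketch of the cherry-pick workflow mentioned in the parenthetical; the remote name, URL and commit hash below are placeholders, not real R3BRoot references:

```shell
# Replay a commit from another fork while keeping the original author
# in the commit metadata (the committer field records who applied it).
git remote add some-fork https://github.com/SomeUser/R3BRoot.git  # placeholder URL
git fetch some-fork
git cherry-pick 0123abc   # placeholder hash; author name and date are preserved
git log -1 --format='author: %an, committer: %cn'
```

Unlike copying the diff into a new commit, this keeps the author field intact, so attribution survives without any note in the commit message.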

Also note that R3Brc contains my new version of clustering, and while Leyla and I did some testing of it (and my estimate is that the shorter code will contain fewer errors than the current mainline) I do not consider it production ready yet, so consider sticking to 5aa4deafa9df for analysis work.

Long term R3Brc agenda:

So long, Philipp

Reference:
- Previous version
- Gabriel's version
- My counterproposal (draft) (This was not merge-ready. I forgot to call AddPositionInfo in FilterCrystals, so it does not work. When that is added, it seems to work okay.)

YanzhaoW commented 1 year ago

"there are many collaborators waiting for this new version"

There is a standard way to do this which, I would guess, is not known to most of our collaborators. Instead of waiting for it to be merged into our dev branch, anyone who needs the new version can add the forked repo as a remote and pull the new version directly from there, like:

  git remote add NewRemote https://github.com/the_forked_version.git
  git fetch NewRemote
  git checkout NewRemote/BranchName

My general thought

In a collaboration, we always need to (or have to) work with people who have different skills, understandings and philosophies. Most of the time, we need to agree to disagree. And most importantly, communication and patience should always be preferred.

About R3Brc

To be honest, I don't think this is a good idea. If there are any contributions we wish to make to R3BRoot, R3BRootGroup/R3BRoot is the best place, for the simple reason that it's used by most of us and will be used by most new colleagues in the future. I know it's quite hard to make everything in R3BRoot optimal over weeks (or months). But things are still improving (like dev branch by default, adding clang-tidy, or upgrading to a new C++ standard). Breaking developments into separate unmergeable branches is never good, which has already been proven by the code development on TofD (by Michael?). So let's be patient and discuss what should be done one by one.

About PRs in R3BRoot

As for many of your points, I share a similar philosophy and agree that what is not good enough should not be merged hastily. Whenever someone has doubts, let's wait, have a discussion and fix the issues until everyone agrees (to disagree). And for those who urgently need the new version, nobody stops them from pulling the change from the forked repo.

For a better conversation among the people developing R3BRoot, I would sincerely suggest that we have a weekly discussion of all PRs from the previous week (@hapol @jose-luis-rs @Valerri).

Anyone who wishes for his/her PR to be merged into the dev branch should explain the changed code "line by line", and the PR should not be merged until everyone in the discussion has no objection. And people who make PRs with lots of changes, such as #771, #728 or #764, should give a presentation in the analysis meeting before the PR is merged.

Frankly speaking, I didn't even know there was a hot discussion about PR #771. :) So with the suggestions above, I hope we can make our development more robust.

michael-heil commented 1 year ago

Could you please explain to me what you mean by this statement: "Breaking developments into separate unmergeable branches is never good, which has been already proven by the code development on TofD (by Michael?)"?

hapol commented 1 year ago

@YanzhaoW I agree with most of your sentences; I do not think that PRs to be merged should be explained "line by line"; they should be exposed publicly here, where all of us can check, comment and discuss (probably this is what you meant with your sentence). With the exception of parameters, which should be presented in the analysis meetings to evaluate the procedure and quality of the new parameters that many people will use... this is not the case right now, and I would ask everybody producing parameters for the different detectors to present their results in the analysis meetings. Anyway, for large code modifications or concept changes, it would be nice to have these presentations (I think Gabriel already planned to present #771 in the next meeting).

The problem with "until everyone in the discussion has no objection" is that it could last forever if someone blocks developments... In the case of #771, this is the second reopening of the PR, with several tens of comments, corrections and modifications to make it better. The PR is fully functional and improves existing code. No one claims it is perfect, probably, and one developer still has concerns. The release coordinator decided to go on with the PR and ask for subsequent modifications. It sounds reasonable to me.

ajedele commented 1 year ago

My 2 cents here may be irrelevant, but I would like to share them anyway. First, I find it unfortunate that there is such disagreement amongst this group. The people who commented here are all in the business of improving R3BRoot, whether on the detector calibration, simulation or framework development side. Given how limited the resources are, it would be a shame to continue diverging.

I would propose the following:

1.) A timeline should be set for PR merges. I would advocate for PRs to be open for approx. a week. Every Monday, a list of PRs could be generated, and if there are no issues with a request, it would be merged on Friday. I can put together that list if you would like. If there is a conflict, it should be flagged, and at that point we should decide what the issues are. If it's "you didn't update your version of R3BRoot first and therefore you are changing 300 files", then that is a temporary hold: the PR can be approved from a functionality side, but merged subject to the above changes. If the review raises an issue with the code itself, the PR should be put on hold and discussion can start there. If there is no resolution, I would suggest either bringing it to an analysis group meeting or having a separate Zoom meeting. Everyone who joined the discussion should have the right to join that meeting.

2.) I don't think it's a bad idea for Philipp to have a fork UNDER CERTAIN CONDITIONS! Clearly, Philipp would like to make large-scale changes to R3BRoot, ones that would overhaul the code quite a bit. We should give Philipp a playground to explore this. I would then suggest that every 1-2 months, we convene and discuss these changes. If Philipp can show that his changes improve the code, have been thoroughly tested and are user-implementable, we should consider adding them. If they are nonsensical, they can be rejected.

I am open to what this playground for Philipp should look like. I think it also makes sense to start this discussion now, given that Philipp will be finishing shortly (hopefully) and then starting a post-doc position with Tom.

As for the TofD, it is like many things in our collaboration: not straightforward. I think Michael would agree that his coding abilities are not 'weltbewegend' (earth-shattering). His code is functional, but cumbersome and overly complex. However, I would not necessarily expect meticulous, brilliant code from him. His expertise is detector development and electronics design/support (missing the correct phrasing here). I hold him to a high standard there. Jose Luis's expertise is on the software side. He took Michael's code and added a very nice infrastructure to it. The code is much less bulky and more maneuverable. However, in the process of rewriting the code, several issues arose, such as incorrect addition/subtraction and multiplication/division signs, segments that were non-functional, and other smaller issues. Currently, Enis is working on rewriting this code. He is merging Michael's knowledge of the functionality of the TofD (and what the code should therefore reflect) with the infrastructure that Jose Luis built. By working together with both experts, the code will be optimized.

jose-luis-rs commented 1 year ago

Dear @klenze, the CALIFA tasks are working well, and users should run them for the different experiments in order to provide further improvements beyond those discussed in PR #771, for instance, the use of new parameter containers for the cluster algorithms, ...

After some tests for https://github.com/R3BRootGroup/R3BRoot/issues/755, the initialization time of R3BRoot has increased by a factor of 30, or more if we run the code for a full experiment. The execution is OK.

Additionally, PRs have to be fully tested to avoid problems like #794 and #796. Of course, those mistakes were there before your modifications, but #794 and #796 clearly indicate that you never ran the code before making PR #759. This PR has modifications in 382 files, which makes its review very difficult. As commented by @YanzhaoW, it could be nice to explain PRs with a lot of changes during our meetings, but with 382 modified files there is no way!

Issue https://github.com/R3BRootGroup/R3BRoot/issues/705 was closed. Why? It has not been tested yet.

As for R3Brc, you can do what you want, but before starting the experiments you must make the corresponding PRs to the main R3BRoot repository. I will not do it for you, since we don't have the manpower. Moreover, developments like this one, made without discussion, make no sense. Please make a PR as in #771.

About "Breaking developments into separate unmergeable branches is never good, which has been already proven by the code development on TofD (by Michael?)": @YanzhaoW, they are doing the analysis of the S494 experiment in a specific branch, but this is not a big problem, because their modifications are on GitHub and we can always copy&paste the needed improvements (algorithms to calibrate a specific detector) into our main branch. The real break is @klenze's repository, because according to his ideas he wants to change the basis of FairRoot/R3BRoot, making the branches/code completely unmergeable. I think that the main consequence of this kind of decision is that his repository won't be useful for analyzing data from a full experiment, just for specific analyses mostly related to CALIFA. Although the new repository could also fail for CALIFA if we take into account the new developments for CEPA, which will never be available in R3Brc due to the lack of manpower. At least, this is my opinion.

Taking into account "the lack of manpower", we need the contributions from our students, like #593, #635, #642, #707, #766, #768, #791, ..., even if the PRs are not perfect; otherwise, there is no way to continue with the developments and code testing.

jose-luis-rs commented 1 year ago

"Jose Luis's expertise is on the software side. He took Michael's code and added a very nice infrastructure to it. The code is much less bulky and more maneuverable. However, in the process of rewriting the code, several issues arose such as incorrect addition/subtraction and multiplication/division signs, segments that were non-functional and other smaller issues."

This is because the implementation was done in two weeks (including los, fibers, ... as well), but the code worked for the S522 and S509 experiments and allowed us to obtain correlations with other detectors, ion tracking, .... After Enis' improvements, we should soon have a detector with complete functionality. Then we also have to see if it works for the analysis of the experiments performed in 2021.

YanzhaoW commented 1 year ago

Hi @michael-heil, I don't have any ill intent toward your work. I'm very sorry if my comment made you feel that way. Your contribution to our collaboration is absolutely essential. I just heard that in order to incorporate your code into our main repo, someone (jose-luis?) needed to migrate your code "the hard way" rather than with some fetch and rebase commands. In my personal opinion, breaking developments into two different unmergeable branches/repos should be avoided if possible, since, as many people said, we may not have the manpower to double-check that everything is still compatible after the migration.

@hapol Yes, you are right. Everything I said is just a suggestion and up for discussion. But I would still hold my opinion that a PR should not be merged before being presented in a discussion or meeting. In the analysis meeting, the person who makes the PR could introduce the principle, the functionality, how to use the new code, and what the results are. But there should be another place where the presenter explains how the code is written and what the logic behind it is, and where we can discuss whether the code design makes it easy for other people to work with. The latter part, in my opinion, is what is lacking in the analysis meeting.

klenze commented 1 year ago

Sorry, I took Sunday off.

@hapol

The PR is fully functional and improves existing code.

Agree to disagree on both counts.

With regard to the code being "fully functional", I have created a quick unit test for the Exec function here. (Because I did not want to spend 100 LoC setting up FairRoot, I broke encapsulation and assigned the TClonesArray member pointers directly. @YanzhaoW will probably not like this. Note that if you want to run the script without proper compilation, it may be easier to make the members public in the header and delete the include.)

califa_unit.C.txt

The output I get from running it contains these lines:

Preparing test case input TClonesArray
   id=2000  en=620keV
   id=2001  en=199keV
   id=4433  en=201keV
Output clusters generated by Exec:   en=1020  crystals=[ 2000 2001 4433 ]

For the non-CALIFA people: channels 2001 and 4433 (= 2001 + 2432, the latter being our number of crystals) refer to the same crystal read out with different gain settings (which is why I assigned them similar calibrated energies). For clustering, it would be customary to pick the value from one of the gain settings. Instead, here both gain settings are added to the cluster.

I added some debugging and found the issue. The more subtle biases I mentioned exist as well, I can show them after @gabrigarjim fixes the double-counting bug.

and improves existing code.

Improves by what metric? Lines of code? Percentage of califa code written by USC?

So far, exactly nobody has told me "Philipp, your clustering code is terrible because _____". So please tell me. Is it wrong? Too slow? Convoluted? Cryptic?

From my point of view, #771 does a few things different from the previous version.

For crying out loud, his Exec method is 10.9 kilobytes (300 lines) long. That is literally longer than my entire counterproposal cxx file (8 kiB), including the copyright header, Init, SetParWhatever and so on. For reference, my longest method, DoClustering, is 1.3 kB including comments and logging.

If I made a PR tomorrow which added a COBOL interpreter to R3BRoot to parse parameters specified in COBOL, a good response would not be to attack any particular weakness of the code, but to tell me "Philipp, adding a COBOL interpreter is a terrible idea and we will not do that". This is roughly how I feel about #771. (Of course, in my experience, PRs which are considered unfavorable by the maintainers are simply left open and uncommented.)

I know how it feels to see code written by me being improved on, and this is not it. It is debatable whether this should even be considered an improvement over the 2012 clustering.

@YanzhaoW: Agreeing to disagree is nice when it is a minor issue, but this is not a minor issue. This feels like a pretty fundamental disagreement, like two sailors in a storm disagreeing about whether they should pump water into the ship or out of it. I consider my part in the CALIFA clustering (which is the only nontrivial step in CALIFA analysis processing, really) my most valuable contribution to the codebase. No amount of preaching about dynamic_cast and initializing variables can undo the effects of losing that battle. (And the battle is lost; even my mildly controversial PRs like #709 (which would assert that people use a working unpacker, in an easily overridden way) do not get merged, so any PR which reverts the clustering now clearly favored by USC does not have a snowball's chance in hell.) So instead of struggling for the bilge pump, I make a run for the lifeboat.

gabrigarjim commented 1 year ago

Hello,

For the non-CALIFA people: channels 2001 and 4433 (= 2001 + 2432, the latter being our number of crystals) refer to the same crystal read out with different gain settings (which is why I assigned them similar calibrated energies). For clustering, it would be customary to pick the value from one of the gain settings. Instead, here both gain settings are added to the cluster.

I added some debugging and found the issue. The more subtle biases I mentioned exist as well, I can show them after @gabrigarjim fixes the double-counting bug.

I've found the bug. I mistyped "gamma" as "gammas" on one line, so I will fix it in a quick PR with that single "s" removed. Thanks for checking that. Could you please tell us about the more subtle biases you found?

So far, exactly nobody has told me "Philipp, your clustering code is terrible because _____". So please tell me. Is it wrong? Too slow? Convoluted? Cryptic?

It was not terrible. Well, it was incomplete and impossible to understand, especially for newcomers. It was also slower with low thresholds. I asked for some new functionalities (the list, the mother ID, the randomization inside, the second threshold). I needed that for the analysis, and so did many students. No answer. I tried to do it myself and ended up with the algorithm I needed, so I decided to upload it for the rest, and now you are angry because the judeo-masonic-communist USC wants more lines of code (for what, more dessert at the canteen?). If you have more complaints, I'll kindly ask the devs to switch back to your version, as long as you include the things we need. Meanwhile, I'll try to finish my thesis.

ajedele commented 1 year ago

As a collaboration and as GitHub users, we follow a code of conduct. Please stay within these bounds. https://docs.github.com/en/site-policy/github-terms/github-community-code-of-conduct

hapol commented 1 year ago

Thank you, @ajedele

klenze commented 1 year ago

The bias I mean is the following. Consider a double-range channel where one readout is slightly above the gamma threshold while the other is slightly below it (due to random noise). Your implementation will systematically take the gain range with the value above the gamma cluster threshold, while mine will first decide on a preferred gain range based on the DR threshold and then test the selected gain setting against whatever threshold applies. This may or may not make a difference in practice, I don't know.

With regard to the "gamma"/"gammas" thing, I would do the test "which range is this crystal in" directly in addCrystal2Cluster. My preference for shorter code over longer code is not just some idiosyncratic aesthetic ideal; there are also practical considerations. If one has many calls to a method, it is much harder for a human to spot these tiny differences. I certainly did not catch it when reading the code.

I think I will keep working on my version, implementing your features. Roman had some further ideas on how clustering could generally be improved. If I have a version which will offer some benefit over the present state feature-wise, then I will do a PR. Does this sound acceptable?

Gabriel, I am sorry I came across as attacking you. Both the clustering and code quality are things which are important to me, and I tend to get a bit defensive. I think I should also work on finishing my thesis.

gabrigarjim commented 1 year ago

Hi @klenze. I don't know if the case you spotted here could affect the quality of the gamma clusters that are formed. This we can check and decide. I also agree that long code can be difficult to understand, although I prefer it over too-deep encapsulation. A middle point may be the ideal scenario.

I think I will keep working on my version, implementing your features. Roman had some further ideas on how clustering could generally be improved. If I have a version which will offer some benefit over the present state feature-wise, then I will do a PR. Does this sound acceptable?

That sounds totally fine to me (and will be warmly welcomed). I did not make this PR because I think my code is the best; I only wanted to have the tools we needed.

I am not angry at all; I just wanted to clarify things and continue with other issues. Should I wait for your PR? If it is going to be ready soon, we can just go directly to your version.

ryotani commented 1 year ago

@klenze @gabrigarjim I hope the discussion here will not eat into the important time you need to finalise your work, but I feel it is very beneficial for all the current and potential collaborators who use CALIFA. Although I am not in the CALIFA WG, I will be happy to read and comment on the code of both of you, as I have also been working on different scintillator arrays.

My personal understanding of code development for this kind of work is that, first, it should work for the aims of the initial developer. I admit that I also felt the code in the previous PR by @gabrigarjim looked a bit too complicated, but I could confirm it should work as we expect. Thus I mentioned,

Basically, I have no objection to this PR.

Well, we are working in a collaboration, and if people think it needs improvement, then others can improve it as usual. (Sorry for confusing you with tens of comments on your code the other day...)

I also see @klenze working on a different repository now, and it looks like simpler code. I have not looked into it carefully yet, but I will find time to do that... I am not keen on having a completely different repository if it stays permanently, but it should be no problem for the time being. As an alternative solution, you could define another function such as Exec_Phillip() (and Exec_Gabriel()). Then other contributors could decide which one to use. By sharing as many variables as possible between the two implementations, it becomes even easier to compare the two methods. Such a treatment already exists in several places, for instance in twim/calibration/R3BTwimCal2Hit.cxx.

In closing this message, I would like to emphasise my personal opinion that this repository should have a welcoming atmosphere for everyone wishing to enhance or fix the analysis and simulation tasks. It can be a burden for maintainers to read lengthy code, but I would tend to encourage everyone to make pull requests and open issues even if they do not look so fancy.

hapol commented 1 year ago

Thank you @ryotani for your comments. I fully agree that we should encourage everyone, mainly young collaborators, to make pull requests, issues and bug reports, and we should all try to comment on and support their contributions carefully and positively. Lengthy technical discussions on secondary implementation details, and requests to implement ideal functions or to replace established methods for accessing data, are not a suitable reception for a young contributor.

I would add, maybe in disagreement with your comments, that short code is usually desirable, but legibility is also really important: as coordinator, I have recently received complaints like "I don't like that the people who are actually using the code and working with the detectors can not read the code anymore". So I would ask that we limit the use of fancy features which hurt the code's legibility and prevent access by less advanced developers.