JeffersonLab / hcana

Hall C++ Analyzer

Two small things #512

Closed RichardKCollins closed 2 months ago

RichardKCollins commented 9 months ago

I was reviewing CERNRoot recently, then today going over gluons on the Internet and found GlueX. I wanted to see what kind of data you were producing and using so this HCANA looked interesting.

In your description, it says "hcana will be the package used to analyze Hall C date in the 12 GeV era." but I think that should say Hall C data.

The link in "For an official version of PODD, see the ROOT/C++ Analyzer for Hall A page." is now apparently historical background, and the linked page says "These pages are no longer maintained. Please see the Redmine Wiki for the current version.", which points to https://redmine.jlab.org/projects/podd/wiki

I know you want to be compatible and interchangeable with CERNRoot (don't just say Root, too many, quite a mess. Be explicit.) But CERNRoot has a lot of embedded limitations, particularly on precision and low energies. And it is somewhat heavy because it has so many pieces that have just accumulated and not been checked globally. I have reviewed most of the binary formats used by large groups on the Internet. My concern, for the Internet Foundation, is that these quickly become proprietary and inaccessible to the roughly 5 billion humans using the Internet in various ways. It is not hard to make open, globally accessible data from any sensors or continuous data systems, whatever their purpose. CERN throws away most of their data not supporting their big goals, and that "trash data" to them contains many useful streams that others cannot afford as a direct effort.

Those are my private opinions. I would like to see the raw GlueX data over time (days and months) in summary forms, particularly looking at transverse signals. I do not know your local ways of saying things. But I will read the papers, and the software projects. I studied gravitational radiation detection at UMD College Park and worked on the NASA gravitational potential models, so I am curious if GlueX can "see" variations in the gravitational potential and its time dependent gradients. If one has the whole model of a project, then an accelerator project can be used as a gravitational sensor with LOTS of data and very few free parameters to estimate and follow. Perhaps. It depends on how accessible the data is, its precision, and how open the processing system that produces and shares it is. Sometimes all that is needed is to change a few sensors, ADCs and FPGAs.

If CERN seduces you into its "ever bigger" spiral, the small things will all be ignored and wasted. So I hope your group can remain independent.

I think CERNRoot format is not the best for global sharing. There are lots of efforts at sharing things. Binary seems efficient, but when the groups make the tools to use the data their monopoly, it becomes a global stumbling block and barrier to billions, for the benefit of a few ten thousands.

Richard Collins, The Internet Foundation

hansenjo commented 2 months ago

Corrected typos and updated Podd URL. Thanks for noticing.

The "CERNRoot" format is openly documented, and its underlying code is LGPL-licensed open source (https://github.com/root-project/root/blob/master/LICENSE). Of anything in the software world, a file format with this legal arrangement is probably the least likely of all ever to become "proprietary" or "inaccessible". CERN may even be legally required by European Union rules to make their data publicly accessible as their operations are taxpayer-funded. This is, in short, a non-issue.

hcana and Podd are not used for GlueX analysis. GlueX software can be found elsewhere (https://github.com/JeffersonLab/halld_recon).

RichardKCollins commented 2 months ago

I do not mean to offend you, I am just writing this for my own thoughts and records. Maybe you might find it interesting.

CERN ought to be a leader in sharing, going out of its way to teach the whole human species, not just a few ten thousands -- There are two ways something can be proprietary on the Internet. The whole can exist, but keys to use it are kept so only a few can access it. Or the whole can exist in a form that is so difficult to access that effectively it is only accessible to a few, therefore proprietary (accessible to only a few). I do not have a term for the Internet that captures the main feature of my criticism so well as proprietary (in the sense of only accessible by insiders or close associates). The terms "closed" and "open" are descriptive but too blunt, and ignore that the reason something is closed is usually because a group closes it, or they neglect to put in sufficient effort to make it open.

I have been at this for 2/3 of my life now and I am 75. I doubt I will change anyone's way of speaking or doing things. I write a few comments here and there, and try to explain some of the issues I see, and try to explain why they are important, at least to me.

Overall when I trace sites on the Internet I can usually find ways to access the information, even if the site is badly written. When there are many deliberate locks and barriers, I simply label it "proprietary" and that could have the overtone of "too much trouble to bother". There is a little bit of information at CERN that is accessible, that is not available elsewhere, but if it takes even someone with a universal training in all subjects like me a lot of time, then pity the 2 billion children from 4 to 24. Many of whom now could apply tools to what is there, but "why bother if it is such a lot of trouble".

Check. See if ALL the logins are open to all 5.2 billion Internet users. I think you will find the bureaucratic systems on the site are so highly biased to those who are working or enrolled in "approved higher education and government agencies" that almost everyone is excluded. Even if you trace the current users to a few hundred thousand, that is NOT billions.

Much of the stuff on CERN is highly biased to the interests and passions of a few investigators and administrators. The information they are interested in takes all priority. The information that is stored, archived, shared is what they want, not what is available raw from all sensors. The whole goals of CERN are "bigger". The projects are sold because they are mysterious and "only understandable to a few big minds who win Nobel prizes and such". The countries have to support it, because the Emperor has new clothes and no one is allowed to say it is the wrong direction.

That attitude permeates all related organizations and prevents communication and understanding, except by a few insiders. Certainly CERN has done little to help make atomic and nuclear energy accessible to all. If they were doing a good job, then they would understand those energies, and help the human species use them, not keep smashing things and showing bumps - everyone chasing a few prizes that only insiders can win.

My voice and opinions mean nothing. I no longer even bother. I am just writing this for my own notes.

What I want to see is the noise in the experiments. And the errors are all parts per thousand, with a few precise experiments where, unfortunately, the documentation is not complete or accessible.

I am working on "all languages" and "site:nih.gov" and "site:wikipedia.org" now. It will take me much of the rest of the year, and I have already invested decades. I do not consider CERN a high priority. It only does things that no one else can verify, and only a few can do that. The questions are all wrong, when you exclude billions and only allow ten thousands. Fundamentally wrong for the species. I did download CERN Root and have been examining it just as I have many hundreds of the largest "open" projects. But it is badly structured and bloated, just like most things on GitHub and similar sites. Sharing with all humans is harder but not impossible.

I wanted to see the low energy data from nano-eV to 10 MeV. And errors in every device at ppb. Almost all the calibration data is locked. Not accessible without insiders' help.

Filed as (CERN ought to be a leader in sharing, going out of its way to teach the whole human species, not just a few ten thousands)

Richard Collins, The Internet Foundation
