JeffersonLab / analyzer

HallA C++ Analyzer
BSD 3-Clause "New" or "Revised" License
8 stars 53 forks source link

data_type_def not necessarily defined here #151

Closed sawjlab closed 6 years ago

sawjlab commented 6 years ago

https://github.com/JeffersonLab/analyzer/blob/63ee549ae5a12e690792bdf20d18c8a0ab7f78a8/hana_decode/PipeliningModule.C#L79

sawjlab commented 6 years ago

Maybe I am wrong. Just noticed that data_type_def is declared static, so as long as the first word in the buffer has bit 31 on (which it should), data_type_def will be defined.

hansenjo commented 6 years ago

Good catch. I guess you mean "data_type_def not necessarily initialized here"?

All static variables are initialized to zero (or default-constructed) at program startup. So data_type_def will initially definitely be zero. That's not the problem.

The real question is: why is it static? By declaring it static, the value of data_type_def will (a) be persistent throughout the program lifetime, even across events!, and (b) shared among ALL instances of PipeliningModule! This is a true global, even when declared deep inside a function, like here. My gut tells me that that cannot be right. I think this variable should be defined as a local variable outside of the loop, like so

UInt_t data_type_def = 0;
for (UInt_t i=0;  i < codabuffer.size(); i++) {
   ...
   if (data_type_id == 1)
      data_type_def = (data >> 27) & 0xF;
   ...
}

In this way, every time the loop hits a word with bit 31 set, data_type_def is updated and maintains its value for all subsequent words with bit 31 clear - until exit from the function.

If the value really needs to survive between function calls, make it a class member and be sure to clear it for every event. If it needs to survive across events, still make it a class member and initialize it to zero in the constructor or in some sort of Begin() function.

--- begin lecture

As a rule of thumb, all in-class static data should be constant. If not hardcoded, then the constant value can be set during program startup, for instance by reading from a database or setting it with a Set function from a steering script, but it should NOT change afterwards, definitely not dataword-for-dataword. Anything else is advanced usage, should be liberally commented, and is most likely a design flaw.

---end lecture

Since Bob wrote this module, I'll let him speak to the intended lifetime and scope of that variable.

hansenjo commented 6 years ago

Oh - why is the codabuffer passed into SplitBuffer by value and not as a const reference? The whole buffer is copied every time this function is called ...

The same happens every time the buffer is pushed into "eventblock":

eventblock.push_back(codabuffer);

Please think of a way to avoid all this copying. Can we work with pointers instead?

rwmichaels commented 6 years ago

It probably doesn't need to be static. I'll need to study the code to be sure, and today I'm extremely busy (I'm one of two people on shift). As for why to copy the buffer each time, I think it happens only in MULTIBLOCK mode, and only when you get to a new block. Typically MULTIBLOCK=100 so it would only happen every 100 events. I need to check to make sure of that, though. But yeah, maybe it can be done better.

BTW, MULTIBLOCK mode does not work for Fastbus data yet. (relevant to SBS). The Fastbus data look ok but the decoder cannot split the blocks yet. Not a problem in principle, just some grungy programming that needs to be done for SBS.

Bob

----- Original Message ----- From: "Ole Hansen" notifications@github.com To: "JeffersonLab/analyzer" analyzer@noreply.github.com Cc: "Subscribed" subscribed@noreply.github.com Sent: Friday, January 26, 2018 12:09:24 AM Subject: Re: [JeffersonLab/analyzer] data_type_def not necessarily defined here (#151)

Oh - why is the codabuffer passed into SplitBuffer by value and not as a const reference? The whole buffer is copied every time this function is called ...

The same happens every time the buffer is pushed into "eventblock":

eventblock.push_back(codabuffer);

Please think of a way to avoid all this copying. Can we work with pointers instead?

-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_analyzer_issues_151-23issuecomment-2D360685911&d=DwICaQ&c=lz9TcOasaINaaC3U7FbMev2lsutwpI4--09aP8Lu18s&r=18qXn_9GgntmnnlhbPFXjg&m=FQcHvPgE1yk6MpF8EZ9Lu9yAOfcmQooVmoEt2RZlrU4&s=gQzJBeuLAo-2exOcZdNW1JDWNIPzDpUCbdyfyotm-wQ&e=

sawjlab commented 6 years ago

There is no rush to do anything from the hcana point of view. The uninitialized variable is just something I noticed while using the FADC code as a guide to writing a decoder for TI data. It won't affect Hall C right now as we don't use multiblock. (Removing static from the definition of data_type_def will probably allow the compiler to flag this as a problem.)

I think Ole is right about the copying. Even in non-multiblock, codabuffer is copied twice per event. So reducing the copying will help performance, but we need to make sure not to break FADC decoding.

rwmichaels commented 6 years ago

Hall C would be doing a service to the "Podd" community to run 10 or 20 min with multiblock mode, then we can analyze the data and verify the decoding works reliably -- including the TI data and whatever other devices you have (CAEN ADCs ?) -- and also check the performance issues (copying unecessarily, etc). I had some multiblock mode data from test setups, but there's nothing like real experiment data.

The DAQ can get a good improvement from running in buffering mode. This does not affect the event structure, so it should work right away for the decoder (there may be other reasons to avoid it, though). Multiblock mode is an alternative way to get better DAQ performance. And combining them (buffering plus multiblock) should yield the ultimate performance. So, it would be good to get some data in 3 modes: 1) buffering only, 2) multiblock only, and 3) combined buffering and multiblock.

Bob

----- Original Message ----- From: "Stephen Wood" notifications@github.com To: "JeffersonLab/analyzer" analyzer@noreply.github.com Cc: "Robert Michaels" rom@jlab.org, "Comment" comment@noreply.github.com Sent: Friday, January 26, 2018 9:21:38 AM Subject: Re: [JeffersonLab/analyzer] data_type_def not necessarily defined here (#151)

There is no rush to do anything from the hcana point of view. The uninitialized variable is just something I noticed while using the FADC code as a guide to writing a decoder for TI data. It won't affect Hall C right now as we don't use multiblock. (Removing static from the definition of data_type_def will probably allow the compiler to flag this as a problem.)

I think Ole is right about the copying. Even in non-multiblock, codabuffer is copied twice per event. So reducing the copying will help performance, but we need to make sure not to break FADC decoding.

-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_analyzer_issues_151-23issuecomment-2D360797124&d=DwICaQ&c=lz9TcOasaINaaC3U7FbMev2lsutwpI4--09aP8Lu18s&r=18qXn_9GgntmnnlhbPFXjg&m=AkXZgp8DHEtLA9rZCy2Xz1xd7h8HISpmN90KL0tFo4o&s=nzMkZE93JyXHCB4yaQs7K7NNplh0I0Ebgz0z8MKwQ24&e=

hansenjo commented 6 years ago

Moved to Redmine issue 265