Closed: bfreund closed this issue 9 years ago
Hi Benedikt,
maybe Simon has comments on 1 and 2.
I'll say that 3 is "interesting". Normally, for your settings, it took 16-17 seconds to fill the buffer to 80%. This triggers the readout in PixTestXray::doPhRun() through processData(0), which in turn calls fApi->daqGetEventBuffer().
However, when you get error 3, you have 65 seconds of running time (x:37:22 to x:38:27). Something went wrong somewhere there. It's not clear to me that this is in the software.
If you have this problem reproducibly, then my first proposal would be to change the "80" in
https://github.com/psi46/pxar/blob/master/tests/PixTestXray.cc#L386
to a smaller value. However, I would also guess that this might not be sufficient.
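As a rough back-of-the-envelope check of what a smaller threshold would change, here is a minimal sketch. It assumes a constant fill rate (the real flux may vary) and takes 16.5 s, the midpoint of the 16-17 s quoted above, as the time to reach 80%. The function name and the candidate thresholds are illustrative only and are not part of pXar.

```python
def seconds_to_threshold(threshold_pct, ref_pct=80.0, ref_seconds=16.5):
    """Time to reach threshold_pct, assuming a constant fill rate.

    ref_pct and ref_seconds encode the observation quoted above
    (about 16-17 s to reach 80%); 16.5 s is an assumed midpoint.
    """
    if not 0 < threshold_pct <= 100:
        raise ValueError("threshold_pct must be in (0, 100]")
    rate_pct_per_s = ref_pct / ref_seconds
    return threshold_pct / rate_pct_per_s


if __name__ == "__main__":
    for pct in (80, 60, 40):  # candidate thresholds, illustrative only
        print(f"{pct}% reached after about {seconds_to_threshold(pct):.1f} s")
```

Under the same assumption, the headroom between 80% and 100% corresponds to roughly 4 s of data taking, so a lower threshold mainly buys extra time between the readout trigger and a full buffer.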
Can you please observe pXar's memory footprint in another shell for this case? (Maybe the loop in doPhRun() is taking an enormous amount of time to do one iteration because of swapping? I do not consider this likely, but what do I know ...)
My primary question is whether 3 is a reproducible problem. If yes, are there any specific environmental issues that might be related?
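One possible way to watch the memory footprint from a second shell is sketched below. It assumes a Linux machine, where the resident set size can be read from /proc/<pid>/status; the script and its function names are hypothetical helpers, not part of pXar. Tools such as top or ps would give the same information.

```python
import re
import sys
import time


def parse_vmrss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the resident set size of a process; None if it cannot be read."""
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_vmrss_kib(handle.read())
    except OSError:
        return None


def watch(pid, interval_s=5.0):
    """Print a timestamped RSS reading until the process can no longer be read."""
    while True:
        rss = read_rss_kib(pid)
        if rss is None:
            print("process gone or /proc not available")
            return
        print(f"{time.strftime('%H:%M:%S')}  RSS = {rss / 1024:.1f} MiB")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Usage: python watch_rss.py <pid of the running pXar process>
    if len(sys.argv) == 2 and sys.argv[1].isdigit():
        watch(int(sys.argv[1]))
```

A steadily growing RSS during the run, or a value close to the machine's physical memory, would support the swapping hypothesis; a flat value would argue against it.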
Cheers, --U.
Hi Urs,
for identical settings error 3 does not always appear, but it occurs randomly (so I would call it not reproducible). Atm I see no environmental issues.
Thanks for your suggestions, I will check them next week and give an update.
Cheers, Benedikt
Hi Benedikt,
another possible explanation would be a drastically varying flux of your X-ray source. For example, at some point in time (maybe because of a compliance protection or so) the flux drops very low, compared to the initial rate where you have to read out every 16 seconds, until ~75% or so of the buffer is filled. Then the flux jumps back to a high rate, which fills the buffer to >90% and triggers the message you have observed.
This is only speculation, of course, and may not be easy to diagnose.
Cheers, --U.
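To make the speculation above concrete, here is a toy model, not pXar's actual logic. It assumes the fill level is polled once per second (an assumed interval), that a readout is triggered at 80%, and that the warning appears at 90%. The flux numbers in drop_then_jump are invented purely for illustration.

```python
def simulate(rate_at, duration_s, poll_s=1.0, readout_pct=80.0, warn_pct=90.0):
    """Toy model: poll the fill level; read out at readout_pct, warn at warn_pct.

    rate_at(t, fill) returns the fill rate in percent per second.
    Returns (time, fill, action) tuples for the polls that triggered something.
    """
    events = []
    fill = 0.0
    t = 0.0
    while t < duration_s:
        fill = min(100.0, fill + rate_at(t, fill) * poll_s)
        t += poll_s
        if fill >= warn_pct:
            events.append((t, fill, "overflow warning"))
            break  # the thread reports that the test stops after the warning
        if fill >= readout_pct:
            events.append((t, fill, "readout"))
            fill = 0.0  # buffer emptied by the readout
    return events


def steady(t, fill):
    """Constant flux: 80% is reached after about 16.5 s."""
    return 80.0 / 16.5


def drop_then_jump(t, fill):
    """Hypothetical: very low flux until ~75% is filled, then a sudden jump."""
    return 0.5 if fill < 75.0 else 20.0


if __name__ == "__main__":
    print("steady flux:", simulate(steady, 60))
    print("drop then jump:", simulate(drop_then_jump, 300))
```

In this model the steady flux only ever produces readouts, while the drop-then-jump profile skips past the 80% readout level between two polls and lands above 90%, which is the scenario described above. Whether the real source behaves like this would have to be checked on the setup.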
In my reading and without further inspection all the issues seem to come from an unstable deserializing of the data and/or unclean signals.
@bfreund to answer your questions: errors are never supposed to be ignored, otherwise they would not be errors but some info printout.
The buffer filling looks like the DESER160 is missing the token out signal from the detector. It then just keeps on writing data to the buffer which fills up in a matter of milliseconds. This points again to a noisy environment and unclean signals.
Please use a scope to check both the differential sdata signal (incoming data, cause for the decoding errors) and the digital tout signal (token out, cause for buffer overflow) and take measures to increase signal quality.
Scope traces attached here might be useful.
Sorry, just re-read. You are running a full module, so forget about the token out signal but make sure to have a clean 400MHz sdata signal.
Hi Benedikt,
do you have any news on this?
Cheers, --U.
Hi Urs,
unfortunately I had no time to check it last week. Atm I am at the DPG conference but I will be back in Karlsruhe next week. Then I will check it and give an update asap.
Sorry for the delay.
Cheers, Benedikt
Short Update:
Thomas checked the signal with an oscilloscope. In our opinion there is nothing strange in the signal.
We still have to check the pXar memory footprint while running an X-ray test, but we have an annoying problem with our setup which needs to be solved beforehand...
Hello, I observed the same problems that Benedikt pointed out in points 1 and 2, both when taking data with X-rays (spectra measurements and high rate test) and in PHoptimization. For some modules a different set of TBM parameters provided a solution, but this is not always the case. Do you have any suggestions, or a method to find a good set of TBM parameters?
My setup was:
FW 4.1 and 4.1.1 (tried both)
pXar from 19.05.2015
full module with psi46V2.1respin TBM08b
Cheers Matteo
Dear @bfreund, this issue should now be resolved with FW4.4 and pxarCore 2.5. Please check this and open a new ticket if the problem persists.
Hi everybody,
during X-ray tests (PhRun) I - sometimes - observe errors & warnings. (Some screenshots of log files showing the corresponding errors/warning are attached and marked red)
1 - Event mismatch / token chain length, e.g.
ERROR: <datapipe.cc/CheckEventID:L127> Event ID mismatch: local ID (132) != TBM ID (0)
ERROR: <datapipe.cc/DecodeDeser400:L292> Number of ROCs (0) != Token Chain Length (8)
ERROR: <datapipe.cc/CheckEventID:L127> Event ID mismatch: local ID (1) != TBM ID (130)
I am not sure if these errors have an effect on the measurement or if I should simply ignore them. In any case the X-ray test finishes correctly and gives correct root files.
2 - Warning after "Event mismatch / token chain length", e.g.
ERROR: <datapipe.cc/DecodeDeser400:L292> Number of ROCs (2) != Token Chain Length (8)
ERROR: <datapipe.cc/CheckEventID:L127> Event ID mismatch: local ID (124) != TBM ID (91)
WARNING: ROC 7: Readback start marker after 15 readouts!
WARNING: ROC 0: Readback start marker after 17 readouts!
WARNING: ROC 2: Readback start marker after 5 readouts!
Sometimes there are warnings after "Event mismatch / token chain length" error messages. Here I am also not sure how to deal with the errors/warnings. As in "1", the X-ray test finishes correctly and gives correct root files.
3 - DAQ buffer overflow, e.g.
WARNING: DAQ buffer about to overflow!
After this warning appears the X-ray test stops, even if the measurement time is not over. This may become inconvenient for measurements with a long duration. --> The test finishes too early, but it creates correct (non-corrupted) root files.
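To judge how often each of the three kinds of message occurs in a run, the log could be tallied with a small script. The sketch below matches only the message texts quoted above. The category names, the script itself and the log file name in the usage comment are hypothetical; this is not a pXar tool.

```python
import re
import sys
from collections import Counter

# Patterns copied from the messages quoted above; the category names are ours.
PATTERNS = {
    "event_id_mismatch": re.compile(
        r"Event ID mismatch: local ID \(\d+\) != TBM ID \(\d+\)"),
    "token_chain_length": re.compile(
        r"Number of ROCs \(\d+\) != Token Chain Length \(\d+\)"),
    "readback_marker": re.compile(
        r"ROC \d+: Readback start marker after \d+ readouts"),
    "daq_overflow": re.compile(r"DAQ buffer about to overflow"),
}

SAMPLE = [
    "ERROR: <datapipe.cc/CheckEventID:L127> Event ID mismatch: local ID (132) != TBM ID (0)",
    "ERROR: <datapipe.cc/DecodeDeser400:L292> Number of ROCs (0) != Token Chain Length (8)",
    "WARNING: ROC 7: Readback start marker after 15 readouts!",
    "WARNING: ROC 0: Readback start marker after 17 readouts!",
    "WARNING: DAQ buffer about to overflow!",
]


def tally(lines):
    """Count occurrences of each known message type in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            hits = len(pattern.findall(line))
            if hits:
                counts[name] += hits
    return counts


if __name__ == "__main__":
    # Usage: python tally_log.py pxar.log   (file name is only an example)
    if len(sys.argv) == 2:
        with open(sys.argv[1], errors="replace") as handle:
            result = tally(handle)
    else:
        result = tally(SAMPLE)
    for name, count in sorted(result.items()):
        print(f"{name}: {count}")
```

Comparing these counts with the total number of events in the run gives a first idea of whether the decoding errors are rare glitches or a sizeable fraction of the data.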
Here are the screenshots (1-3 show the entire log file of a certain run, 4 shows only the section with the "Readback Warning"):
[Screenshots 1 to 4]
Can somebody tell me how to deal with those messages? Simply ignore all of them? ;-) If you need more information please ask me.