puhep / pudb

Purdue CMS FPix Database

calculation of pxar error count #189

Closed lantone closed 7 years ago

lantone commented 8 years ago

Hi,

Here I see "Number of pXar errors during testing: 0":

http://inky.physics.purdue.edu/cmsfpix//Submission_p/summary/summaryFull.php?name=M-P-1-09

But here I see a different story:

http://inky.physics.purdue.edu/cmsfpix//MoReWeb/Results/REV001/R001/M-P-1-09_FPIXTest-m20C-FNAL-160617-0924_2016-06-17_09h24m_1466173471/QualificationGroup/ModuleFPIXTest_m20_1/Logfile/LogfileView/TestResult.html

Any ideas?

jstupak commented 8 years ago

Can you check the xml file that was uploaded to the DB to confirm that it says 0 errors? That will tell us whether FPIXUtils or PUDB is at fault.
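
One way to make that check, as a minimal sketch: it assumes the count lives in a `PXAR_ERRORS` element (the tag quoted later in this thread) somewhere in the uploaded XML. The `TEST` wrapper element and the `read_pxar_errors` helper are hypothetical, not part of FPIXUtils or PUDB.

```python
import xml.etree.ElementTree as ET

def read_pxar_errors(xml_text):
    """Return the PXAR_ERRORS value as an int, or None if the field is absent."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree (root included), so nesting depth does not matter
    node = next(root.iter("PXAR_ERRORS"), None)
    if node is None or node.text is None:
        return None
    return int(node.text.strip())

# Hypothetical wrapper element; only the PXAR_ERRORS tag comes from this thread.
sample = "<TEST><PXAR_ERRORS>594</PXAR_ERRORS></TEST>"
print(read_pxar_errors(sample))
```

Returning `None` for a missing field keeps "no count in this file" distinct from "zero errors".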

lantone commented 8 years ago

i just reran uploadTest.py on the output, got this line in the xml:

<PXAR_ERRORS>594</PXAR_ERRORS>

jstupak commented 8 years ago

Sounds like a DB issue to me then. Any ideas Greg?

gneeser commented 8 years ago

I'll take a closer look this afternoon and let you know what I find

-Greg

gneeser commented 8 years ago

Nothing is jumping out at me as incorrect at first glance... could I get the zip file for that module?

Also, have you noticed this for other modules as well, or is this the only one?

-Greg

lantone commented 8 years ago

I've only noticed it on that one, but I haven't really looked at many others with that failure mode in mind. I just remade the zip when I tested it; I've asked the current shifter to check the test stand computers to see if the original zip is still there.

lantone commented 8 years ago

ok, greg i sent you the zip file via email

lantone commented 8 years ago

or not. let's try this: M-P-1-09.zip

gneeser commented 8 years ago

Got the zip, I'll start debugging. Thanks!

gneeser commented 8 years ago

In the zip that you attached, it reads that there are zero pxar errors:

<PXAR_ERRORS>0</PXAR_ERRORS>

Could there have been an error in the script?

jstupak commented 8 years ago

Maybe this zip is for the "wrong" temperature? IIRC the zip gets overwritten for each test, so if the same module was tested again at a different temperature then we lost the original zip.

gneeser commented 8 years ago

Do you mean that the zip at FNAL gets overwritten every time the module is tested? That's definitely the case for the PUDB at least - the MySQL columns are overwritten every time new fulltest data that contains that info is submitted, so if the most recent submission had 0 pxar errors, that's what would be reported.
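
To illustrate the overwrite behavior, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for the real MySQL one. The table name, the column names, and the `submit`/`reported` helpers are all hypothetical; the only thing taken from this thread is that there is one set of columns per module and each new submission replaces it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per module, with no temperature column.
conn.execute(
    "CREATE TABLE module_results (name TEXT PRIMARY KEY, pxar_errors INTEGER)"
)

def submit(name, pxar_errors):
    # Each submission replaces whatever was stored for that module.
    conn.execute(
        "INSERT OR REPLACE INTO module_results (name, pxar_errors) VALUES (?, ?)",
        (name, pxar_errors),
    )

def reported(name):
    row = conn.execute(
        "SELECT pxar_errors FROM module_results WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]

submit("M-P-1-09", 594)  # earlier submission
submit("M-P-1-09", 0)    # later submission for the same module
print(reported("M-P-1-09"))  # only the later value survives
```

With a single row per module, nothing records which test a value came from, so the summary page can only show the last one written.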

drberry85 commented 8 years ago

So the 17C results are probably overwriting the -20C results... or vice versa.

jstupak commented 8 years ago

Yes, but Greg reminded me of something. This can't be a 17C zip file, because it would not contain the PXAR_ERRORS field. So I have no idea what happened here.

gneeser commented 8 years ago

Weird. Well, the zip that was sent to me did indeed have 0 pxar errors in the xml, and the DB seems to have correctly parsed that much - assuming that is the zip that was uploaded.

jstupak commented 8 years ago

Ahhh. The zip file Jamie sent you is not the zip that was actually uploaded. From the zip:

Test results and config files can be found in: M-P-1-09_FPIXTest-m20C-FNAL-160620-1231_2016-06-20_12h31m_1466443919

This note is not shown anywhere on the DB page, so it must have been a different zip that was uploaded (that has been lost, unless it was on the other desktop).

Was this module tested multiple times at -20 and only uploaded once? Seems so

jstupak commented 8 years ago

This still does not answer the question of what actually happened though. The -20C results (and therefore PXAR_ERRORS data) that were uploaded came from the test at 09h24m, and the corresponding moreweb results show errors.

Since the lessweb code is deterministic, and Jamie says he gets 594 errors when he reruns it on the same elCom output, I would guess the shifter did something dumb to cause this (unless we see this for other modules as well).

drberry85 commented 8 years ago

There are two sets of -20C results:

./M-P-1-09_FPIXTest-m20C-FNAL-160617-0924_2016-06-17_09h24m_1466173471
./M-P-1-09_FPIXTest-m20C-FNAL-160620-1231_2016-06-20_12h31m_1466443919

The test performed on the 17th has the decoding errors.
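
The two directory names can be compared mechanically. A minimal sketch, assuming the names always split on underscores into module, test label, date, time, and a trailing integer that looks like a Unix timestamp (that reading is a guess from these two examples, and `parse_result_dir` is a hypothetical helper, not part of FPIXUtils or PUDB):

```python
import os

def parse_result_dir(path):
    """Split a result directory name into its underscore-separated parts."""
    name = os.path.basename(path.rstrip("/"))
    # The last three fields are date, time, and the trailing integer.
    head, date, time, stamp = name.rsplit("_", 3)
    # What remains is "<module>_<test label>".
    module, test = head.split("_", 1)
    return {"module": module, "test": test, "date": date,
            "time": time, "stamp": int(stamp)}

dirs = [
    "./M-P-1-09_FPIXTest-m20C-FNAL-160620-1231_2016-06-20_12h31m_1466443919",
    "./M-P-1-09_FPIXTest-m20C-FNAL-160617-0924_2016-06-17_09h24m_1466173471",
]

# Order the runs by the trailing integer, earliest first.
runs = sorted((parse_result_dir(d) for d in dirs), key=lambda r: r["stamp"])
for r in runs:
    print(r["module"], r["test"], r["date"], r["time"])
```

Sorting this way puts the run from the 17th first and the run from the 20th last, which makes it easy to see which zip corresponds to which test.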

gneeser commented 8 years ago

Does this mean that there actually was an error in the DB?

NickHinton commented 7 years ago

I think this issue is resolved. I'm going to close the issue.

lantone commented 7 years ago

Not so fast! We noticed another instance of this just now:

http://inky.physics.purdue.edu/cmsfpix//Submission_p/summary/summaryFull.php?name=M-Q-8-08

vs.

http://inky.physics.purdue.edu/cmsfpix//MoReWeb/Results/REV001/R001/M-Q-8-08_FPIXTest-m20C-FNAL-160819-1007-300V_2016-08-19_10h08m_1471619289/QualificationGroup/ModuleFPIXTest_m20_1/TestResult.html

How does the DB distinguish +17 and -20 test results? The PXAR_ERRORS field is in the XML for both; would later +17 results overwrite previous -20 results?

lantone commented 7 years ago

If so I can commit a quick fix for this, just let me know.

NickHinton commented 7 years ago

My understanding is that the latest results are the ones that will be displayed. Greg, can you confirm/deny this?

Looks like the 17C tests were run after the -20C ones.

gneeser commented 7 years ago

I can confirm that only the latest results will be displayed/recorded. This is true for almost all data submitted in the xml, and where it's not true it's explicitly stated (e.g. the IV curves, etc.).

lantone commented 7 years ago

Ok then, I'll submit a fix that will remove the pXar error count from the +17 XML file.
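
For illustration only, the idea could look roughly like the sketch below. This is not the change that was actually committed: `strip_pxar_errors` and the `TEST`/`OTHER` element names are hypothetical, and only the `PXAR_ERRORS` tag comes from this thread.

```python
import xml.etree.ElementTree as ET

def strip_pxar_errors(xml_text):
    """Return the XML with every PXAR_ERRORS child element removed."""
    root = ET.fromstring(xml_text)
    # Snapshot the elements first so removal does not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "PXAR_ERRORS":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# Hypothetical +17 payload; the OTHER element stands in for fields that should stay.
sample = "<TEST><PXAR_ERRORS>0</PXAR_ERRORS><OTHER>1</OTHER></TEST>"
print(strip_pxar_errors(sample))
```

With the field absent from the +17 file, a later +17 upload has no pXar count to write over the -20 value.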

gneeser commented 7 years ago

Just as a disclaimer, be aware that pretty much any results you submit in the +17C test will override the -20C results if you submit them afterward. It's not just the pXar error count, it's also the # of bad double columns, bad bumps, ROC failure modes, etc. The database only has one MySQL entry for these results, and subsequent entries override previous ones.

lantone commented 7 years ago

good to know. i think the error count is the only offending item in the +17 XML, but i'll double-check.

lantone commented 7 years ago

should be fixed with 3a8e4e5

gneeser commented 7 years ago

Alright then, let's keep this issue open for a while just to remind ourselves to keep an eye on it, then presumably we'll be able to close it in a few days.

gneeser commented 7 years ago

Were there any other problems on this issue that you all noted?

-Greg

lantone commented 7 years ago

not that i've noticed (and i've been mostly paying attention)

gneeser commented 7 years ago

Alright, I'll close the issue then. As always, let me know if you catch anything else.