cta-observatory / pyeventio

Python read-only implementation for the EventIO data format used by the CORSIKA 7 IACT extension and sim_telarray
MIT License
10 stars 12 forks source link

Support for new random mono trigger in sim_telarray #261

Closed orelgueta closed 1 year ago

orelgueta commented 1 year ago

Konrad implemented a new RANDOM_MONO_PROBability option to allow a certain fraction of triggered telescope events to randomly pass the stereo event selection in a way very similar to muon ring candidates do. This was a request from the NectarCam team to allow studying muons triggering single telescopes in shower events (see this issue).

As a result, the trigger block has changed in order to flag such random mono triggers. Reading a file which includes the random mono trigger with the current pyeventio release leads to the following error

Traceback (most recent call last):
    trigger_information = o.parse()
  File "anaconda3/envs/eventio/lib/python3.9/site-packages/eventio/simtel/objects.py", line 620, in parse
    comp['alt'] = read_float(byte_stream)
  File "anaconda3/envs/eventio/lib/python3.9/site-packages/eventio/tools.py", line 28, in read_float
    return struct.unpack('<f', f.read(4))[0]
struct.error: unpack requires a buffer of 4 bytes

Support for the random trigger should be added to pyeventio. The exact details of the new trigger block will be provided soon. In the meantime, an example output file which was produced with a random mono trigger can be found in this link.

maxnoe commented 1 year ago

I would expect that any changes to the binary format of eventio are signaled by a version number increase of the corresponding object and that you then get an error specifically about an unsupported object version.

Are you sure the file is OK?

orelgueta commented 1 year ago

It seems this change is seen only when a mono trigger is actually triggered. I simulated three different samples with the same version, one without the RANDOM_MONO_PROBability option, one with that option set to a very low probability and one with setting it to high probability so that we definitely get at least one event. Only in the latter case I got the error, when there are no mono events triggering, the file is read without issues.

I therefore conclude that some trigger bits are changed only when for those mono triggers and not for the entire sample and that the version number was not changed. I can provide example files if you think they are useful.

maxnoe commented 1 year ago

The Version check for the trigger is there: https://github.com/cta-observatory/pyeventio/blob/94ebedd3c38ed10810a3aa9eafd734c986fdf56e/src/eventio/simtel/objects.py#L569

orelgueta commented 1 year ago

Yes and from the error above it seems the version remained at 3 because otherwise we wouldn't have gotten an error about comp['alt'] or comp['speed_of_light'] (which I saw in a different file).

Tobychev commented 1 year ago

I think what Max means is that the version being the same despite there being a change in the binary format is a bug in the software generating these files.

mdpunch commented 1 year ago

I think what Max means is that the version being the same despite there being a change in the binary format is a bug in the software generating these files.

A bug in simtelarray or further downstream, Tomas?

orelgueta commented 1 year ago

Yes it could be considered a mistake in sim_telarray (or the hessio package) to keep the version the same if the binary format changed. This needs to be clarified (an e-mail was sent).

kbernloehr commented 1 year ago

Well, I do not think of the binary format changed but more of the unspecified bits put to actual use. Not any of the bits declared as 'reserved (must be zero)' but of the 'unspecified (not used so far)' kind. While this might seem like a semantic difference, but what I tried telling with it is that these bits might eventually be put to mean something. If you don't know what they mean, try to ignore them (perhaps after issuing a warning) and carry on.

What would you gain from an incremented version number? You would need updated software even if the bits in question were never used. Or would you want to jump between version numbers from one event to the next, depending on whether the previously unspecified bits are used or not in that event?

From the source code via doxygen into hessio_refman.pdf:

6.46.2.4 teltrg_type_mask int simtel_central_event_data_struct::teltrg_type_mask[H_MAX_TEL] Bit mask which type of trigger fired. More than one trigger type per telescope is possible. Bits well beyond H_MAX_TRG_TYPES are modifier flags, in addition to trigger type(s). Bit 0: (mostly analog) majority trigger. Bit 1: analog sum trigger. Bit 2: digital sum trigger (working on readout data stream). Bit 3: digital (majority) trigger (working on separately digitized data). Bits 4-7: reserved (must be zero). Bit 8: long-event modifier (readout may include more samples than normal). Bit 9: matching muon-ring enhancement conditions, acceptable mono event. Bit 10: randomly chosen as acceptable mono event. Bits 11-15: unspecified (not used so far). Bits 16-31: reserved (must be zero).

orelgueta commented 1 year ago

@maxnoe could I please ask for your help in implementing the necessary changes? If I understand correctly, in the following line we don't read enough bits to support this in the teltrg_type_mask, is that correct?

https://github.com/cta-observatory/pyeventio/blob/94ebedd3c38ed10810a3aa9eafd734c986fdf56e/src/eventio/simtel/objects.py#L600

What makes me doubt it is that even if bits 4-31 are zero, they still would have been read by the next line reading the stream. But as far as I know, we don't see such an error in the trigger times read out.

maxnoe commented 1 year ago

@kbernloehr it seems there is different logical needed to read the object in case of a mono trigger, so it can't just be "ignored". Otherwise the Reader wouldn't have thrown an error.

I'll have a look later, currently not near a computer.

maxnoe commented 1 year ago

This line could be the culprit if a new trigger bit pattern was added: https://github.com/cta-observatory/pyeventio/blob/94ebedd3c38ed10810a3aa9eafd734c986fdf56e/src/eventio/simtel/objects.py#L609

maxnoe commented 1 year ago

The correct solution is probably to use a function that counts the number of 1 bits and checks if it's 1.

https://docs.python.org/3.10/library/stdtypes.html#int.bit_count

orelgueta commented 1 year ago

Yes this line will not pick up the new random trigger and therefore will not read the trigger time from the stream. However, in these lines

https://github.com/cta-observatory/pyeventio/blob/94ebedd3c38ed10810a3aa9eafd734c986fdf56e/src/eventio/simtel/objects.py#L600-L602

We read only an 8 bits int while Konrad describes a 32 bit mask and the random trigger is at the 10th bit which we wouldn't read with uint8 if I understand correctly. I will wait for you to be in front of a computer anyway. It's not urgent.

kbernloehr commented 1 year ago

(referring to both of Max's comments earlier ...: if mask not in ... and counting of bits)

No, that would not be correct. We had simulations in the past where a telescope could have multiple trigger types, for example to compare the thresholds achieved with majority versus analog sum etc. The digital majority is already documented but not implemented (that would be mask = 0b1000, if it is the only trigger type in the telescope). The only bits required to be zero are, currently, 4-7 and 16-31. All other bits can come in any combination.

maxnoe commented 1 year ago

Fixed in #261

maxnoe commented 1 year ago

We read only an 8 bits int while Konrad describes a 32 bit mask and the random trigger is at the 10th bit which we wouldn't read with uint8 if I understand correctly.

Yes, this is true, but not the reason why it failed. Even with the mono trigger, the trigger masked was still below 128, so the variable int array used to store the 32-bit value was actually stored using one byte.

However, the check below for the storing of individual trigger times per trigger type didn't work, as two bits were now set.

I fixed both, we now correctly read the variable length integer and only look at the first 4 bits for the individual trigger times.

kosack commented 1 year ago

Fixed in https://github.com/cta-observatory/pyeventio/issues/261

I think you mean #262

mdpunch commented 1 year ago

Thanks for all your efforts on this (it's my fault it's being tested now!)...

Question maybe to Kark @kosack: At what point do the different bits for the event classification in simtelarray get translated into the "A.4.3 Event type Definition." of the R1 Data Model?

As discussed elsewhere, these random mono triggers could be either trigger type == 2 (ACADA requested this event be captured) or probably better to use one of the unused trigger_type >3, for "special MC simulation events" or something (which would have the advantage that the subtypes, or event_type can then be used to "encode" some of the other bits.

Maybe this isn't the best forum to discuss this, but I don't know where is?

maxnoe commented 1 year ago

This is done in the ctapipe.io.SimTelEventSource here:

https://github.com/cta-observatory/ctapipe/blob/73a154591f4935084aa44887d8a86f735de9a3d5/ctapipe/io/simteleventsource.py#L923

But it seems we just always fill SUBARRAY for data events, and are not yet looking at these trigger bits.

So that's something to be added.