cta-observatory / ctapipe

Low-level data processing pipeline software for CTAO or similar arrays of Imaging Atmospheric Cherenkov Telescopes
https://ctapipe.readthedocs.org
BSD 3-Clause "New" or "Revised" License
64 stars 268 forks source link

About zfit file reading #639

Closed FrancaCassol closed 5 years ago

FrancaCassol commented 6 years ago

Hi, I would like to read a NectarCam zfit file in ctapipe. I see there have been quiet long issues on this subject but it is hard to understand which is the present status (I don't see a test/example of the zfits_event_source module). Could please somebody summarizes which is the present status of the code? is it working? Which external libraries must be installed?

If necessary, I am ready to help in the development.

Cheers

kosack commented 6 years ago

I think the short answer is that it's not quite working, but not far from being ready. It's related directly to PR #598 and to the recent restructuring of the EventSource code that was implemented in #617

Right now there is no test code or way to validate zfits files, since we have no example data in this format, but we could include some if somebody has it. Do you know if the "zfits" files from for example FACT use the same format as for NectarCam, etc?

dneise commented 6 years ago

Hello Franca. I am currently working on #598 and am also a member of FACT. Indeed the zfits format in FACT is different from the one used in SST1M for example. I do not know what is used for NectarCam.

You can find a SST1M fits file reader for python 3.5 and 3.6 in https://github.com/cta-sst-1m/protozfitsreader

It is called protozfitsreader, since the SST1M zfits files contain google protobuffers. There are also 2 example files in that repository: https://github.com/cta-sst-1m/protozfitsreader/tree/master/protozfitsreader/tests/resources

If you would like to have a look at FACT zfits files for comparison or so, there are of course also a few events bundled with the fact analysis tool "fact-tools": https://github.com/fact-project/fact-tools/tree/master/src/test/resources

this one for example: https://github.com/fact-project/fact-tools/blob/master/src/test/resources/testDataFile.fits.fz really just a few events if I remember correctly.

If you have a few NectarCam event in zfits format for me to look at, I would of course be interested... maybe we can combine forces somehow.

calispac commented 6 years ago

@Etienne12345 might be of help here ! We just shorty discussed and apparently the "C++ part" of the reader can be used for NectarCam however the "python interface" might differ.

FrancaCassol commented 6 years ago

Thanks to all for the infos.

Dominik, indeed we could combine the forces. I am interested to read as soon as possible NectarCam data in ctapipe, but this work will be helpful also for a lot of other cameras that are using zfit format (LST for example).

You can find an example of NectarCAM data in https://www.cppm.in2p3.fr/~cassol/Run0374.fits.fz

I will have a look at your protozfitsreader. What is missing in order to have it working in catapipe? How do you think we should proceed?

Etienne12345 commented 6 years ago

Hello Franca,

indeed the python module "rawzfitsreader" should be usable as-is for NectarCam. All it does is read the protobuf structures from the zfits into python. To go all the way to ctapipe containers, you most probably must adapt digicam's ZFITS class, or write your own. As it sounds silly to duplicate efforts, I'd advise that we try and merge the two, possibly taking the ZFITS class from digicampipe to its own, separate repository. I am not sure that putting it as is into ctapipe is such a good idea as we need it for digicampipe. This being said, I am not sure that both digicampipe and ctapipe containers are identical. Are they ? I looked a little bit at your example files and I found the following:

dneise commented 6 years ago

I will have a look at your protozfitsreader.

To install:

    conda install numpy protobuf libgcc
    pip install https://github.com/cta-sst-1m/protozfitsreader/archive/v0.43.1.tar.gz

To try it out, I did (in ipython)

In [1]: from protozfitsreader import rawzfitsreader
In [2]: path = 'Downloads/Run0374.fits.fz'
In [3]: rawzfitsreader.listAllTables(path)
Attempting to list all tables from Downloads/Run0374.fits.fz
Out[3]: 'RunHeader:Events:'
In [4]: rawzfitsreader.open(path+':Events')
Out[4]: 0
In [5]: from protozfitsreader.L0_pb2 import CameraEvent
In [6]: e = CameraEvent()
In [7]: raw_event_string = rawzfitsreader.readEvent()
In [8]: e.ParseFromString(raw_event_string)
Out[8]: 17468
In [9]: # raw_event_string is really just the protobuffer, not interesting for humans
In [10]: raw_event_string[:10]
Out[10]: b'\x18\x01%\x01\x00\x00\x002\xf7B'
In [11]: # e  is now a google protobuffer. It has named attributes, one can acces just like this
In [12]: e.eventNumber
Out[12]: 1
In [13]: e.loGain.waveforms.num_samples
Out[13]: 60

The field "e.loGain.waveforms.samples" is an array ... have a look at it ... its string representation is very large. There is a little convenience function, which can be e.g. used like this:

from protozfitsreader import toNumPyArray
lo_gain_samples = toNumPyArray(
    e.loGain.waveforms.samples
).reshape(-1, e.loGain.waveforms.num_samples)

In [33]: lo_gain_samples.shape
Out[33]: (70, 60)

In [34]: lo_gain_samples.dtype
Out[34]: dtype('uint16')

with these commands you should be able to "play" with protozfitsreader. As Etienne already said, in digicam the files have no runheaders, so I have no experience with them.

FrancaCassol commented 6 years ago

Hi, in order to start, I followed the instructions of Dominik, but I have a problem in the import of the rawzfitsreader:


ImportError Traceback (most recent call last)

in () ----> 1 from protozfitsreader import rawzfitsreader /Users/franca/anaconda/lib/python3.6/site-packages/protozfitsreader/__init__.py in () 4 # import protozfitsreader 5 import numpy as np ----> 6 from . import rawzfitsreader 7 from . import L0_pb2 8 ImportError: cannot import name 'rawzfitsreader' ---------------------------------------------------------------------------- I am not really expert of python, but indeed I don't see the rawzfitsreader.py module in the directory https://github.com/cta-sst-1m/protozfitsreader/tree/master/protozfitsreader ... Thanks for any help Franca
Etienne12345 commented 6 years ago

It looks like you are on OSX, while the installer is configured for Linux. You might have to install the rawzfitsreader by hand (pip). However, there is only the version for python 3.5 available right now: http://www.isdc.unige.ch/~lyard/repo/ProtoZFitsReader-0.44.Python3.5.Darwin.x86_64.tar.gz

I must setup an anaconda environnement with python 3.6 before I can generate the package: it might take me a couple of days...

FrancaCassol commented 6 years ago

Thanks Etienne! Don't worry, I can wait for it (I will also try to move everything to Linux, but it is always practical to be able to test things in its own laptop)

Anyway as far I understood the rawzfitsreader python code is in https://forge.in2p3.fr/projects/ctaactl/repository/changes/CamerasToACTL/trunk/IO/Fits/Python/protozfitsreader.py

isn't it? In the meanwhile I can take sometime to study the code (including digicampipe and ctapipe containers )...

Etienne12345 commented 6 years ago

Hello Franca, generating for python 3.6 was easier than expected: http://www.isdc.unige.ch/~lyard/repo/ProtoZFitsReader-0.44.Python3.6.Darwin.x86_64.tar.gz

The file (protozfitsreader.py) that you pointed out was the very first version of the interface: it most probably does not comply to the ctapipe containers anymore. But indeed, that would be a good start to learn how things work. Long story short:

I'd guess that looking also at the current state of the digicampipe layer put on top of all that would be a good idea (https://github.com/cta-sst-1m/digicampipe/blob/master/digicampipe/io/zfits.py).

So right now we have 3 layers:

Maybe it would be worth to make everything common up to protozfitsreader.py, and keep zfits.py custom to the camera / package ? Ideally everything should be common, but I hardly see that happen in the near futur.

Cheers ! PS: I investigated the compression of your example file a little bit further, and I end up with a compression ratio of 2.81, i.e. file size = 606MB

dneise commented 6 years ago

I am not really expert of python, but indeed I don't see the rawzfitsreader.py module in the directory https://github.com/cta-sst-1m/protozfitsreader/tree/master/protozfitsreader ...

The import rawzfitsreader does not look for a rawzfitsreader.py. Not all import statements look for Python files. Also (compiled) C/C++ libraries can be imported, these files on linux have the extension ".so".

A well known example for compiled library, which should be installed on your system is math. For an example, which is not a compiled library, but a normal python file we use path :-D

If you repeat this on your system:

>>> import math
>>> math.__file__
'/home/dneise/anaconda3/lib/python3.6/lib-dynload/math.cpython-36m-x86_64-linux-gnu.so'
>>> import path
>>> path.__file__
'/home/dneise/anaconda3/lib/python3.6/site-packages/path.py'

You should see similar paths ... just I would assume, that in your case the math file is called: math.cpython-36m-x86_64-darwin-gnu.so or so... right?

dneise commented 6 years ago

Anyway as far I understood the rawzfitsreader python code is in https://forge.in2p3.fr/projects/ctaactl/repository/changes/CamerasToACTL/trunk/IO/Fits/Python/protozfitsreader.py

So here the answer would be: No.

The code for rawzfitsreader is (compiled) in the .so file with this long name. The source code for it is in the cta svn

protozfitsreader.py uses rawzfitsreader.

dneise commented 6 years ago

Etienne wrote:

Maybe it would be worth to make everything common up to protozfitsreader.py, and keep zfits.py custom to the camera / package ? Ideally everything should be common, but I hardly see that happen in the near future.

I see this very similar. In principle the protobuf object, coming out of the rawzfitsreader is already very nice, since it allows direct access via named attributes to the contents of each event.

Unfortunately the protobuf interface does not feel very "pythonic" and the array-like-contents are not (automagically / behind the scenes) converted into numpy arrays (which is usually what scientific pythonistas like to work with).

By "the interface feels unpythonic" I mean, that the "event-instance" is not really explorable by either "tab-comletion" or looking at its help. See here for example, how the event object reacts on "tab-completion-exploration":

2018-02-08-133631_1920x1080_scrot

It clearly shows a lot of attributes and useful methods, but naively one would expect maybe a list of actual event properties, like "adc_samples", "event_id" .. or so. Unless the event is expected to be "dict-like". Python dics to not allow for direct dot-access to its contents, and so dicts are typically explored via their keys() method. Many custom classes, which are not dicts, but somehow contain stuff, have used the same interface as dicts have, for element access, like the pandas.Dataframe, or the numpy.recarray. However Google has decided to not provide a dict-like interface.

Trying help() leads to a Seg-Fault, which is maybe not Googles fault by a minor problem on our side somewhere ... I admit, I never looked at the help of an event before now:

In [1]: from protozfitsreader import rawzfitsreader
In [2]: from protozfitsreader.L0_pb2 import CameraEvent
In [3]: rawzfitsreader.open('Run0374.fits.fz:Events')
Out[3]: 0
In [4]: e = CameraEvent()
In [5]: e.ParseFromString(rawzfitsreader.readEvent())
Out[5]: 17468
In [6]: help(e)
Segmentation fault (core dumped)

The google protobuf instance has a pretty exhaustive string representation, which one gets printed by simply printing the event or str(event) or so ... but it is sometimes a little too large to really enjoy it.


A way to find out all the names of the event attributes I found was this:

In [9]: list(e.DESCRIPTOR.fields_by_name)

It can be used to get a feeling for what is stored in the event, it could be used like this, to get a feeling about stored types and values:

In [42]: for name in list(e.DESCRIPTOR.fields_by_name):
    ...:     value = getattr(e, name)
    ...:     type_name = type(value).__name__
    ...:     print(
    ...:         '{name:<26s}:{type: <20s}:{value!r}'.format(
    ...:             name=name, 
    ...:             type=type_name, 
    ...:             value=str(value)[:40]
    ...:         )
    ...:     )
telescopeID               :int                 :'-1'
dateMJD                   :float               :'0.0'
eventType                 :int                 :'1'
eventNumber               :int                 :'1'
arrayEvtNum               :int                 :'0'
hiGain                    :PixelsChannel       :'waveforms {\n  samples {\n    type: U16\n  '
loGain                    :PixelsChannel       :'waveforms {\n  samples {\n    type: U16\n  '
trig                      :CameraTrigger       :''
head                      :CameraRunHeader     :''
muon                      :CircleParams        :''
geometry                  :CameraGeometry      :''
hilo_offset               :int                 :'0'
hilo_scale                :int                 :'0'
cameraCounters            :CameraCounters      :'counters {\n  type: U16\n  data: "\\001\\000'
moduleStatus              :ModuleStatus        :'status {\n  type: U8\n  data: "\\001\\001\\00'
pixelPresence             :PixelPresence       :'presence {\n  type: U32\n  data: "\\000\\000'
acquisitionMode           :int                 :'61158'
uctsDataPresence          :bool                :'False'
uctsData                  :UCTSdata            :'data {\n  type: U8\n  data: "\\000\\000\\000\\'
tibDataPresence           :bool                :'True'
tibData                   :TIBdata             :'data {\n  type: U8\n  data: "\\001\\000\\000\\'
swatDataPresence          :bool                :'False'
swatData                  :SWATdata            :'data {\n  type: U8\n  data: "\\000\\000\\000\\'
chipsFlags                :ChipsFlags          :''
firstCapacitorIds         :FirstCapacitorIds   :''
drsTagsHiGain             :DrsTags             :''
drsTagsLoGain             :DrsTags             :''
local_time_nanosec        :int                 :'0'
local_time_sec            :int                 :'0'
pixels_flags              :AnyArray            :''
trigger_map               :AnyArray            :''
event_type                :int                 :'0'
trigger_input_traces      :AnyArray            :''
trigger_output_patch7     :AnyArray            :''
trigger_output_patch19    :AnyArray            :''
trigger_output_muon       :AnyArray            :''
gps_status                :bool                :'False'
time_utc                  :AnyArray            :''
time_ns                   :int                 :'0'
time_s                    :int                 :'0'
flags                     :int                 :'0'
ssc                       :int                 :'0'
pkt_len                   :int                 :'0'
muon_tag                  :bool                :'False'
trpdm                     :AnyArray            :''
pdmdt                     :AnyArray            :''
pdmt                      :AnyArray            :''
daqtime                   :AnyArray            :''
ptm                       :AnyArray            :''
trpxlid                   :AnyArray            :''
pdmdac                    :AnyArray            :''
pdmpc                     :AnyArray            :''
pdmhi                     :AnyArray            :''
pdmlo                     :AnyArray            :''
daqmode                   :AnyArray            :''
varsamp                   :AnyArray            :''
pdmsum                    :AnyArray            :''
pdmsumsq                  :AnyArray            :''
pulser                    :float               :'0.0'
ftimeoffset               :AnyArray            :''
ftimestamp                :AnyArray            :''
num_gains                 :int                 :'0'

I have chosen to print the values string representation up to 40-characters, to not flood the output, since some of the string reprs are very long. However, we already see. There are a lot of things of type "AnyArray", and these can nicely be converted into numpy arrays. All the primitive types like float, int and bool do not have to be converted at all. And then there fields with a custom types like e.g. hiGain of type PixelsChannel. These custom types to my understanding, can again be explored using their DESCRIPTOR property.

So an event can be seen (in Python lingo) as a dict which can contain, primitives, or numpy-arrays, or sub-dicts.

So I fully agree with Etienne, that the conversion from googles protobuf to something, that contains exactly the same data, but is just a little bit "easier" to handle than a protobuf, could (and should) be done in a common place for everybody.

Then conversion from this common interface to ctapipe-containers should be a) quite easy and might b) be telescope specific, so this can be separated.

dneise commented 6 years ago

Ah another thing, I should mention: If one looks at the string representation of the event, (which is so large it is flodding the screen), one can see for example this:

pixelPresence {
  presence {
    type: U32
    data: "\000\000\000\000\000\000\000\000\000\000\000\000"
  }
}
acquisitionMode: DAQCHARGESAMPLE
uctsDataPresence: false
uctsData {
  data {
    type: U8
    data: "\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
  }
}

So acquisitionMode: DAQCHARGESAMPLE, but in my exploration one comment above I got:

acquisitionMode           :int                 :'61158'

So it appears there is somewhere in the protobuf object a mapping (maybe a enum-like) from the integer: 61158 to the string "DAQCHARGESAMPLE".

However, it is not used, when I retrieve the value from the event, it is only used inside the string representation.

See again:

In [44]: e.acquisitionMode
Out[44]: 61158

In [45]: type(e.acquisitionMode)
Out[45]: int

as opposed to the string representation:

In [49]: [line for line in str(e).splitlines() if line.startswith('acquisitionMode')]
Out[49]: ['acquisitionMode: DAQCHARGESAMPLE']

This is of course a pity. Users are maybe not so much interested in the value 61158 but rather would like to have the enum value DAQCHARGESAMPLE.

Etienne12345 commented 6 years ago

That's most probably because protobufs do understand enum types, and enums are stored as integers internally. So the help() makes uses of this and substitutes the enum value, while the "raw dump" of the field still produces the integer value.

dneise commented 6 years ago

Yes exactly

dneise commented 6 years ago

Python also knows enums ... https://docs.python.org/3/library/enum.html just the values are not "correctly" converted into enums. when handed to the user. And that is a pity.

I have not studies the Python-google-protobuf docu well enough yet to see what google proposes here.

Etienne12345 commented 6 years ago

Well right now, the protobufzfits.py simply moves the protobuf arrays to numpy arrays. As the data model is well known by this module, we could imagine to instead build a "true" python object from the protobuf structures ? Then this "true" python object would be the one used by the custom layers to populate the containers.

dneise commented 6 years ago

Yes, I think, this is exactly what I was trying to say ... I am not sure though ... this "protobufzfits.py" I have not seen so far.

FrancaCassol commented 6 years ago

Hi Dominik and Etienne, thanks for all these very useful posts. I will have a deeper look at all this next week. If you agree we could then plan a skype meeting so to be sure to move in the correct direction.

kosack commented 6 years ago

Just a small comment - there is an effort going on to unify the output of all cameras, and thus the current protobuf description may be simplified in the near future, so I wouldn't worry too much about its current complexity.

So I fully agree with Etienne, that the conversion from googles protobuf to something, that contains exactly the same data, but is just a little bit "easier" to handle than a protobuf, could (and should) be done in a common place for everybody

That's the exact reason for having the Container classes in ctapipe - to abstract the data from the underlying data format, and have it common for all input types. We wanted to avoid having the pipeline tied to a specific data format or underlying structure.

JulienHoules commented 6 years ago

Hi,

this is the explanation for the weird acquisition mode (needs to be changed for the output of camera server) :

define CONF_FIXED_CAMERAINTERFACE__DAQCHARGE_FLAG 0x0000EEE0

define CONF_FIXED_CAMERAINTERFACE__DAQCHARGE_T_FLAG 0x0000EEE1

// Sample mode

define CONF_FIXED_CAMERAINTERFACE__DAQSAMPLE_FLAG 0x0000EEE3

define CONF_FIXED_CAMERAINTERFACE__DAQSAMPLE_C_FLAG 0x0000EEE4

define CONF_FIXED_CAMERAINTERFACE__DAQSAMPLE_D_FLAG 0x0000EEE5

// Dual mode

define CONF_FIXED_CAMERAINTERFACE__DAQCHARGESAMPLE_FLAG 0x0000EEE6

typedef enum { E_ACQ_MODEDAQCHARGE = (tUInt32)CONF_FIXED_CAMERAINTERFACEDAQCHARGE_FLAG, E_ACQ_MODEDAQCHARGE_T = (tUInt32)CONF_FIXED_CAMERAINTERFACEDAQCHARGE_T_FLAG, E_ACQ_MODEDAQSAMPLE = (tUInt32)CONF_FIXED_CAMERAINTERFACE__DAQSAMPLE_FLAG, E_ACQ_MODEDAQSAMPLE_C = (tUInt32)CONF_FIXED_CAMERAINTERFACEDAQSAMPLE_C_FLAG, E_ACQ_MODEDAQSAMPLE_D = (tUInt32)CONF_FIXED_CAMERAINTERFACEDAQSAMPLE_D_FLAG, E_ACQ_MODEDAQCHARGESAMPLE = (tUInt32)CONF_FIXED_CAMERAINTERFACE__DAQCHARGESAMPLE_FLAG } tAcqMode;

dneise commented 6 years ago

Hello @FrancaCassol we have just released v0.44.1 of the protozfitsreader. Maybe yo want to have a look at that:

https://github.com/cta-sst-1m/protozfitsreader

https://github.com/cta-sst-1m/protozfitsreader/releases

FrancaCassol commented 6 years ago

Hi Dominik, thanks! After some days spent on other duties, this week I am back on the topic, let see how far I can go ;-)

sizun commented 6 years ago

Hello,

For information, Stephen Fegan wrote a high level python API for reading NectarCAM ZFITS files. See for example https://github.com/llr-cta/calin/blob/02fa9d5496b67f6c54dfb2356cf4d3238f7100ec/examples/iact_data/event%20viewer.ipynb I am not sure it can help in integrating NectarCAM data into ctapipe but I thought I would mention it. It is a wrapped around the CamerasToACTL library and also uses protobuf (3).

Cheers, Patrick

dneise commented 6 years ago

Thanks @sizun. That is really interesting for me. It seems more and more people spend time solving very similar problems. This in my opinion underlines again nicely that CTA scientists during prototyping and comissioning of the telescopes indeed have very similar needs. I personally believe ctapipe may be a good place to combine such efforts and identify useful interfaces for common solutions.

Etienne12345 commented 6 years ago

Thanks for the tip Patrick ! I had no idea that Stephen had also interfaced to python (I thought he was working with C++ only). Anyways, after browsing a little bit the repo, here are the differences I found between the two implementations:

kosack commented 6 years ago

As long as zfits remains yet another prototype, I am not sure that we should produce "the best possible interface". Sounds like wasted resources to me if in the end another file format is selected. Any opinion

I agree with that - right now something basic that works is all that is needed, and we shouldn't put too much effort in until we know the final data format. Nevertheless, if the camera groups agree on using protobufs for R1.TEL.EVT data, any code we write to support their conversion to DL0.TEL.EVT can probably be reused, even if the file format changes.

I was unaware of calin, actually, but it seems to do a lot more than just read data and is more of an analysis framework for the NectarCam testbench (there are similar codes from other cameras, though a few use ctapipe underneath), so it may be overkill for this purpose (unless we extract only the part that does the deserialization).

pawel21 commented 6 years ago

Hi, I would like read fits.fz file by use ctapipe. According documentiation I tried:

from ctapipe.io.eventsource import EventSource    

event_source = EventSource(input_url='Run028.000.fits.fz')

I got error: : Can't instantiate abstract class EventSource with abstract methods _generator, is_compatible.

Recently, I saw new class SST1MEventSource in ctapipe.io, but I count't find example how use it. I would appreciate it if somebody give me example how read fits by use ctapipe.

Cheers

kosack commented 6 years ago

I would appreciate it if somebody give me example how read fits by use ctapipe.

This is still a very new feature, and not part of any stable release, so I can't guarantee it will work for any given .fz file, but normally the quickest way to construct an event source is either via:

from ctapipe.io import event_source

for event in event_source(input_url='Run028.000.fits.fz'):
    # so something with events

or by constructing an EventSourceFactory and calling it's produce() method to get an EventSource

To get compressed fits working, however, you need to install several other libraries.

Etienne12345 commented 6 years ago

I am not sure at all that this currently works in ctapipe: the latest zfits reader was only tested with digicampipe. I am not aware of someone who ported this part of the code from digicampipe to ctapipe. Anyone ?

dneise commented 6 years ago

I discussed this privately with @pawel21 and Julain Sitarek a little bit. So just to complete this here...

There is a protozfits PR #20 which I created after I grew aware that not only digicam is using these fits.fz files. It is not yet merged, so still subject to changes ... but anyway there is a README which should always reflect the latest interface.

This might also be interesting for @FrancaCassol and maybe also for SST1M/digicam people like @calispac


Now, I think here arises an important question, which touches the realms of all telescopes, as well as the expertise of both @Etienne12345 and @kosack. The story goes like this:

The protozfits.SimpleFile does not emit ctapipe.Containers but collections.namedtuples. protozfits is a dependency of ctapipe, so importing from ctapipe.core would create some circular dependency.

The protozfits.SimpleFile interface I propose in the PR mentioned above, does not apply any mapping, but (hopefully) transparently reflects the exact same interface as is defined in the L0.proto file one can find in the svn.

Now ... I have the impression more and more people would like to read their "R0" level (highly camera specific) data from fits.fz files into ctapipe. And since inside ctapipe, data is stored in ctapipe.core.Containers people of course ask how to "convert" their data into ctapipe containers.

I think there are two obvious and extreme ways here.

1. follow pre-existing R0CameraContainer structure.

In ctapipe.io.containers the R0CameraContainer is defined. So one way would be to fill these pre-existing 4 fields: trigger_time, trigger_type, waveform, num_samples and drop information, which does not fit into any field.

2. Inherit naming scheme from L0.proto

We could convert transparently and thus avoid mental mapping. So what is named event.hiGain.waveforms.samples in L0.proto, could be named data_container.r0.tel[i].hiGain.waveforms.samples in ctapipe.

In addition to these two options, I think, there is a basically unlimited number of hybrids, like e.g. this solution here, where a few custom fields were added, but not only added, they were also ever so slightly renamed. https://github.com/cta-observatory/ctapipe/blob/a00df3b79b9c831a90f5982f1036598b26cc633a/ctapipe/io/containers.py#L34-L47

My personal opinion is that any hybrid should be avoided, since is combines the downsides of both extreme solutions. Neither does the newly created SST1MCameraContainer feel natural for ctapipe developers, nor does it feel natural to SST1M camera specialists. Both groups of people have to learn the field names and both groups do not get what they expect. For ctapipe people, the container contains "too much" .. and for SSt1M people it contains "not enough".

I would be very much interested in learning the grand plan here. Independent from any technical implementation ... just the plan.

dneise commented 6 years ago

I was just having another look at different protobuf-fits.fz files and realized that they I was not able to find out which telescope wrote that file, just by looking at its contents.

I mean, they obviously differ, NectarCam files container hi- and loGain, while digicam files contain only hiGain. But this is not an obivious telescope-type marker, like an entry in the Runheader which contains a string (or an enum). Yesterday night I tried to write a general eventsource for NectarCAM, LST and digigcam, but I already got stuck at the beginning, when I could not really find out which is which.

Also when we implement simply 3 different event sources for LST, NectarCAM and digicam, the problem is just moved to the is_compatible function which will not know how to decide if the NectarCAM reader can read a LST file or not.... the current implementation of the SST1M_EventSource would answer wrongly: "Yes, I can read a NectarCAM file".

Etienne12345 commented 6 years ago

So far raw data has been written without considering this. There can indeed be run headers in the files, but I do not believe that any of the teams using zfits used it yet. Mostly because they did not worry about other cameras. From monte-carlo the reader in camerastoactl doesn't put a run header either. This can of course be fixed as there are provisions for such a feature in the writing code. Question is: what do we put, where and how. For the writing of raw data, we can either add an argument in the writer, or let the camera server announce itself (both are provisionned). For reading monte-carlo, did you attempt to load data from simtel files ? If so, from which tool ?

FrancaCassol commented 6 years ago

I find great the idea to write a unique eventsource for NectarCAM, LST and digicam!! I am right now studying the SST1Mevetnsource in order to use it as example for a NectarCam eventsource, but it would be much more elegant to have only one eventsource for all the cameras that employ the zfit format (taking also into account that the NectarCam and LST event builders ere written by the same person).

I also support the point raised by Dominik about the naming scheme, should we not take the occasion to fix a unique naming scheme? Whatever it is ...

dneise commented 6 years ago

For reading monte-carlo, did you attempt to load data from simtel files ? If so, from which tool ?

No I personally have never tried this yet. I would not even know where to find these...

dneise commented 6 years ago

It seems, one simple way for the event source to distinguish between NectarCam, digicam and LST files today would be to look at the "TTYPE" names in the header of the "Events" bintable, which all of them have.

For example:

I found this out based on very few example files, so I might be mistaken, but it seems possible today ... and we can improve when ever decisions about the run-headers or so have been made.

@Etienne12345 would you agree that this is a valid method to distinguish these files-types at the moment, or do you think this will simply break in the real world, since there are many different files with different TTYPEs out there?

For example here https://github.com/cta-observatory/ctapipe/pull/668/commits/4e0149e2bad734b9a5e5d446daa40c2a7ca1a58a I put this kind of test into the sst1m event source, to not give a false positive on is_compatible when using it with a non-sst1m-file ...

of course a unit test is still needed for this feature now .. but I am now sure we have example files to perform the tests on....

Etienne12345 commented 6 years ago

Hi Dominik,

You probably do not need to deal with anything FITS-related: you get the protocol buffer structures in the protozfitsreader, and can do the tests directly on these, no ?

Regarding the fields, I happen to sit right next to Dirk right now who confirmed me that integrals are always outputted by nectar and that lst always have drs data.

For digicam I an double checking right now if the baselines are always outputted. I'll get back to you when Matthieu replies.

Another option would be to use the number of pixels until zero suppression is enabled, but I remember that nectar is probably not taking data with the full set of pixels right now.

dneise commented 6 years ago

You probably do not need to deal with anything FITS-related: you get the protocol buffer structures in the protozfitsreader, and can do the tests directly on these, no ?

I personally find it easier to ask the fits tables, for their TTYPEs, than asking the protocol buffer. This is because the protocol buffer always has all its fields. They are just filled with a default value, when not explicitly filled with data from the bintable. So for me personally this question:

But I guess both is possible.


Using the number of pixels sounds also very good. However I was unable to find a reliable way of finding that out. There seems to be no field containing the number of pixels. Alternatively one can use the waveforms and divide their lengths by the number of samples, but not all telescopes always set the number of samples:

Here I look at event.hiGain.waveforms.samples and event.hiGain.waveforms.num_samples for the 3 example files I have:

LST     len(samples) 5320 num_samples: 40
Nectar  len(samples) 4200 num_samples: 60
digicam len(samples) 64800 num_samples: 0

You see? I could find out the number of pixels for 2 of them, but for digigcam I would have to say: when num_samples it is digicam ...

dneise commented 6 years ago

I think this issue here became a very long discussion. I know @FrancaCassol is currently working on a NectarCam Event Source in her fork and might soon-ish open a first PR for this....

I think this issue could be closed amd be replaced by specific questions once they arise ...

What do you think Franca?

FrancaCassol commented 6 years ago

I agree Dominik, I will very soon open a PR for a preliminary nectarCAM reader. This very useful issue can be closed