Not able to produce time calibration file with dead pixels

morcuended commented 3 years ago

In Feb 2021 module 155 was disabled, and data were taken with this configuration for the whole month, until the beginning of March, when all modules were operative again.

Command:

lstchain_data_create_time_calibration_file \
  --input-file=/fefs/aswg/data/real/R0/20210215/LST-1.1.Run03671.0000.fits.fz \
  --output-file=/fefs/aswg/data/real/running_analysis/20210215/v0.7.1/time_calibration.Run03671.0000.hdf5 \
  --pedestal-file=/fefs/aswg/data/real/running_analysis/20210215/v0.7.1/drs4_pedestal.Run03670.0000.fits \
  --config=/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/data/onsite_camera_calibration_param.json \
  --run-summary-path=/fefs/aswg/data/real/monitoring/RunSummary/RunSummary_20210215.ecsv

Output:

Input file: /fefs/aswg/data/real/R0/20210215/LST-1.1.Run03671.0000.fits.fz
Number of events in each subrun: 20000
list of files: ['/fefs/aswg/data/real/R0/20210215/LST-1.1.Run03671.0000.fits.fz']
File 1 out of 1
Processing: /fefs/aswg/data/real/R0/20210215/LST-1.1.Run03671.0000.fits.fz
No drive report specified, pointing info will not be filled
Traceback (most recent call last):
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/bin/lstchain_data_create_time_calibration_file", line 8, in <module>
    sys.exit(main())
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/scripts/lstchain_data_create_time_calibration_file.py", line 104, in main
    timeCorr.calibrate_peak_time(event)
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/calib/camera/time_correction_calculate.py", line 108, in calibrate_peak_time
    self.first_cap_array[nr_module, :, :] = self.get_first_capacitor(event, nr_module)
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/calib/camera/time_correction_calculate.py", line 247, in get_first_capacitor
    fc[high_gain, i] = first_cap[j]
IndexError: index 0 is out of bounds for axis 0 with size 0

@pawel21 @FrancaCassol could you have a look at it?

moralejo commented 3 years ago

Was this a file in which the module was marked as "bad", or one of those in which it was really taken out of the stream? I thought both cases had been tested.

morcuended commented 3 years ago

The module was marked as "bad" on the 15th of Feb. The above error appears after this date. For the previous days (from the 10th to the 14th), the error is different: it is the same one that has always been raised whenever it was not possible to produce the time calibration file.

For example for 2021-02-13

lstchain_data_create_time_calibration_file \
  --input-file=/fefs/aswg/data/real/R0/20210213/LST-1.1.Run03640.0000.fits.fz \
  --output-file=/fefs/aswg/data/real/running_analysis/20210213/v0.7.1/time_calibration.Run03640.0000.hdf5 \
  --pedestal-file=/fefs/aswg/data/real/running_analysis/20210213/v0.7.1/drs4_pedestal.Run03630.0000.fits \
  --config=/fefs/aswg/software/virtual_env/ctasoft/cta-lstchain/lstchain/data/onsite_camera_calibration_param.json \
  --run-summary-path=/fefs/aswg/data/real/monitoring/RunSummary/RunSummary_20210213.ecsv

the output is:

Input file: /fefs/aswg/data/real/R0/20210213/LST-1.1.Run03640.0000.fits.fz
Number of events in each subrun: 20000
list of files: ['/fefs/aswg/data/real/R0/20210213/LST-1.1.Run03640.0000.fits.fz']
File 1 out of 1
Processing: /fefs/aswg/data/real/R0/20210213/LST-1.1.Run03640.0000.fits.fz
No drive report specified, pointing info will not be filled
/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/numba/np/ufunc/gufunc.py:151: RuntimeWarning: invalid value encountered in extract_around_peak
  return self.ufunc(*args, **kwargs)
/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/numba/np/ufunc/gufunc.py:151: RuntimeWarning: invalid value encountered in extract_around_peak
  return self.ufunc(*args, **kwargs)
event id = 5000
event id = 10000
event id = 15000
event id = 20000
Traceback (most recent call last):
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/bin/lstchain_data_create_time_calibration_file", line 8, in <module>
    sys.exit(main())
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/scripts/lstchain_data_create_time_calibration_file.py", line 107, in main
    timeCorr.finalize()
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/calib/camera/time_correction_calculate.py", line 185, in finalize
    raise RuntimeError("Not enough events to coverage all capacitor. "
RuntimeError: Not enough events to coverage all capacitor. Please use more events to time calibration file.

moralejo commented 3 years ago

@FrancaCassol @pawel21 @maxnoe do you understand why this is failing? I thought this was one of the first things that was fixed when we started the changes towards the 0.7 release...

rlopezcoto commented 3 years ago

what was solved was the creation of the drs4 file with a missing module (#602), it seems that this is also present when creating a time calibration file...

FrancaCassol commented 3 years ago

yes, the code seems not compliant with a missing module. @pawel21, do you have time to have a look at that?
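
For illustration, here is a minimal sketch of how the `IndexError` in the first traceback can arise and how it could be guarded against. It assumes that the first-capacitor ids are stored as one flat sequence with 8 values per module and that a module removed from the stream simply shortens that sequence; the function name `first_capacitors_for_module` and the parameter `n_channels` are hypothetical and not part of lstchain.

```python
def first_capacitors_for_module(first_capacitor_ids, nr_module, n_channels=8):
    """Return the first-capacitor ids of one module, or None if they are absent.

    Assumption: ids are laid out module after module, n_channels per module.
    """
    start = nr_module * n_channels
    chunk = first_capacitor_ids[start:start + n_channels]
    # Slicing past the end does not raise: it silently returns a shorter
    # (possibly empty) sequence, and the error only shows up later at chunk[0].
    if len(chunk) < n_channels:
        return None
    return chunk


# Two modules present in the stream, a third one requested.
ids = list(range(16))
print(first_capacitors_for_module(ids, 1))  # ids of the second module
print(first_capacitors_for_module(ids, 2))  # None: module not in the stream
```

Returning `None` (or a masked placeholder) lets the caller skip the module instead of failing on the first event.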

pawel21 commented 3 years ago

> yes, the code seems not compliant with a missing module. @pawel21, do you have time to have a look at that?

Yes, I will look into it.

FrancaCassol commented 3 years ago

Wonderful. If we want to produce new time coefficients for each night, we must also add some check plots for the time corrections, which are missing at the moment.

morcuended commented 3 years ago

Related to this, I very often run into the following problem

Traceback (most recent call last):
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/bin/lstchain_data_create_time_calibration_file", line 8, in <module>
    sys.exit(main())
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/scripts/lstchain_data_create_time_calibration_file.py", line 107, in main
    timeCorr.finalize()
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/calib/camera/time_correction_calculate.py", line 185, in finalize
    raise RuntimeError("Not enough events to coverage all capacitor. "
RuntimeError: Not enough events to coverage all capacitor. Please use more events to time calibration file.

when trying to produce a time calibration file every night. Around 50% of the days in March present this problem. Here I use by default the 20k events set in the script, reading only one subrun. But from past experience, it does not improve when trying to read 20k events from each of the available subruns (~5). Are these 20k events not enough? Is there an underlying problem with the script not being able to identify the events properly? Is there any related hardware problem?

It would be useful to have some checks on this to better understand what is happening. Otherwise, I am in favour of fixing the time calibration file to be used for a certain period.

maxnoe commented 3 years ago

Time calibration is calculated on flat field events, correct?

These are tagged by the source via a heuristic looking at the R0 data after drs4 corrections.

Can you check how many flat field events are tagged in those files that fail? Maybe the values for the heuristic need to be tweaked for those runs.
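
A rough sketch of such a check is shown below. It only tallies per-event type tags; the `EventType` enum here is a hypothetical stand-in defined for the example, and how the tags are read from a given file depends on the event source and is not shown.

```python
from collections import Counter
from enum import Enum


class EventType(Enum):
    """Hypothetical stand-in for the event-type tags set by the event source."""
    FLATFIELD = "flatfield"
    PEDESTAL = "pedestal"
    SUBARRAY = "subarray"


def count_event_types(event_types):
    """Count how many events carry each type tag."""
    return Counter(event_types)


# Example with made-up tags; in practice they would be collected
# while looping over the events of the failing file.
tags = [
    EventType.FLATFIELD,
    EventType.SUBARRAY,
    EventType.FLATFIELD,
    EventType.PEDESTAL,
]
counts = count_event_types(tags)
print(counts[EventType.FLATFIELD], "flat-field events out of", len(tags))
```

A very low flat-field count in the failing files would point to the tagging heuristic rather than to the time calibration code.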

maxnoe commented 3 years ago

I improved the error message here: https://github.com/cta-observatory/cta-lstchain/pull/678

pawel21 commented 3 years ago

> when trying to produce a time calibration file every night. Around 50% of the days in March present this problem. Here I use by default the 20k events set in the script, reading only one subrun. But from past experience, it does not improve when trying to read 20k events from each of the available subruns (~5). Are these 20k events not enough? Is there an underlying problem with the script not being able to identify the events properly? Is there any related hardware problem?

This usually occurs when one or more pixels are "dead". Thanks @maxnoe for the improved error message.

> It would be useful to have some checks on this to better understand what is happening. Otherwise, I am in favour of fixing the time calibration file to be used for a certain period.

I will develop code to produce time calibration check plots.

morcuended commented 3 years ago

Just to show the output message after #678 when there are not enough events:

Input file: /fefs/aswg/data/real/R0/20210410/LST-1.1.Run04361.0000.fits.fz
Number of events in each subrun: 20000
list of files: ['/fefs/aswg/data/real/R0/20210410/LST-1.1.Run04361.0000.fits.fz']
File 1 out of 1
Processing: /fefs/aswg/data/real/R0/20210410/LST-1.1.Run04361.0000.fits.fz
No drive report specified, pointing info will not be filled
event id = 5000
event id = 10000
event id = 15000
event id = 20000
Traceback (most recent call last):
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/bin/lstchain_data_create_time_calibration_file", line 8, in <module>
    sys.exit(main())
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/scripts/lstchain_data_create_time_calibration_file.py", line 107, in main
    timeCorr.finalize()
  File "/fefs/aswg/software/virtual_env/anaconda3/envs/osa/lib/python3.7/site-packages/lstchain/calib/camera/time_correction_calculate.py", line 188, in finalize
    "No data available for some capacitors. "
RuntimeError: No data available for some capacitors. It might help to use more events to create the calibration file. Available: 99.947%, Missing: 254

maxnoe commented 3 years ago

254 is less than one pixel's worth of capacitors. So the problem is not switched-off pixels?

moralejo commented 3 years ago

But 99.947% of 1855 is 1854, which suggests that 1 pixel is wrong.

maxnoe commented 3 years ago

Ok, checking the code again, it combines 8 capacitors into one: 8 × 254 = 2032, which is just shy of the 1024 × 2 = 2048 capacitors per drs4 channel on each chip for each gain.

moralejo commented 3 years ago

I am not sure I understand, one pixel has 4096 capacitors per gain... Is it because the time calibration is the same for all four 1024-cell-long ring buffers of a given pixel-gain?

pawel21 commented 3 years ago

In Run04361, one pixel (id=1470) is "dead"
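
The numbers quoted above can be cross-checked with a short calculation. This is a sketch under stated assumptions: 1855 pixels, 2 gains, and 1024 capacitors per pixel and gain combined in groups of 8 (the 1024 and 8 are taken from the comments above); that the percentage in the error message is computed over all resulting bins is an assumption, not something confirmed in this thread.

```python
N_PIXELS = 1855
N_GAINS = 2
N_CAPACITORS = 1024  # capacitors per pixel and gain used by the time calibration
N_COMBINE = 8        # capacitors combined into one bin
MISSING = 254        # "Missing" value from the error message

bins_per_pixel_gain = N_CAPACITORS // N_COMBINE  # 128
bins_per_pixel = N_GAINS * bins_per_pixel_gain   # 256
total_bins = N_PIXELS * bins_per_pixel           # 474880

available_percent = 100 * (1 - MISSING / total_bins)
print(f"bins per pixel (both gains): {bins_per_pixel}")
print(f"available: {available_percent:.3f}%")  # 99.947%
```

Under these assumptions the 254 missing bins are just below the 256 bins of a single pixel, and the resulting percentage agrees with the 99.947% in the message, which is consistent with one dead pixel.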

moralejo commented 3 years ago

Hi, @morcuended just pointed out to me that this is still not fixed, and hence a single pixel makes the whole calibration fail, which in turn means that we have to use an old calibration file for all pixels. We have this every single day in the on-site analysis. This is ~~quite~~ bad**, and I honestly thought that this had been solved, or at least that we had agreed that this was not a desired behaviour and would be fixed a.s.a.p.

@pawel21 can you please take care of it? If so, let us know when you can have it ready.

** edited: not a big deal as long as no pixel cluster is replaced. But obviously we want to be able to time-calibrate drs4 even when one pixel is faulty.

pawel21 commented 3 years ago

Hi, I am sorry, I forgot about it. I will fix it this week.

morcuended commented 3 years ago

@FrancaCassol, @pawel21, @moralejo, after the recent module exchange, it has been possible to produce the time calibration file almost every day, with some exceptions. However, I think it would be convenient to still adapt the script so it is able to deal with off pixels.
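
One possible way to deal with off pixels is sketched below: instead of raising as soon as any capacitor bin is empty, split the pixels into those with full coverage and those without, calibrate the former and flag the latter. This is only an illustration using plain nested lists; the function name `split_pixels_by_coverage` and the `entries_per_bin[gain][pixel][bin]` layout are assumptions for the example, not the lstchain implementation.

```python
def split_pixels_by_coverage(entries_per_bin):
    """Split pixel ids into (usable, unusable).

    entries_per_bin[gain][pixel][bin] is the number of events that filled
    the given bin. A pixel is usable only if every bin of every gain has
    at least one entry.
    """
    n_pixels = len(entries_per_bin[0])
    usable, unusable = [], []
    for pixel in range(n_pixels):
        complete = all(
            all(n > 0 for n in gain_entries[pixel])
            for gain_entries in entries_per_bin
        )
        (usable if complete else unusable).append(pixel)
    return usable, unusable


# Toy example: 2 gains, 3 pixels, 4 bins; pixel 1 never fired in high gain.
entries = [
    [[5, 4, 6, 5], [0, 0, 0, 0], [3, 2, 4, 1]],  # high gain
    [[5, 4, 6, 5], [2, 1, 3, 2], [3, 2, 4, 1]],  # low gain
]
usable, unusable = split_pixels_by_coverage(entries)
print("usable:", usable)      # usable: [0, 2]
print("unusable:", unusable)  # unusable: [1]
```

With such a split, a single dead pixel would only cost its own coefficients, and the check could still raise if the number of unusable pixels exceeds some chosen threshold.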

moralejo commented 3 years ago

Hi @morcuended, thanks for the reminder. To be honest I did not remember this was still pending (probably the reassuring messages from @pawel21 back in April and May made us all relax about it ;)) @pawel21, did you look into this? It seems a rather simple modification of the code, right?

cta-observatory / cta-lstchain

Not able to produce time calibration file with dead pixels #673