Segment-related error from pycbc_single_template

titodalcanton commented 7 years ago

Tom's run with the latest hyperbank [1] failed on pycbc_single_template followups for some IMBH injections. Here's one of the failing commands:

cd /home/tdent/ER10/analysis/banks_background/er10_hyperbank_0p6fp/output
pycbc_single_template  --segment-length 512 --segment-start-pad 144 --segment-end-pad 16 --psd-estimation median --psd-segment-length 16 --psd-segment-stride 8 --psd-inverse-length 16 --chisq-bins "0.6*get_freq('fSEOBNRv4Peak',params.mass1,params.mass2,params.spin1z,params.spin2z)**(2./3.)" --low-frequency-cutoff 20. --approximant 'SPAtmplt:mtotal<4' 'SEOBNRv4_ROM:else' --order -1 --processing-scheme mkl --window 10 --psd-num-segments 63 --taper-data 1 --allow-zero-padding  --autogating-threshold 100 --autogating-cluster 0.5 --autogating-width 0.25 --autogating-taper 0.25 --autogating-pad 16 --strain-high-pass 15. --sample-rate 2048 --pad-data 8 --injection-scale-factor 1000000 --frame-type H1_HOFT_C00 --channel-name H1:GDS-CALIB_STRAIN --use-params-of-closest-injection --trigger-time 1163059633.716048 --inspiral-segments  results/1._analysis_time/1.01_segment_data/H1L1-INSP_SEGMENTS-1161961217-1242000.xml  --injection-file  inj_files/H1L1-STRIP_INJECTIONS_IMBHB_INJ-1161961217-1242000.xml  --data-read-name DATA_ANALYSED --data-analyzed-name TRIGGERS_GENERATED --output-file  ~/H1-SINGLE_TEMPLATE_P1_INJ_PARAMS_NOINJ_5-1161961217-1242000.hdf --verbose

And the error:

Traceback (most recent call last):
  File "/home/tito/sw_install/pycbc_latest//bin/pycbc_single_template", line 4, in <module>
    __import__('pkg_resources').run_script('PyCBC==e2c46f', 'pycbc_single_template')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 534, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1438, in run_script
    execfile(script_filename, namespace, namespace)
  File "/home/tito/sw_install/pycbc_latest/lib/python2.7/site-packages/PyCBC-e2c46f-py2.7.egg/EGG-INFO/scripts/pycbc_single_template", line 203, in <module>
    strain_segments = strain.StrainSegments.from_cli(opt, gwstrain)
  File "/home/tito/sw_install/pycbc_latest/lib/python2.7/site-packages/PyCBC-e2c46f-py2.7.egg/pycbc/strain.py", line 962, in from_cli
    allow_zero_padding=opt.allow_zero_padding)
  File "/home/tito/sw_install/pycbc_latest/lib/python2.7/site-packages/PyCBC-e2c46f-py2.7.egg/pycbc/strain.py", line 827, in __init__
    raise ValueError(err_msg)
ValueError: Trigger end time must be within analysable window. Asked to end at 1163059832 but can only analyse to 1163059831.

The inspiral segments file contains two DATA_ANALYSED segments compatible with the --trigger-time, differing only by one second at the end: (1163059103.0, 1163059839.0) and (1163059103.0, 1163059840.0). In pycbc_single_template, select_segments() chooses the shorter one, which then leads to the error when trying to segment the data.

I'm not sure whether this is a bug in the inspiral segments or in select_segments() (or neither), but I'm opening this to document what I found.

[1] https://www.atlas.aei.uni-hannover.de/~tdent/LSC/er10/tuning/preh1_vlfc_thorne_0p6fp/5._injections/5.14_followup_of_missed_IMBHB_INJ/

titodalcanton commented 7 years ago

I can get pycbc_single_template to work with the following patch, which takes into account the padding in selecting the right data segment:

@@ -35,7 +35,7 @@ def subtract_template(stilde, template, snr, trigger_time, flow):
         stilde -= pycbc.waveform.utils.apply_fseries_time_shift(inverse, dt)
     return stilde

-def select_segments(fname, anal_name, data_name, ifo, time):
+def select_segments(fname, anal_name, data_name, ifo, time, pad_data):
     anal_segs = events.select_segments_by_definer(fname, anal_name, ifo)
     data_segs = events.select_segments_by_definer(fname, data_name, ifo)

@@ -69,7 +69,7 @@ def select_segments(fname, anal_name, data_name, ifo, time):
         # largest overlap with anal_time
         overlap = None
         for start, end in zip(s2, e2):
-            if start > anal_time[0] or end < anal_time[1]:
+            if start + pad_data > anal_time[0] or end - pad_data < anal_time[1]:
                 # The analysis time must lie within the data time, otherwise
                 # this clearly is not the right data segment!
                 continue
@@ -181,7 +181,7 @@ if opt.inspiral_segments:
     # trig start/end times
     anal_seg, data_seg = select_segments(opt.inspiral_segments,
                                     opt.data_analyzed_name, opt.data_read_name,
-                                    ifo, opt.trigger_time)
+                                    ifo, opt.trigger_time, opt.pad_data)
     opt.trig_start_time = anal_seg[0]
     opt.trig_end_time = anal_seg[1]
     opt.gps_start_time = data_seg[0] + opt.pad_data

Ian, is this the correct approach or is there another way this error should be addressed?

spxiwh commented 7 years ago

Thanks @titodalcanton. I thought I had tested edge cases here (and even edited the segment module to ensure that all segments get parsed here!), but it seems I've missed something.

What you say makes sense, but I need some more time to look over and check this is the correct issue. Not sure I'll get that done today though as I'm flying back to Europe later.

spxiwh commented 7 years ago

As you seem to have a solution can you move this to a pull request?

titodalcanton commented 7 years ago

Ok, one more data point. There are the two inspiral jobs producing the almost-identical data segments, and they have the following params:

--gps-start-time 1163059111
--gps-end-time 1163059831
--trig-start-time 1163059111
--trig-end-time 1163059471

and

--gps-start-time 1163059111
--gps-end-time 1163059832
--trig-start-time 1163059471
--trig-end-time 1163059832

These look sane to me, so it seems the issue is purely in how select_segments() is deciding how to handle the tricksy case, and I'm going to make a PR with my patch.

duncan-brown commented 7 years ago

@titodalcanton please tag it as 1.6.1 milestone and I can get a new release made.

spxiwh commented 7 years ago

Wait. These inputs make no sense. You say we have two data read entries : (1163059103.0, 1163059839.0) and (1163059103.0, 1163059840.0). Why are they like this? If the second job is reading all necessary data, there shouldn't be two jobs here at all.

spxiwh commented 7 years ago

I tried to comment out my reasons for the choices being made here:

https://github.com/ligo-cbc/pycbc/blob/master/bin/pycbc_single_template#L38

Not really sure adding pad-data here is needed. Data read in should already include padding (regardless of what arguments actually go to pycbc_inspiral) and data analysed should be accurate.

So I do not think padding is the issue.

titodalcanton commented 7 years ago

So is the issue instead that we have two inspiral jobs where one would be sufficient? That was my other guess...

spxiwh commented 7 years ago

If the lock goes between 1163059103.0 and 1163059840.0 (737 seconds) this should have been segmented as two inspiral jobs. However, each job should have read in 512 + (pad-data) * 2 seconds. What are the DATA_ANALYSED segments for this case, and can you confirm that this is the span of the lock?

The change you made does make sense though, and I'll approve it, but the existing code should have worked.

titodalcanton commented 7 years ago

From the daily page https://ldas-jobs.ligo-wa.caltech.edu/~detchar/summary/day/20161113/ I see that indeed there is such a segment of H1:DMT-ANALYSIS_READY:1,

1   1163059103.0    1163059840.0    737.0

From Tom's H1L1-INSP_SEGMENTS-1161961217-1242000.xml I see

segment_definer:segment_def_id:2 1163059103 1163059839
segment_definer:segment_def_id:2 1163059103 1163059840

where seg def 2 corresponds to H1's DATA_ANALYSED.

titodalcanton commented 7 years ago

Thanks for the zeal github, but reopening as this needs further investigation.

spxiwh commented 7 years ago

I've been following this up now, and there is an underlying problem here. The symptom is that PyCBC is only analysing segments longer than 720s and not 520s as expected. The issue is in the functions determining valid_time for the pycbc_inspiral executable. Need to explore this a bit more before proposing a solution. This wasn't noticed in testing because previously we used a 256s segment length. The fix in #1314 was still needed, that edge case was always possible (it happened if the lock length was 1s longer than the minimum length).

duncan-brown commented 7 years ago

@spxiwh I will go ahead and tag 1.6.1 with @titodalcanton's fix and then we can release 1.6.2 once you have fixed the underlying issue.

spxiwh commented 7 years ago

@duncan-brown Okay, but we should fix this before starting O2 analyses (and be aware of that!). I'll try and get a solution today or tomorrow.

duncan-brown commented 7 years ago

@spxiwh noted: https://github.com/ligo-cbc/pycbc/releases/tag/v1.6.1

gwastro / pycbc

Segment-related error from pycbc_single_template #1311