sxs-collaboration / sxs

Python code for manipulating data from the SXS collaboration
MIT License

LVC conversion: better tolerances #8

Closed: moble closed this issue 4 years ago

moble commented 4 years ago

@geoffrey4444 wants to convert our entire catalog to LVC format. The main problem is that the conversion process takes on the order of hours per waveform, so without any improvements, we would expect converting the whole catalog to take on the order of a year on a single computer.

After looking through the data, it's become clear that the real problem here is the phase — more specifically, the use of phase as one of the data series while requiring the same tolerance on the amplitude and phase. The result is that splines of the phase in the output LVC format

  1. take hundreds to thousands of times longer to construct than they need to, and
  2. occupy tens to hundreds of times more space than they need to

for the resulting waveform to have the desired accuracy.

Each mode of the input waveform is first decomposed into amp and (unwrapped) phase, and then passed to romspline.ReducedOrderSpline, which builds a spline that should be able to reproduce the input data to within an L-infinity tolerance of 1e-6. Note that this tolerance is absolute, not relative.
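As a minimal sketch of that first step, assuming a mode is available as a complex NumPy array, the decomposition might look like the following. The helper name `decompose_mode` and the toy data are illustrative only; they are not taken from the sxs or romspline code, and the spline construction itself is not shown.

```python
import numpy as np

def decompose_mode(h):
    """Split a complex mode h(t) into amplitude and unwrapped phase.

    Hypothetical helper for illustration; not the function used in sxs.
    """
    amp = np.abs(h)
    # np.angle returns values in (-pi, pi]; np.unwrap removes the 2*pi jumps
    phase = np.unwrap(np.angle(h))
    return amp, phase

# Toy stand-in for one waveform mode: slowly growing amplitude, chirping phase
t = np.linspace(0.0, 100.0, 20001)
h = 0.1 * (1.0 + 0.001 * t) * np.exp(-1j * (0.5 * t + 0.002 * t**2))
amp, phase = decompose_mode(h)
```

Even in this short toy example the unwrapped phase grows to tens of radians while the amplitude stays near 0.1, which is the mismatch in scale discussed next.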

Presumably, the goal here is to produce a waveform for consumption by, e.g., LAL that is accurate to 1e-6. This is basically achieved with amp. The amp will typically be no larger than 0.1 (or orders of magnitude less for higher-order modes) except very near merger, meaning the relative tolerance is more like 1e-5 or less restrictive — but the absolute error will indeed be 1e-6.

Meanwhile, the phase will generally reach tens to tens of thousands of radians (depending on junk and/or precession), meaning that the relative tolerance will be 1e-7 or more restrictive. This is why the phase splines perform so much worse than the amp splines.
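The arithmetic behind those relative tolerances can be spelled out in a few lines. This is only a sketch: the function name `relative_tolerance` is made up for illustration, and the magnitudes are the representative values quoted above.

```python
abs_tol = 1e-6  # absolute L-infinity tolerance quoted above

def relative_tolerance(magnitude, abs_tol=abs_tol):
    """Relative tolerance implied by an absolute tolerance at a given magnitude."""
    return abs_tol / magnitude

amp_rel = relative_tolerance(0.1)          # amplitude of about 0.1
phase_rel_small = relative_tolerance(10.0)   # phase of tens of radians
phase_rel_large = relative_tolerance(1.0e4)  # phase of tens of thousands of radians
```

The same absolute tolerance is therefore several orders of magnitude stricter, in relative terms, for the phase than for the amplitude.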

The answer seems obvious: rather than requiring |δphase|<1e-6, we need to be able to require amp*|δphase|<1e-6. Better yet, to keep the results as accurate as before, we should require |δamp|<5e-7 and amp*|δphase|<5e-7, because now the latter will be contributing roughly equally with the former and L-infinity errors should add linearly.
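To see why these two conditions together keep the waveform accurate to 1e-6, write h = amp*exp(i*phase). Perturbing both parts gives |δh| <= |δamp| + amp*|δphase|, because |exp(i*x) - 1| <= |x|. The sketch below checks this numerically on synthetic data; the function names and the random test data are assumptions for illustration, not the implementation in #10.

```python
import numpy as np

def waveform_error(amp, phase, amp_approx, phase_approx):
    """L-infinity error of the complex waveform rebuilt from amp and phase."""
    h = amp * np.exp(1j * phase)
    h_approx = amp_approx * np.exp(1j * phase_approx)
    return np.max(np.abs(h_approx - h))

def meets_tolerances(amp, phase, amp_approx, phase_approx, tol=5e-7):
    """Check |δamp| < tol and amp*|δphase| < tol at every sample."""
    amp_ok = np.max(np.abs(amp_approx - amp)) < tol
    phase_ok = np.max(amp * np.abs(phase_approx - phase)) < tol
    return bool(amp_ok and phase_ok)

# Synthetic data: small amplitudes, large phases, as described above
rng = np.random.default_rng(0)
n = 10_000
tol = 5e-7
amp = rng.uniform(1e-4, 0.1, n)
phase = rng.uniform(0.0, 1.0e4, n)

# Perturbations just inside the proposed tolerances; the allowed phase
# error scales like 1/amp, so it is looser wherever the amplitude is small
amp_approx = amp + 0.99 * rng.uniform(-tol, tol, n)
phase_approx = phase + 0.99 * rng.uniform(-tol, tol, n) / amp

ok = meets_tolerances(amp, phase, amp_approx, phase_approx, tol)
err = waveform_error(amp, phase, amp_approx, phase_approx)
```

Here `ok` should be true and `err` should stay below 1e-6, even though the phase error itself is allowed to be thousands of times larger than 1e-6 where the amplitude is small.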

But then, would LVC people accept such waveforms? I know everyone talks about how the phase really needs to be accurate, but I claim it will still be (at least nearly) as accurate as it was before. Specifically, this is not a cumulative phase error; it is an interpolation error. This means that there will still be many times throughout the output waveform at which the phase error (relative to the input waveform) is precisely zero, with small fluctuations of the phase error between those points. These fluctuations enter into the waveform in very much the same way as the amplitude errors, so they should be considered equivalent.
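The difference between an interpolation error and a cumulative error can be illustrated with a toy phase. The sketch below uses linear interpolation via `np.interp` as a simple stand-in for the spline (the actual conversion uses romspline), and the phase function and knot spacing are made up for illustration.

```python
import numpy as np

# Toy unwrapped phase that grows to thousands of radians
t = np.linspace(0.0, 100.0, 100_001)
phase = 50.0 * t + 0.01 * t**2

# Keep every 1000th sample as an interpolation knot
knots = t[::1000]
knot_phase = phase[::1000]

# Linear interpolation between knots (stand-in for the spline)
phase_interp = np.interp(t, knots, knot_phase)
error = np.abs(phase_interp - phase)

error_at_knots = error[::1000].max()         # error exactly at the knots
early_error = error[: len(t) // 10].max()    # first tenth of the data
late_error = error[-(len(t) // 10):].max()   # last tenth of the data
```

The error vanishes at every knot and is about the same size at the end of the data as at the start: it fluctuates between knots rather than building up as the phase accumulates.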

moble commented 4 years ago

I've implemented the above suggestion (plus a few others) in #10, and I find a speedup of ~166x and a filesize reduction of ~12x.

moble commented 4 years ago

On a vacuum phone call, we agreed that this approach, with |δamp|<5e-7 and amp*|δphase|<5e-7, is the thing to do.

moble commented 4 years ago

Closed by #10