gwnrtools / nr-catalog-tools

A unified interface to various catalogs of Numerical Relativity simulations of compact binary mergers.
https://github.com/gwnrtools/nr-catalog-tools
GNU General Public License v3.0
1 stars 5 forks source link

Bugfix remove junk #51

Open anuj137 opened 2 months ago

anuj137 commented 2 months ago

This pull request addresses a bug in the draft pull request: https://github.com/gwnrtools/nr-catalog-tools/pull/49. The following changes have been made compared to the current master branch (commit hash e431f348c56f93d859630b8c4853e053405f2a3c):

  1. An option to remove junk has been added in WaveformModes.get_td_waveform() via the boolean variable remove_junk. If set to True, the junk portion of the data is removed using reference_time before evaluating the modes.

Compared to pull request #49, the following key changes have been made:

  1. Time axis bug fix: The time array of the waveform is now updated if remove_junk=True. This prevents interpolation in the junk-removed portion of the data.
  2. The function remove_junk_from_modes has been added to handle the removal of modes, while the function modes_with_junk_removed is now only used to check if the junk portion has already been removed.
  3. A new variable, remove_junk_fudge_factor, has been introduced to WaveformModes.get_td_waveform(), with a default value of one. The junk portion of the data is now considered up to remove_junk_fudge_factor * reference_time. This fudge factor allows for the selection of slightly larger or smaller values than reference_time for removing the junk.
vaishakp commented 1 month ago

Some suggestions already discussed with @anuj137 , so that I don't forget :

  1. Remove junk radiation portion before evaluating the modes by slicing the WaveformModes object itself. This will avoid unnecessary computing wasted in evaluating the junk portion.
  2. It will be nice to cache the WaveformModes with junk radiation removed. Slicing a WaveformModes object can be expensive (data - all mode + time axes and attributes will be copied) so that repeated evaluations are not expensive (e.g. if we use this in a PE)
  3. Having freedom to choose junk time would be nice, but then caching becomes a little complex ( cache WaveformModes for all choices of junk time ? or cache just the default metadata suggested choice?) and may not be necessary
  4. Need to uniformly retrieve the junk time as different catalogs store this in different keys. It would impact performance if if statements are used to iterate through possiblities in get_td_waveform. So it maybe better to do this at the metadata level before constructing the WaveformModes obj.
  5. Avoid if-else statements in get_td_waveform wherever possible (see e.g. https://github.com/vaishakp/nr-catalog-tools/blob/f0e16eef4fb15c4a0889897c20e52390dda6acce/nrcatalogtools/waveform.py#L447)