HERA-Team / hera_cal

Library for HERA data reduction, including redundant calibration, absolute calibration, and LST-binning.
MIT License
13 stars 8 forks source link

Use filter files #916

Closed r-pascua closed 9 months ago

r-pascua commented 9 months ago

This PR adds the ability to use the tophat filtering code with yaml files containing pre-computed filter parameters. The core assumption here is that there is one filter file per spectral window, and the filter files have the following structure:

filter_centers:
    (ant1, ant2): blah
filter_half_widths:
    (ant1, ant2): blah

where (ant1, ant2) here is a placeholder for all of the baselines in the array (or a superset of baselines in the array). I am open to changing the file format slightly, since yaml.SafeLoader doesn't know how to interpret python tuples; we would just need to agree on a format to use, document it, and stick with it. I have written simple unit tests for the code functionality, but I haven't yet tried a more sophisticated test (i.e., on something that looks like real data with many baselines).

I also added some line breaks here and there because the code was wrapping in my terminal.

codecov[bot] commented 9 months ago

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (959420a) 97.18% compared to head (d6de897) 97.17%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #916 +/- ## ========================================== - Coverage 97.18% 97.17% -0.01% ========================================== Files 23 23 Lines 10446 10486 +40 ========================================== + Hits 10152 10190 +38 - Misses 294 296 +2 ``` | [Flag](https://app.codecov.io/gh/HERA-Team/hera_cal/pull/916/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/HERA-Team/hera_cal/pull/916/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team) | `97.17% <98.46%> (-0.01%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files](https://app.codecov.io/gh/HERA-Team/hera_cal/pull/916?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team) | Coverage Δ | | |---|---|---| | [hera\_cal/frf.py](https://app.codecov.io/gh/HERA-Team/hera_cal/pull/916?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team#diff-aGVyYV9jYWwvZnJmLnB5) | `97.39% <98.46%> (-0.14%)` | :arrow_down: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

r-pascua commented 9 months ago

Thank you for the review @jsdillon! I will make the recommended changes. I apologize for adding a bunch of fluff in the form of whitespace changes, but I develop in vim and it was practically impossible to read the code in the terminal with how much line wrapping there was. I will keep this in mind if I make future additions to the code.

Before addressing your comments, I would like to get your opinion on the file format. Ideally, I would access the filter parameters with an antenna pair tuple; however, python tuples are not supported by yaml.SafeLoader (which is what I figure we would want to use). I have two proposals, and would like your input on which you would prefer.

  1. Use baseline integers for the dictionary keys, assuming 350 antennas. This would require an extra step of converting the baseline integers to antenna pair tuples after reading in the file, but this feels like a very easy conversion to the format expected by the filtering code.
  2. Use strings for the dictionary keys, formatted like "(0,1)" for example. This feels like the worse of the two options, since this is susceptible to whitespace issues (e.g., assuming this format would cause the code to miss keys formatted like "(0, 1)"), and generally feels kind of clunky.

Thank you again for the review! I can make all of the requested changes once we settle on a final format for the filter files.

jsdillon commented 9 months ago

What do you mean by baseline integers? Like the ones in pyuvdata? Those aren't human readable...

r-pascua commented 9 months ago

Yeah, like the ones from pyuvdata. The main problem I'm butting up against is that yaml.SafeLoader doesn't work with python tuples as keys, and using tuples as keys gives really wonky looking yaml files. Here's an example:

filter_centers:
  ? &id001 !!python/tuple
  - 0
  - 1
  : 0.123
  ? &id002 !!python/tuple
  - 0
  - 2
  : 0.173
filter_half_widths:
  *id001: 0.07
  *id002: 0.08
r-pascua commented 9 months ago

OK, I have updated the code so that filter parameter files use strings for the antenna pairs, and there is an extra bit of code that converts those strings into tuples. I have also fleshed out the documentation regarding the parameter files and added an example parameter file. Here is what the sample file looks like:

filter_centers:
  (0, 1): 0.1234
  (0, 2): 0.173
filter_half_widths:
  (0, 1): 0.05
  (0, 2): 0.08

Please let me know what additional changes you would like to see when you have time to go through the changes.