Speedup pspecrun - Githubissues

steven-murray commented 1 year ago

The main thing this PR does is cache the get_Q_alt method, because it was taking up ~95% of the computation time for H6C runs (with many baseline pairs). For a small frequency and time window (and only baselines <60m), it was taking ~15hrs to compute the time-average power spectra. With these updates it takes ~4 hrs.

For completeness, I'm including the line-profile output of the run with this update: lineprof.txt

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 98.50% and project coverage change: +0.01 :tada:

Comparison is base (96339e4) 95.98% compared to head (55a73bb) 95.99%.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## main #381 +/- ## ========================================== + Coverage 95.98% 95.99% +0.01% ========================================== Files 17 17 Lines 6053 6072 +19 ========================================== + Hits 5810 5829 +19 Misses 243 243 ``` | Flag | Coverage Δ | | |---|---|---| | unittests | `95.99% <98.50%> (+0.01%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team#carryforward-flags-in-the-pull-request-comment) to find out more. | [Impacted Files](https://app.codecov.io/gh/HERA-Team/hera_pspec/pull/381?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team) | Coverage Δ | | |---|---|---| | [hera\_pspec/pspecdata.py](https://app.codecov.io/gh/HERA-Team/hera_pspec/pull/381?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team#diff-aGVyYV9wc3BlYy9wc3BlY2RhdGEucHk=) | `97.11% <97.91%> (ø)` | | | [hera\_pspec/utils.py](https://app.codecov.io/gh/HERA-Team/hera_pspec/pull/381?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=HERA-Team#diff-aGVyYV9wc3BlYy91dGlscy5weQ==) | `91.07% <100.00%> (+0.31%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

steven-murray commented 1 year ago

@philbull this depends on #380 so that should be merged first.

steven-murray commented 1 year ago

@adeliegorce I've merged in main -- sorry, should have done that previously. I addressed your comments to specific bits of code also.

I'm not quite sure what you mean about the doubling-up of get_Q_alt? The plain (non-cached) function for get_Q_alt is in utils, and I wrapped that in a cache inside the PSpecData object (as self._get_qalt_cached), and that function gets called by self.get_Q_alt(), which does a bit of setup first. I think this is the best way to organize it -- the cool thing about caching the function inside the class is that the maximum size of the cache can be set more intelligently.

adeliegorce commented 1 year ago

Cool, thanks for the changes. I did spot the Q_alt being read from cache in the new version - I must have been looking at the wrong files last week. Could I just ask you to add the two missing tests? (see codecov report, #L3644 and 2263 of pspecdata)

steven-murray commented 1 year ago

@adeliegorce sorry for the delay on this, just got to it now. I've added a test of one of the lines, but the other has to do with a warning raised when a the phase_type of a constituent dataset is not drift. The only reason that line is in this PR is because I updated it from being a print statement to a warning (i.e. it obviously wasn't covered previously either). More concerning, the newer versions of pyuvdata don't support the drift phase_type (or even setting phase_type at all!). This should be fixed up properly in its own PR, I think.

adeliegorce commented 1 year ago

Ok I will merge and delete the branch! Thanks @steven-murray

HERA-Team / hera_pspec

Speedup pspecrun #381

Codecov Report