nanograv / PINT

PINT is not TEMPO3 -- Software for high-precision pulsar timing
Other
118 stars 101 forks source link

Clock corrections applied to TZRMJD even if TZRSITE is BAT #1691

Closed gcerib closed 3 months ago

gcerib commented 9 months ago

Dear all,

I am in the field of very-high energy pulsars with imaging Cherenkov telescopes such as MAGIC and CTA. Since we typically have to integrate tens of hours of observations (scattered across many months) to detect the pulsed gamma rays and build a TOA, we are forced to apply the barycenter corrections to each single event we record, with most of them being just background cosmic rays, and obtain the pulse profile from already barycentered data. I find convenient for me to produce par files where TZRSITE is set to "@" or "bat", so that I can manipulate the phase=0 definition in a relatively easy way by changing TZRMJD, and adapt it to various different references present in literature.

I think I've found an inconsistency between the behaviour of PINT and tempo2 in the specific case in which TZRSITE is set to "@" and the clock is set to TT(BIPM****). Specifically, PINT applies the clock corrections to TZRMJD, whereas tempo2 doesn't. This causes the absolute phases calculated by tempo2 and PINT to differ by ~(-27us/F0), even when the BATs are identical down to the precision of np.longdouble. I was suggested to report it here by Paul Ray over the facebook pulsar group.

I believe I've identified the origin of the discrepancy in the definition of get_TZR_toa(self, toas) in $PINT/src/pint/models/absolute_phase.py and the respective function refphs_init(pulsar* psr, int npsr) in $TEMPO2/refphs.C . The tempo2 function contains:

21             if (obs->telID[0]=='@'            /* Have barycentric arrival time */
22                     || strcasecmp(obs->telID,"bat")==0)
23             {
24                 obs->clockCorr=0;  /* therefore don't do clock corrections */
25                 obs->delayCorr=0;
26             }

whereas the PINT function does not have such a check. The docstring of get_TZR_toa() even mentions that "We are treating the TZRMJD as a special TOA. Note that any observatory clock corrections will be applied to this TOA, as with any other TOA. This does not affect the value of the TZRMJD parameter, however." Indeed, if I add such a check on the value of self.TZRSITE in get_TZR_toa() of absolute_phase.py, the discrepancy in the phases disappears:

111         if self.TZRSITE.value != '@' and self.TZRSITE.value.lower() != 'bat':
112             tz = toa.get_TOAs_array(
113                 (self.TZRMJD.quantity.jd1 - 2400000.5, self.TZRMJD.quantity.jd2),
114                 obs=self.TZRSITE.value,
115                 freqs=self.TZRFRQ.quantity,
116                 include_bipm=clkc_info["include_bipm"],
117                 include_gps=clkc_info["include_gps"],
118                 ephem=toas.ephem,
119                 planets=toas.planets,
120                 tzr=True,
121             )
122         else:
123             tz = toa.get_TOAs_array(
124                 (self.TZRMJD.quantity.jd1 - 2400000.5, self.TZRMJD.quantity.jd2),
125                 obs=self.TZRSITE.value,
126                 freqs=self.TZRFRQ.quantity,
127                 include_bipm=False,
128                 include_gps=False,
129                 ephem=toas.ephem,
130                 planets=toas.planets,
131                 tzr=True,
132             )

To compare the results, both PINT and tempo2 need to be run with the same .par and .tim files. Unfortunately I do not know a way to print the absolute phases in a human-readable format with tempo2. I've written a patch to the "general2" plugin of tempo2, that allows the fractional part of the phase to be printed using a "{phasefrac}" token (the integer part can already be printed with the {npulse} token). I've checked that the phases produced in this way are compatible with the ones produced by MAGIC's own tempo2 plugin (that writes to some horrible .root file format). Please find attached an archive with the patch and some example scripts.

Summary

Expected behaviour: tempo2 and PINT phases and barycentered times should be the same when ran on the same .tim and .par files Observed behaviour: PINT phases differ by an amount of ~ (-27us/F0) when TZRSITE is "@" or "BAT" and CLK is "TT(BIPM2021)".

Steps to reproduce:

  1. Download the archive below. The archive contains a fictional tim file for several possible observatiories, a sample par file derived from JBO's Crab monthly ephemerides, the patch to be applied to tempo2's general2 plugin, a bash script to execute tempo2, and a python script that runs PINT.
  2. Recompile tempo2 with the patch that allows the absolute phases to be printed to the terminal (I am convinced there must be an easier way to do it, but I am not aware of it).
  3. Run the runTempo.sh bash script that just executes "tempo2 -output general2 -f sample.par fake.tim -s "{bat} {phasefrac}\n" > tempo2_result.txt"
  4. Run the testPint.py script that computes the phases and BATs with PINT and compares them with tempo2.
  5. On my machine, the PINT and tempo2 BATs differ by 1e-12 MJD (i.e. at the precision of np.longdouble), whereas the phases differ by 8.17e-4 ~ 27us/F0.

PINT version: 0.9.8+5.g0227e13c.dirty tempo2 version: 2023.05.1 (commit 3610511302af0f6f62d2a9806177301516a9babf)

output of pint.print_info():

# Created: 2023-12-10T14:43:11.955338
# PINT_version: 0.9.8+5.g0227e13c.dirty
# User: giovanni
# Host: ThinkPad-T450s
# OS: Linux-6.5.0-13-generic-x86_64-with-glibc2.38
# Python: 3.11.6 (main, Oct  8 2023, 05:06:43) [GCC 13.2.0]
# endian: little
# numpy_version: 1.26.2
# numpy_longdouble_precision: float128
# scipy_version: 1.11.4
# astropy_version: 6.0.0
# pyerfa_version: 2.0.1.1
# jplephem_version: 2.21
# matplotlib_version: 3.8.2
# loguru_version: 0.7.2
# Python_prefix: /home/giovanni/.venv/pinttest
# PINT_file: /home/giovanni/.venv/pinttest/lib/python3.11/site-packages/pint/__init__.py
# Environment: virtualenv
# virtualenv_prefix: /home/giovanni/.venv/pinttest

PINT-tempo2__TZRSITE-BAT.tar.gz

Thank you in advance for your help and patience, and sorry for the long message!

-- Dr. Giovanni Ceribella Max-Planck-Institut für Physik Boltzmannstr. 8 85748 Garching ceribell@mpp.mpg.de

abhisrkckl commented 8 months ago

I think the solution you mentioned is correct. Can you open a pull request?

gcerib commented 8 months ago

Hi Abhimanyu,

I am having difficulties installing the PINT development environment to prepare a branch with the modification, following https://nanograv-pint.readthedocs.io/en/latest/contributing.html#pull-request-guidelines .

Conda (miniconda 23.11.0) doesn't find the package flake8-diff>=0.2.2 in the conda-forge channel, and I had to install it manually with pip. It also doesn't recognize the syntax "black~=23.0" used in requirements_dev.txt , I had to replace it with black>=23.0,<=24.0 .

If I install PINT from the master branch and run the tests (before modifying any code at all), they fail:

===== short test summary info ===== ERROR tests/test_noisefit.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR tests/test_noisefit.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR tests/test_noisefit.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR tests/test_noisefit.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR tests/test_noisefit.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR tests/test_toa_shuffle.py - OSError: [Errno 39] Directory not empty: 'to-zap' ERROR gw14 ERROR gw1 ERROR gw3 ERROR gw4 ERROR gw5 ERROR gw10 ERROR gw15 ERROR gw7 ERROR gw6 ERROR gw2 ERROR gw12 ERROR gw13 ERROR gw11 ERROR gw0 ERROR gw8 ===== 85 warnings, 21 errors in 313.27s (0:05:13) =====

I get similar results using tox. Everything has been run in a clean environment, as suggested on the previously linked page. To be sure, I ran the entire thing twice in different servers with different systems, and got more or less the same result.

Shall I carry on regardless of these errors, or do you have a suggestion on what should be done?

Cheers,

Giovanni

abhisrkckl commented 8 months ago

These errors look like they are due to some kind of OS permission issue. I think you can ignore these.

test_noisefit.py and test_toa_shuffle.py are not related to the bug you encountered.

I think you can proceed regardless of these errors. As long as the github CI tests pass, it should be fine.

gcerib commented 8 months ago

Hi again, While trying to build a test for PR #1707, I realized the issue is more deep than I previously thought. In tempo2's https://bitbucket.org/psrsoft/tempo2/src/master/readTimfile.C, there is a block identical to the one in refphs.C that checks if the site is the barycenter. If the condition is met, it disables clock corrections for any barycentered TOA, not just the TZR ones!

435            if (psr->obsn[nObs].telID[0]=='@'            /* Have barycentric arrival time */
436                    || strcasecmp(psr->obsn[nObs].telID,"bat")==0) 
437            {
438                psr->obsn[nObs].clockCorr=0;  /* therefore don't do clock corrections */
439                psr->obsn[nObs].delayCorr=0;
440            }
441            else if (strcmp(psr->obsn[nObs].telID,"STL")==0)
442            {
443                psr->obsn[nObs].clockCorr=0;  /* don't do clock corrections */
444                psr->obsn[nObs].delayCorr=1;
445            }
446            else if (strcmp(psr->obsn[nObs].telID,"STL_FBAT")==0)
447            {
448                psr->obsn[nObs].clockCorr=0;  /* don't do clock corrections */
449                psr->obsn[nObs].delayCorr=1;
450            }
451            else
452            {
453                psr->obsn[nObs].clockCorr=1;
454                psr->obsn[nObs].delayCorr=1;
455            }

This is not the case for PINT: none of pint.toa methods have such a check, thus including clock corrections also for already barycentered TOAs. This results in a difference in the BATs computed by PINT and tempo2, whenever they are provided a tim file with already barycentered TOAs. I understand the use case for running PINT on already barycentered TOAs is quite narrow, however, this seems to be another inconsistency between the behaviour of tempo2 and PINT. The issue can be tested using the same minimal working example I uploaded at the beginning, the first TOA in the tim file is a barycentered one.

I can try and modify the functions in pint.toa to include a check there as well, but it seems me this would have more far-reaching consequences than just the TZRMJD issue. Before proceeding, I would like to ask your opinion.

Cheers,

Giovanni

abhisrkckl commented 8 months ago

I am not sure what "STL" and "STL_FBAT" are, but they are not supported by PINT. So the rest of the checks are not required for now.

gcerib commented 8 months ago

Hi, OK, Let's leave out "STL" and "STL_FBAT". But still tempo2 treats any barycentered TOA just as in TZR case, that means, skipping clock corrections, whereas PINT doesn't. Shall I try and implement the check for the TOA's site in toa.py as well, besides the one in absolute_phase.py?

abhisrkckl commented 8 months ago

From what I can see, what you say is the intended behaviour. It doesn't happen that way due to a bug in get_observatory() function. Please see my comment in the PR.

gcerib commented 7 months ago

I am sorry, but #1711 does NOT fix the issue in this problem. The clock corrections are still applied to TZRMJD even if get_observatory() does not overwrite the values it passed, as discussed in this comment: https://github.com/nanograv/PINT/pull/1707#discussion_r1441839631. The issue is still present, and I would like to reopen this.

abhisrkckl commented 7 months ago

Can you post a code snippet that shows this?

gcerib commented 7 months ago

Hi @abhisrkckl , Actually there is the test I posted for pull request #1707 that demonstrates it. Before you opened #1711 I ran it on a version with the same changes, and is still failed: https://github.com/nanograv/PINT/pull/1707/commits/114a8299f594befe612d365429a51b4465ace194

abhisrkckl commented 7 months ago

Thanks. I am able to see the issue. I'll investigate this.

paulray commented 6 months ago

Thank you for finding this! I think I'm seeing this bug in some of my NICER pulsar timing work. It would be great to get it fixed! I see there have been some attempts but evidently it is complicated!

gcerib commented 4 months ago

Dear all, Has there been any progress or some idea on how to proceed with this bug? As of now, if I set TZRMJD to BAT, I need to produce separate par files for usage with PINT and tempo2. Cheers, Giovanni

PS: congratulations on the submission of the Likelihood fitting paper! I've seen it on Arxiv the other day.

paulray commented 4 months ago

I agree this is a high priority. I just passed some big deadlines so I have a bit more time to think about this. For now, you could just run with CLK TT(TAI) instead of TT(BIPMxx), right?

paulray commented 4 months ago

Perhaps the fix should be in the clock_corrections method of the Observatory class?

def clock_corrections(self, t, limits="warn"):
        """Compute clock corrections for a Time array.

        Given an array-valued Time, return the clock corrections
        as a numpy array, with units.  These values are to be added to the
        raw TOAs in order to refer them to the timescale specified by
        self.timescale."""
        # TODO this and derived methods should be changed to accept a TOA
        # table in addition to Time objects.  This will allow access to extra
        # TOA metadata which may be necessary in some cases.
        corr = np.zeros_like(t) * u.us

        if self.include_gps:
            corr += self.gps_correction(t, limits=limits)

        if self.include_bipm:
            corr += self.bipm_correction(t, self.bipm_version, limits=limits)

        return corr

I suggest that the GPS and BIPM corrections should not be applied if this site has a scale of 'tdb' already.
So, the include_gps and include_bipm could still be True, indicating that the user wants BIPM applied in general, but of course not for TOAs that are already barycentered.

paulray commented 4 months ago

Yes, just adding and self.timescale != "tdb" to those two if statements causes the test to pass. Not sure if that has any other unwanted consequences.

paulray commented 4 months ago

Looking at this more, I really don't like the way PINT handles BIPM and GPS corrections. They are kind of "tacked on" rather than being a part of a real clock correction chain like Tempo2 does.

More specifically, we carry around include_bipm and include_gps as a property of a site. This is probably not the best way to do this.

See this code in toa.py:

            site = get_observatory(
                obs,
                include_gps=include_gps,
                include_bipm=include_bipm,
                bipm_version=bipm_version,
            )
            clock_corrections = site.clock_corrections(
                time.Time(self["mjd"][grp]), limits=limits
            )

The include_bipm gets put into the site object, then when site.clock_corrections is called, it looks at that property of the site to see if BIPM should be applied (same for GPS). Wouldn't it make more sense for the clock_corrections() method to take include_bipm and include_gps as arguments and not keep those as properties of the site?

And the include_bipm should really signify the intent of the user that she wants to use TT(BIPM) instead of TT(TAI). Thus it is fine for include_bipm to be True even for a barycentric site. The fix I posted above ensures that a site with a timescale of TDB (meaning that the version of TT is irrelevant) won't apply that 27µs correction even if it is True.

Bottom line: I think my tiny fix will work above, but it might be better to do the larger change as well and remove include_bipm andinclude_gps as properties of a site object.

paulray commented 4 months ago

I think this line is buggy (L361 of topo_obs.py):

elif self.clock_files:
            msg = f"No clock corrections found for observatory {self.name} taken from file {self.clock_files}"
            if limits == "warn":
                log.warning(msg)
                corr = np.zeros_like(t) * u.us
            elif limits == "error":
                raise NoClockCorrections(msg)

The corr = should be corr += unless the intent is to discard the BIPM and GPS corrections when there is no clock file for an observatory.

paulray commented 4 months ago

@gcerib can you check if #1763 solves your issue?

gcerib commented 4 months ago

@paulray Hi Paul. Yes, I tested revision #1763 5efcc378 and I confirm it solves the bug. Thanks!