nanograv / PINT

PINT is not TEMPO3 -- Software for high-precision pulsar timing
Other
118 stars 101 forks source link

Differences in fit results between machines (Mac vs Linux?) #1586

Open dlakaplan opened 1 year ago

dlakaplan commented 1 year ago

Following some testing with KK and IPTA people, we were trying to isolate a simple test for DMX fitting. We found that the results differed slightly between his machine and mine. I have since run it a number of times on different machines (Mac Intel, Mac M1, linux, the notebook server) and I find DMX values that differ by ~1e-4 between them. The results tend to either agree with my previous results (Mac) or KK's (linux).

Is this a known issue with fitting precision? I don't understand why there should be anything stochastic about this fit.

e.g., if I run it on my Mac I get:

(pulsar) kaplan@plock[~/NANOGrav/ipta]  % python ./fit_dmx.py
# python_version: 3.9.7 (default, Sep 16 2021, 08:50:36) 
[Clang 10.0.0 ]
# numpy_version: 1.21.2
# Created: 2023-06-08T09:58:45.098852
# PINT_version: 0.9.5
# User: kaplan
# Host: plock
# OS: macOS-10.16-x86_64-i386-64bit
Free parameters: ['DMX_0001', 'DMX_0002']
DMX_0001: (56568.16215638483 d-56568.26215653181 d): -0.022862539247389357 pc / cm3: 0.0009955822393405128 pc / cm3
DMX_0002: (59421.350336146425 d-59421.450336388356 d): 0.0040382142830872585 pc / cm3: 0.0009955822393405128 pc / cm3
Differences wrt DLK: [-0.  0.]
Differences wrt KK: [5.12432404e-05 1.00193347e-04]

while on the notebook server I get:

# python_version: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:06:46) [GCC 10.3.0]
# numpy_version: 1.22.4
# Created: 2023-06-08T15:02:15.231072
# PINT_version: 0.9.3
# User: jovyan
# Host: 29d3ca5ec6f6
# OS: Linux-4.18.0-425.13.1.el8_7.x86_64-x86_64-with-glibc2.31
Free parameters: ['DMX_0001', 'DMX_0002']
DMX_0001: (56568.16215638483 d-56568.26215653181 d): -0.02286136775682614 pc / cm3: 0.0009955822393404807 pc / cm3
DMX_0002: (59421.350336146425 d-59421.450336388356 d): 0.0040378097214163 pc / cm3: 0.0009955822393404807 pc / cm3
Differences wrt DLK: [-5.12406146e-05 -1.00183309e-04]
Differences wrt KK: [-0.  0.]

My Mac is an M1 machine, but running conda in x86 mode. We have also tried on explicitly Intel Macs and get results that agree with the M1 Macs.

If anybody can test this on other combinations I'd be interested to see what they find. Fractional differences of 1e-4 in DMX aren't huge, but are much bigger than they should be. I haven't investigated other parameters (DMX is the only free parameter in this test).

dlakaplan commented 1 year ago

Test program (par & tim files included):

import sys
import numpy as np
import astropy
from pint.models import get_model_and_toas
import pint.fitter
import pint.utils
import pint.logging
import io

par = """# Created: 2023-05-23T10:08:47.546618
# PINT_version: 0.9.5
# User: M. A. Krishnakumar (kkma)
# Host: poe
# OS: Linux-5.4.0-148-generic-x86_64-with-glibc2.29
# Format: pint
PSRJ                           J0621+1002
EPHEM                               DE440
CLK                          TT(BIPM2019)
UNITS                                 TDB
START              56568.2121563848386458
FINISH             59421.4003363883475926
TIMEEPH                              FB90
T2CMETHOD                        IAU2000B
DILATEFREQ                              N
DMDATA                                  N
NTOA                                   10
CHI2                                  0.0
RAJ                      6:21:22.01017540 0 0.02108535635446286832
DECJ                    10:02:29.39636000 0 1.28655830483269539855
PMRA                   154.99933932091736 0 38.32827139688301
PMDEC                   853.9002881758452 0 161.0480808201861
PX                                    0.0
POSEPOCH           54999.9998161703821150
F0                  34.657407161944268063 0 5.8220569308e-10
F1              -6.8024291921798985053e-17 0 2.3431820241319046872e-18
PEPOCH             54999.9998161703821150
CORRECT_TROPOSPHERE                         N
PLANET_SHAPIRO                          N
NE_SW                                 7.9
SWM                                   0.0
DM                  36.567498299041078198 0 0.00099999995348440769
DMEPOCH            55000.0000000000000000
DMX                                   6.5
DMX_0001             -0.02286374033884119 1 0.004209721846073168
DMXR1_0001         56568.1621563848311806
DMXR2_0001         56568.2621565318113426
DMX_0002             0.004038234230037191 1 0.003956496936912034
DMXR1_0002         59421.3503361464245254
DMXR2_0002         59421.4503363883559375
BINARY                                 DD
PB                   8.319084026957771904 0 2.7466991161234e-07
PBDOT                                 0.0
A1                     12.032352334870641 0 0.00022070836169778854
A1DOT                                 0.0
ECC                  0.002384353926582889 0 3.976882262223037e-05
EDOT                                  0.0
T0                 55145.5433555870737050 0 0.08005422103296240617
OM                  182.54567292992324803 0 0.00396216643901861707
OMDOT              0.77923849834938746777 0 0.44091740076385298584
A0                                    0.0
B0                                    0.0
GAMMA                                 0.0
DR                                    0.0
DTH                                   0.0
TZRMJD             58092.0258119028393056
TZRSITE                             lofar
TZRFRQ                 167.28515859378945
"""

tim = """FORMAT 1
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2013-10-03-04:57.add_clean_calibP_tscr.ar 125.097656 56568.212156384518097 921.223  lofar  -fe unknown -be LOFAR -f unknown_LOFAR -bw 14.06 -tobs 899.81 -tmplt newtemplates/J0621+1002.ar -gof 1.07 -nbin 1024 -nch 72 -prof_snr 7.26 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2013-10-03-04:57.add_clean_calibP_tscr.ar 139.200325 56568.212156531479532 922.440  lofar  -fe unknown -be LOFAR -f unknown_LOFAR -bw 14.06 -tobs 899.81 -tmplt newtemplates/J0621+1002.ar -gof 1.1 -nbin 1024 -nch 72 -prof_snr 10.82 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2013-10-03-04:57.add_clean_calibP_tscr.ar 153.237640 56568.212156394256012 739.059  lofar  -fe unknown -be LOFAR -f unknown_LOFAR -bw 14.06 -tobs 899.81 -tmplt newtemplates/J0621+1002.ar -gof 1.07 -nbin 1024 -nch 72 -prof_snr 11.45 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2013-10-03-04:57.add_clean_calibP_tscr.ar 167.250770 56568.212156415574501 440.814  lofar  -fe unknown -be LOFAR -f unknown_LOFAR -bw 14.06 -tobs 899.81 -tmplt newtemplates/J0621+1002.ar -gof 1.05 -nbin 1024 -nch 72 -prof_snr 9.46 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2013-10-03-04:57.add_clean_calibP_tscr.ar 181.187528 56568.212156466085537 1236.402  lofar  -fe unknown -be LOFAR -f unknown_LOFAR -bw 14.06 -tobs 899.81 -tmplt newtemplates/J0621+1002.ar -gof 1.19 -nbin 1024 -nch 72 -prof_snr 5.98 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2021-07-26-09:28.add_clean_calibP_tscr.ar 125.066372 59421.400336245320750 602.268  lofar  -fe unknown -be COBALT -f unknown_COBALT -bw 14.06 -tobs 899.56 -tmplt newtemplates/J0621+1002.ar -gof 1.07 -nbin 1024 -nch 72 -prof_snr 9.46 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2021-07-26-09:28.add_clean_calibP_tscr.ar 139.279255 59421.400336197322133 416.165  lofar  -fe unknown -be COBALT -f unknown_COBALT -bw 14.06 -tobs 899.56 -tmplt newtemplates/J0621+1002.ar -gof 0.938 -nbin 1024 -nch 72 -prof_snr 11.99 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2021-07-26-09:28.add_clean_calibP_tscr.ar 153.226188 59421.400336153624831 275.830  lofar  -fe unknown -be COBALT -f unknown_COBALT -bw 14.06 -tobs 899.56 -tmplt newtemplates/J0621+1002.ar -gof 0.9 -nbin 1024 -nch 72 -prof_snr 18.62 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2021-07-26-09:28.add_clean_calibP_tscr.ar 167.261939 59421.400336146117558 266.152  lofar  -fe unknown -be COBALT -f unknown_COBALT -bw 14.06 -tobs 899.56 -tmplt newtemplates/J0621+1002.ar -gof 1.06 -nbin 1024 -nch 72 -prof_snr 15.27 -sys LOFAR.150
/data/kkma/postprocessing/data//LOFAR/J0621+1002/J0621+1002.2021-07-26-09:28.add_clean_calibP_tscr.ar 181.458939 59421.400336388027380 587.558  lofar  -fe unknown -be COBALT -f unknown_COBALT -bw 14.06 -tobs 899.56 -tmplt newtemplates/J0621+1002.ar -gof 0.981 -nbin 1024 -nch 72 -prof_snr 9.88 -sys LOFAR.150
"""

pint.logging.setup("WARNING")
print(f"# python_version: {sys.version}")
print(f"# numpy_version: {np.__version__}")
print(f"# astropy_version: {astropy.__version__}")
print(pint.utils.info_string(prefix_string="# "))

# dmx values that are gotten by D Kaplan and Krishnakumar
dmx_dlk = np.array([-0.022862539247389357, 0.0040382142830872585])
dmx_kk = np.array([-0.02286136775682614, 0.0040378097214163])

# m, t = get_model_and_toas("J0621+1002_kk_DMX_tdb.par", "J0621+1002_two-testToAs.tim")
m, t = get_model_and_toas(io.StringIO(par), io.StringIO(tim))
print(f"Free parameters: {m.free_params}")
f = pint.fitter.Fitter.auto(t, m)
f.fit_toas()
dmxoutput = pint.utils.dmxparse(f)
for i in range(len(dmxoutput["bins"])):
    print(
        f"{dmxoutput['bins'][i]}: ({dmxoutput['r1s'][i]}-{dmxoutput['r2s'][i]}): {dmxoutput['dmxs'][i]+dmxoutput['mean_dmx']}: {dmxoutput['dmx_verrs'][i]}"
    )
print(
    f"Differences wrt DLK: {(dmxoutput['dmxs'].value+dmxoutput['mean_dmx'].value-dmx_dlk)/dmx_dlk}"
)
print(
    f"Differences wrt KK: {(dmxoutput['dmxs'].value+dmxoutput['mean_dmx'].value-dmx_kk)/dmx_kk}"
)
dlakaplan commented 1 year ago

This doesn't depend on PINT version (we've mostly tried 0.9.5, the version on pypi).

paulray commented 1 year ago
% py testdk.py 
# python_version: 3.9.15 (main, Oct 12 2022, 03:01:28) 
[Clang 11.0.3 (clang-1103.0.32.62)]
# numpy_version: 1.24.3
# Created: 2023-06-08T11:27:30.289184
# PINT_version: 0.9.3+621.g6e6eb56d
# User: Paul Ray (paulr)
# Host: mcxr2.nrl.navy.mil
# OS: macOS-10.15.7-x86_64-i386-64bit
Downloading https://raw.githubusercontent.com/ipta/pulsar-clock-corrections/main/index.txt
|==========================================================================================================================================================================================| 3.5k/3.5k (100.00%)         0s
WARNING  (pint.logging                  ): /Users/paulr/env/pint/lib/python3.9/site-packages/erfa/core.py:154 ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)"
Free parameters: ['DMX_0001', 'DMX_0002']
DMX_0001: (56568.16215638483 d-56568.26215653181 d): -0.022862539247389343 pc / cm3: 0.0009955822393405125 pc / cm3
DMX_0002: (59421.350336146425 d-59421.450336388356 d): 0.004038214283087271 pc / cm3: 0.0009955822393405125 pc / cm3
Differences wrt DLK: [-6.07009906e-16  3.00703813e-15]
Differences wrt KK: [5.12432404e-05 1.00193347e-04]
paulray commented 1 year ago

And on my Ubuntu linux box:

% py testdk.py 
# python_version: 3.8.10 (default, May 26 2023, 14:05:08) 
[GCC 9.4.0]
# numpy_version: 1.22.3
# Created: 2023-06-08T11:32:40.275747
# PINT_version: 0.9.3+417.g7387f337
# User: Paul Ray (paulr)
# Host: hesenode1
# OS: Linux-5.4.0-147-generic-x86_64-with-glibc2.29
Downloading https://raw.githubusercontent.com/ipta/pulsar-clock-corrections/main/index.txt
|==========================================================================================================================================================================================| 3.5k/3.5k (100.00%)         0s
Downloading https://raw.githubusercontent.com/ipta/pulsar-clock-corrections/main/T2runtime/clock/tai2tt_bipm2019.clk
|==========================================================================================================================================================================================|  36k/ 36k (100.00%)         0s
WARNING  (pint.logging                  ): /home/paulr/env/pint/lib/python3.8/site-packages/erfa/core.py:154 ErfaWarning: ERFA function "dtf2d" yielded 1 of "dubious year (Note 6)"
Downloading https://naif.jpl.nasa.gov/pub/naif/generic_kernels/spk/planets/de440.bsp
|==========================================================================================================================================================================================| 119M/119M (100.00%)         3s
Downloading https://maia.usno.navy.mil/ser7/finals2000A.all
|==========================================================================================================================================================================================| 3.5M/3.5M (100.00%)         1s
Free parameters: ['DMX_0001', 'DMX_0002']
DMX_0001: (56568.16215638483 d-56568.26215653181 d): -0.02286136775682614 pc / cm3: 0.0009955822393404809 pc / cm3
DMX_0002: (59421.350336146425 d-59421.450336388356 d): 0.004037809721416298 pc / cm3: 0.0009955822393404809 pc / cm3
Differences wrt DLK: [-5.12406146e-05 -1.00183309e-04]
Differences wrt KK: [-0.00000000e+00 -4.29619917e-16]
dlakaplan commented 1 year ago

So you are basically getting results that agree with my (Mac) results. Have you seen things like this before?

paulray commented 1 year ago

Not that I recall.

dlakaplan commented 1 year ago

I think this leads to a TOA difference of ~200ns at LOFAR frequencies

dlakaplan commented 1 year ago

@vallis or @AaronDJohnson: do you have any idea what this might be? Something in the precision with which the frequencies are stored, or a difference in how the double-double TOAs are handled?

dlakaplan commented 1 year ago

I checked m.components["DispersionDMX"].dispersion_time_delay between the two machines and they are the same.

dlakaplan commented 1 year ago

However, if I explicitly set the DMX to a single value I get different residuals on the two machines:

f.model.DMX_0001.value=-0.022862539247389357
f.update_resids()
print(f.resids.time_resids[0])

gives: Mac: -0.0017214943385095776 s NB server (linux): -0.0017213214324379833 s so these differ fractionally by 1e-4, or by ~170 ns

dlakaplan commented 1 year ago

Comparing the frequencies in the TOA table to high precision, they agree.

dlakaplan commented 1 year ago

Same with the barycentered TOA.

dlakaplan commented 1 year ago

Changing the frequency of the first TOA does not change the difference in residual between the two machine.

dlakaplan commented 1 year ago

I don't know if this is related, but when I do: t.table[0]['mjd'].value-t.table[0]['mjd_float'] I get 0 on a Mac, but something like 27us on the NB server. So an amount similar to the clock correction (but not exactly the same). But the tdbld values which are the ones actually used are the same. So I'm not sure why this is different but I don't think it's the cause.

JPGlaser commented 1 year ago

I did a fork and modified David's above script to run through GitHub Actions and push the results to a Google Sheet. You can find the results here:

https://docs.google.com/spreadsheets/d/1fafopRuFhQZMhlA1TfQt__jBoSeq6LcilvY-sKsDmHI/edit?usp=sharing

We see a difference between the Mac and Linux machines offered by GitHub in the DMX values at 10^-6.

~ Joe G.