Inconsistent results from current and older versions of toast3

I'm building and running the toast3 branch of this repository (hpc4cmb/toast), following the instructions presented HERE.

When I build and run toast_benchmark_ground_setup from the upstream version of the toast3 branch for the tiny workload, I get the following output:

TOAST INFO: SolveAmplitudes begin building flags for solver
TOAST INFO: SolveAmplitudes  finished flag building in 0.01 s
TOAST INFO: SolveAmplitudes begin build of solver covariance
TOAST INFO: SolveAmplitudes  finished build of solver covariance in 2.35 s
TOAST INFO: SolveAmplitudes Solver flags cut 2858514 / 4800000 = 59.55% of samples
TOAST INFO: SolveAmplitudes begin RHS calculation
TOAST INFO: SolveAmplitudes  finished RHS calculation in 1.17 s
TOAST INFO: SolveAmplitudes begin PCG solver
TOAST INFO: MapMaker initial residual = 2.509321783569836e+22, 0.82 s
TOAST INFO: MapMaker iteration    0, relative residual = 7.149757e-01, 0.88 s
TOAST INFO: MapMaker iteration    1, relative residual = 2.869479e-03, 0.88 s
TOAST INFO: MapMaker iteration    2, relative residual = 4.844515e-05, 0.88 s
TOAST INFO: MapMaker iteration    3, relative residual = 7.128377e-06, 0.88 s
TOAST INFO: MapMaker iteration    4, relative residual = 7.177570e-08, 0.88 s
TOAST INFO: MapMaker iteration    5, relative residual = 7.596848e-10, 0.88 s
TOAST INFO: MapMaker iteration    6, relative residual = 2.283520e-11, 0.88 s
TOAST INFO: MapMaker iteration    7, relative residual = 5.122855e-22, 0.88 s
TOAST INFO: MapMaker PCG converged after    7 iterations and 7.87 seconds
...
TOAST INFO: Gathering benchmarking metrics.
TOAST INFO:
TOAST INFO: Science Metric (samples per node-second):  (4.800e+06) / (350.7 * 1) = 13687.81
TOAST INFO:
TOAST INFO:
TOAST INFO: Output statistics for case 'tiny':
  Total map hits = 4692640 (expected 4692640)
  Intensity map RMS = 3.832030562301031 (expected 4.288589272510856)
  Stokes Q map RMS = 0.255301642799548 (expected 0.2883310243187944)
  Stokes U map RMS = 0.252311140990647 (expected 0.28506926605174054)
TOAST INFO:
TOAST INFO: toast_benchmark_ground (gathering and dumping timing info):  0.01 seconds (1 calls)

The solver converges after seven iterations, but the resulting map RMS values are about 10 to 12% below the expected values. I get the same result when building and running the version corresponding to commit 24431d12f82ae827176f47af4d13506988948127 (Oct 2023).

When I build and run toast_benchmark_ground_setup from an older version corresponding to commit 0992fe001cf20378204d835e9fa99b36c99ec181 (Feb 2023), the output is the following:

TOAST INFO: SolveAmplitudes begin building flags for solver
TOAST INFO: SolveAmplitudes  finished flag building in 0.22 s
TOAST INFO: SolveAmplitudes begin build of solver covariance
TOAST INFO: SolveAmplitudes  finished build of solver covariance in 2.92 s
TOAST INFO: SolveAmplitudes Solver flags cut 2185700 / 4692640 = 46.58% of samples
TOAST INFO: SolveAmplitudes begin RHS calculation
TOAST INFO: SolveAmplitudes  finished RHS calculation in 2.29 s
TOAST INFO: SolveAmplitudes begin PCG solver
TOAST INFO: MapMaker initial residual = 9.338666194102869e+22, 1.95 s
TOAST INFO: MapMaker iteration    0, relative residual = 5.565983e-01, 2.08 s
TOAST INFO: MapMaker iteration    1, relative residual = 3.564573e-02, 2.08 s
TOAST INFO: MapMaker iteration    2, relative residual = 1.951437e-02, 2.08 s
TOAST INFO: MapMaker iteration    3, relative residual = 5.582467e-03, 2.08 s
TOAST INFO: MapMaker iteration    4, relative residual = 2.123680e-03, 2.08 s
TOAST INFO: MapMaker iteration    5, relative residual = 1.531162e-03, 2.09 s
TOAST INFO: MapMaker iteration    6, relative residual = 2.296330e-04, 2.08 s
TOAST INFO: MapMaker iteration    7, relative residual = 1.190001e-04, 2.08 s
TOAST INFO: MapMaker iteration    8, relative residual = 1.888993e-05, 2.08 s
TOAST INFO: MapMaker iteration    9, relative residual = 6.536763e-06, 2.08 s
TOAST INFO: MapMaker iteration   10, relative residual = 3.907494e-06, 2.08 s
TOAST INFO: MapMaker iteration   11, relative residual = 1.186140e-06, 2.08 s
TOAST INFO: MapMaker iteration   12, relative residual = 8.212098e-07, 2.08 s
TOAST INFO: MapMaker iteration   13, relative residual = 6.543912e-07, 2.08 s
TOAST INFO: MapMaker iteration   14, relative residual = 6.831499e-07, 2.08 s
TOAST INFO: MapMaker iteration   15, relative residual = 4.209891e-07, 2.08 s
TOAST INFO: MapMaker iteration   16, relative residual = 4.589773e-07, 2.08 s
TOAST INFO: MapMaker iteration   17, relative residual = 1.271750e-07, 2.08 s
TOAST INFO: MapMaker iteration   18, relative residual = 1.241561e-07, 2.09 s
TOAST INFO: MapMaker iteration   19, relative residual = 5.115972e-08, 2.09 s
TOAST INFO: MapMaker iteration   20, relative residual = 4.645164e-08, 2.08 s
TOAST INFO: MapMaker iteration   21, relative residual = 6.500606e-08, 2.08 s
TOAST INFO: MapMaker iteration   22, relative residual = 5.718058e-08, 2.08 s
TOAST INFO: MapMaker iteration   23, relative residual = 4.763475e-08, 2.08 s
TOAST INFO: MapMaker iteration   24, relative residual = 5.019159e-08, 2.09 s
TOAST INFO: MapMaker iteration   25, relative residual = 7.186621e-08, 2.08 s
TOAST INFO: MapMaker iteration   26, relative residual = 5.203379e-08, 2.08 s
TOAST INFO: MapMaker iteration   27, relative residual = 9.239255e-08, 2.08 s
TOAST INFO: MapMaker iteration   28, relative residual = 3.212252e-08, 2.09 s
TOAST INFO: MapMaker iteration   29, relative residual = 5.474001e-08, 2.08 s
TOAST INFO: MapMaker iteration   30, relative residual = 3.864716e-08, 2.09 s
TOAST INFO: MapMaker PCG stalled after   30 iterations and 66.50 seconds
...
TOAST INFO: Gathering benchmarking metrics.
TOAST INFO:
TOAST INFO: Science Metric (samples per node-second):  (4.800e+06) / (245.9 * 1) = 19516.18
TOAST INFO:
TOAST INFO:
TOAST INFO: Output statistics for case 'tiny':
  Total map hits = 4692640.0 (expected 4692640.0)
  Intensity map RMS = 4.288586574948201 (expected 4.288589272510856)
  Stokes Q map RMS = 0.2883550550947974 (expected 0.2883310243187944)
  Stokes U map RMS = 0.2850846448681744 (expected 0.28506926605174054)
TOAST INFO:
TOAST INFO: toast_benchmark_ground (gathering and dumping timing info):  0.01 seconds (1 calls)

Here, the solver is reported as stalled after 30 iterations, but the resulting statistics match the expected values to within a relative difference of 1e-4.
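To put numbers on the mismatch, here is a minimal sketch that computes the relative deviation of each map RMS from its expected value for both runs. The values are copied from the two logs above; the helper name `relative_deviation` and the dictionary layout are my own and not part of toast.

```python
# Map RMS values copied from the "Output statistics for case 'tiny'" sections above.
expected = {"I": 4.288589272510856, "Q": 0.2883310243187944, "U": 0.28506926605174054}
current = {"I": 3.832030562301031, "Q": 0.255301642799548, "U": 0.252311140990647}
older = {"I": 4.288586574948201, "Q": 0.2883550550947974, "U": 0.2850846448681744}


def relative_deviation(observed, reference):
    """Return |observed - reference| / |reference| for each Stokes component."""
    return {
        key: abs(observed[key] - reference[key]) / abs(reference[key])
        for key in reference
    }


# Hypothetical helper applied to both runs; not a toast API.
rel_current = relative_deviation(current, expected)
rel_older = relative_deviation(older, expected)

for key in expected:
    print(
        f"Stokes {key}: current branch {rel_current[key]:.2%}, "
        f"Feb 2023 commit {rel_older[key]:.4%}"
    )
```

With these inputs the current branch deviates by roughly 10 to 12% in all three components, while the Feb 2023 commit stays below 1e-4.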

I also note that the wall time per iteration in the current version (about 0.88 s) is less than half that of the older commit (about 2.08 s).
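The timing comparison can be reproduced from the logs with a short script. This is only a sketch: it assumes the trailing number on each "MapMaker iteration" line is the per-iteration wall time, and the regular expression and function name are my own. Only the first two iteration lines of each log are pasted in to keep it short.

```python
import re
from statistics import mean

# Matches lines such as:
# "TOAST INFO: MapMaker iteration    0, relative residual = 7.149757e-01, 0.88 s"
ITERATION_LINE = re.compile(
    r"MapMaker iteration\s+(\d+), relative residual = ([0-9.eE+-]+), ([0-9.]+) s"
)


def iteration_times(log_text):
    """Extract the trailing time (in seconds) from every iteration line."""
    return [float(match.group(3)) for match in ITERATION_LINE.finditer(log_text)]


# Excerpts copied from the two logs above.
current_log = """\
TOAST INFO: MapMaker iteration    0, relative residual = 7.149757e-01, 0.88 s
TOAST INFO: MapMaker iteration    1, relative residual = 2.869479e-03, 0.88 s
"""
older_log = """\
TOAST INFO: MapMaker iteration    0, relative residual = 5.565983e-01, 2.08 s
TOAST INFO: MapMaker iteration    1, relative residual = 3.564573e-02, 2.08 s
"""

current_times = iteration_times(current_log)
older_times = iteration_times(older_log)
ratio = mean(current_times) / mean(older_times)
print(f"current / older time per iteration: {ratio:.2f}")
```

Running the same function on the full logs would average over all iterations instead of the first two.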

Can you please confirm that the current version of toast3 is running correctly?
