litebird / litebird_sim

Simulation tools for LiteBIRD
GNU General Public License v3.0

Bugs in destriper framework #310

Open · ggalloni opened this issue 4 months ago

ggalloni commented 4 months ago

I am working on the data splits for the destriper (#309) and I encountered a few bugs.

  1. In the report template for the destriper, I added a reference to the filename of the figure, so that it shows up correctly in the report.
  2. There seems to be an issue when one uses samples_per_baseline=None (so the destriping procedure is not performed) while asking for append_to_report = True, since the code still tries to build the convergence plot (see the sketch below).
  3. I am running into an error when running save_destriper_results: within _save_baselines, when it tries to save the errors, cur_error[idx_det, :] raises an IndexError. I am not familiar with the algebra here, but from the code I think that the first dimension of cur_error should be the number of observations and not the number of detectors. Is that so? If so, substituting the index with idx (which runs over observations) is probably sufficient.
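
For point 2, something along these lines might be enough: a guard that skips the convergence plot when the destriping step did not run. This is only a sketch; the helper name and the stopping_factors argument are hypothetical, not necessarily the actual names used in the report code.

import matplotlib.pyplot as plt

def maybe_plot_convergence(stopping_factors, samples_per_baseline):
    # Hypothetical helper: build the convergence plot only if destriping
    # actually ran, i.e. samples_per_baseline was not None and there is some
    # iteration history to plot; otherwise skip this figure in the report.
    if samples_per_baseline is None or len(stopping_factors) == 0:
        return None
    fig, ax = plt.subplots()
    ax.plot(stopping_factors)
    ax.set_xlabel("Iteration")
    ax.set_ylabel("Stopping factor")
    return fig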

I already addressed point 1 in #309, and I will probably do the same for 2, but the third one probably needs some deeper thinking. What do you think @ziotom78 @paganol?

ggalloni commented 4 months ago

An update on point 3. :

Indeed, the function computing the errors (compute_weighted_baseline_length) is expected to run over the detector dimension. However, its result is already summed over the weights of every detector, so in principle it would just create N_det repetitions of the same array.

So I would rather modify that function to return a single array that is then used for the preconditioner, so we can save some memory. This means that I would also change the way this quantity is saved in save_destriper_results. As for the other bugs, I will address this one within #309.
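
In other words, as a toy NumPy sketch of the memory argument (not the actual destriper code):

import numpy as np

n_det, n_baselines = 2, 100
errors = np.random.default_rng(0).random(n_baselines)  # stand-in for the per-baseline errors

# Current behaviour as described above: the same array is repeated N_det times,
# because the weights have already been summed over all detectors.
tiled = np.tile(errors, (n_det, 1))  # shape (N_det, N_baselines), redundant rows

# Proposed behaviour: keep a single (N_baselines,) array shared by all detectors.
shared = errors

assert all(np.array_equal(row, shared) for row in tiled)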

ziotom78 commented 4 months ago

Hi @ggalloni , thanks a lot for the tests and the fixes!

> I am running into an error when running save_destriper_results: within _save_baselines, when it tries to save the errors, cur_error[idx_det, :] raises an IndexError. I am not familiar with the algebra here, but from the code I think that the first dimension of cur_error should be the number of observations and not the number of detectors. Is that so? If so, substituting the index with idx (which runs over observations) is probably sufficient.

Is the error happening in this part of the code?

for cur_baseline, cur_error, cur_lengths in zip(
    results.baselines, results.baseline_errors, results.baseline_lengths
):
    baseline_hdu = fits.BinTableHDU.from_columns(
        [
            fits.Column(
                name=f"BSL{det_idx:05d}",
                array=cur_baseline[det_idx, :],
                format="1E",
                unit="K",
            )
            for det_idx in range(cur_baseline.shape[0])
        ]
    )

    # Etc.
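    # (The errors are presumably saved by an analogous loop, and that is where
    #  the cur_error[det_idx, :] slice fails.)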

Indeed, results.baseline_errors is a list with one member per observation, but cur_error is the i-th element of the list, so it should already be a NumPy array whose first axis runs over the detectors. Isn't it so?
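
To be explicit about the layout I am assuming, here is a toy example (two detectors, five baselines per observation; the variable names just mimic the fields of the results object):

import numpy as np

# Layout assumed by the saving code shown above:
results_baselines = [np.zeros((2, 5))]        # one (N_det, N_baselines) array per observation
results_baseline_errors = [np.zeros((2, 5))]  # same layout assumed for the errors

cur_error = results_baseline_errors[0]
print(cur_error[1, :].shape)  # (5,): per-detector slicing works with this layout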

ggalloni commented 4 months ago

Apparently results.baselines has the shape you describe, but results.baseline_errors does not. As far as I understood, the latter has shape Nobs x Nbaselines. So cur_error is one-dimensional, and it was breaking when sliced with [det_idx, :].

Looking at the code producing the errors, this actually makes sense, given that the noise is summed over detectors in compute_weighted_baseline_length, so it doesn't make much sense to store the same array N_det times. Am I misunderstanding something?
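
A minimal example of the failing slice, with the shapes I am describing (five baselines in one observation):

import numpy as np

# results.baseline_errors[i] as I see it: a single 1-D array per observation
cur_error = np.zeros(5)  # shape (N_baselines,)

try:
    cur_error[1, :]  # the slice used in _save_baselines for det_idx = 1
except IndexError as exc:
    print("IndexError:", exc)  # too many indices for a 1-dimensional array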

ziotom78 commented 4 months ago

Mmm, I would like to see your code. I ran the test test_destriper_io (in test/test_destriper.py) and I checked that the structure desired_results (line 923) has this content:

desired_results.baselines = [
    np.array([[-0.06836026, -0.23471543, -0.29916509, 0.3306597, 0.27158108]]),
]

desired_results.baseline_errors = [
    np.array([[2.15951183, 1.82564974, 1.79309935, 2.24839557, 2.21996142]]),
]

So it seems that sometimes the errors follow one memory layout (as in the test above) and sometimes another (as in your code).

Could you please share the code you are using where you see this behavior? It might be that some part of the destriper messes up the layout of the errors in a way that is not caught by the test.

ggalloni commented 4 months ago

How many detectors are there in that test? I guess just one? If so, the code works as-is, simply because det_idx in _save_baselines will only ever be 0. With two detectors, for example, the first iteration works, but it breaks as soon as it evaluates cur_error[1, :].

I am using a setup very similar to the test_full_destriper above the IO test. The only difference is that I am simulating one year, and I am using the IMO to define the detectors, the scanning strategy, etc. Also, I am not simulating any foreground, as I am running with noise-only maps (1/f included). If what I say is right, that test should fail if one attempts to use save_destriper_results, since there are 2 detectors and the errors do not propagate that dimension.

Still, I will build a minimal code to reproduce this.

ggalloni commented 4 months ago

Yes, indeed, using that test as a template and saving the results at the end breaks. Here is a code snippet to reproduce it. On master it breaks, while on the branch of #309 it works.

from pathlib import Path

import astropy.time
import litebird_sim as lbs
import numpy as np
from litebird_sim.mapmaking import save_destriper_results

nside = 32

sim = lbs.Simulation(
    base_path="test_errors_dimension",
    start_time=0,
    duration_s=astropy.time.TimeDelta(10, format="jd").to("s").value,
    random_seed=12345,
)

sim.set_instrument(
    lbs.InstrumentInfo(
        name="Dummy", boresight_rotangle_rad=np.deg2rad(50), hwp_rpm=46.0
    )
)

dets = [
    lbs.DetectorInfo(
        sampling_rate_hz=1.0,
        name="A",
        fwhm_arcmin=20.0,
        bandcenter_ghz=140.0,
        bandwidth_ghz=40.0,
        net_ukrts=50.0,
        fknee_mhz=50.0,
        quat=np.array([0.02568196, 0.00506653, 0.0, 0.99965732]),
    ),
    lbs.DetectorInfo(
        sampling_rate_hz=1.0,
        name="B",
        fwhm_arcmin=20.0,
        bandcenter_ghz=140.0,
        bandwidth_ghz=40.0,
        net_ukrts=50.0,
        fknee_mhz=50.0,
        quat=np.array([0.0145773, 0.02174247, -0.70686447, 0.70686447]),
    ),
]
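# NOTE: two detectors are needed to trigger the bug: with a single detector the
# save step works as-is, since det_idx in _save_baselines is only ever 0.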

sim.set_scanning_strategy(
    scanning_strategy=lbs.SpinningScanningStrategy(
        spin_sun_angle_rad=np.deg2rad(45.0),
        precession_rate_hz=1 / 10_020.0,
        spin_rate_hz=1 / 60.0,
    ),
)

sim.create_observations(
    detectors=dets,
    num_of_obs_per_detector=sim.mpi_comm.size,
)

assert len(sim.observations) == 1

sim.set_hwp(
    lbs.IdealHWP(
        sim.instrument.hwp_rpm * 2 * np.pi / 60,
    ),  # applies hwp rotation angle to the polarization angle
)
sim.compute_pointings()

lbs.add_noise_to_observations(
    obs=sim.observations,
    noise_type="one_over_f",
    scale=1,
    random=sim.random,
)

destriper_params_noise = lbs.DestriperParameters(
    output_coordinate_system=lbs.coordinates.CoordinateSystem.Galactic,
    samples_per_baseline=100,  # ν_samp = 1 Hz ⇒ the baseline is 100 s
    iter_max=10,
    threshold=1e-6,
)

destriper_result = lbs.make_destriped_map(
    nside=nside, obs=sim.observations, pointings=None, params=destriper_params_noise
)
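
# On master, the call to save_destriper_results below fails inside _save_baselines
# with an IndexError on cur_error[1, :]; on the branch of #309 it completes.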

save_destriper_results(
    output_folder=Path("test_errors_dimension"),
    results=destriper_result,
)

ziotom78 commented 4 months ago

That's terrific, thank you very much! I have added a new test using your script in #309, so we'll never miss this issue again.