scipp / essimaging

Imaging data reduction for the European Spallation Source
BSD 3-Clause "New" or "Revised" License
0 stars 1 forks source link

[Requirement] Tiff image stack merging #29

Open YooSunYoung opened 3 weeks ago

YooSunYoung commented 3 weeks ago

Executive summary

Merge tiff image stack of timepix

Context and background knowledge

Timepix daq (Not ESS filewriter, written by Rosco Vision) bins the data into wavelength frames as it saves the file. And it saves the binned image every N minutes (depending on the settings). As experiments typically last a few hours (Again, not at ESS, but with other sources) we need to merge the image stacks.

Inputs

For example, if the image is binned into 1_000 frames, and there are 1_024 x 1_024 pixels, each image stack (each file) has a shape of

{
   "frame": 1_000,
   "x": 1_024,
   "y": 1_024,
}

Methodology

It simply needs to add all frames. Ideally, we load the stack all at once, but we also had to use a machine that lacks memory. So we also needed a lazy-load option.

Here is the script we used(I was too slow to write this one so we didn't really use this particular script but it's close enough to the one we actually used) at the experiment at SENJU in June, 2024.

# SPDX-License-Identifier: BSD-3-Clause
# Copyright (c) 2024 Scipp contributors (https://github.com/scipp)
import argparse
from PIL import Image
import pathlib
import numpy as np
from tifffile import imread, imwrite
import scipp as sc
from tqdm import tqdm

def parse_arg() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Merge Timepix files")
    parser.add_argument(
        "--lazy-load", action="store_true", help="Lazy load the data", default=False
    )
    return parser.parse_args()

def merge_files_at_once(files: list[pathlib.Path]) -> np.ndarray:
    merged = np.zeros((1_000, 1_024, 1_024), dtype=np.uint8)
    for file in files:
        data = imread(file)
        merged += data

    return merged

def build_empty_merged(
    n_frame: int = 1_000, n_x: int = 1_024, n_y: int = 1_024
) -> sc.DataArray:
    return sc.DataArray(
        data=sc.zeros(dims=["frame", "x", "y"], shape=[n_frame, n_x, n_y]),
        coords={  # Maybe coords should be bin-edges ...?
            "frame": sc.arange(dim="frame", start=0, stop=n_frame),
            "x": sc.arange(dim="x", start=0, stop=n_x),
            "y": sc.arange(dim="y", start=0, stop=n_y),
        },
    )

def single_frame_to_data_array(img: Image.Image) -> sc.DataArray:
    return sc.DataArray(
        data=sc.array(dims=["x", "y"], values=np.array(img, dtype=int)),
        coords={
            "x": sc.arange(dim="x", start=0, stop=1_024),
            "y": sc.arange(dim="y", start=0, stop=1_024),
        },
    )

def merge_files_lazy(files: list[pathlib.Path]) -> np.ndarray:
    num_frame = 1_000
    merged = build_empty_merged()

    image_file_handles = {file: Image.open(file) for file in files}
    unexpected_eof = {}
    for i_frame in tqdm(range(num_frame), desc="Processing frames"):
        for file_path, img in image_file_handles.items():
            try:
                img.seek(i_frame)
                merged['frame', i_frame] += single_frame_to_data_array(img)
            except (  # noqa: PERF203
                TypeError,
                EOFError,
            ):
                if file_path not in unexpected_eof:
                    unexpected_eof[file_path] = i_frame
                continue

    if unexpected_eof:
        print("Unexpected EOF at frames \n", unexpected_eof)

    return merged.data.values

if __name__ == "__main__":
    args = parse_arg()

    files = [
        # If you want to add range of files
        pathlib.Path(f"Run_{str(i_run).rjust(6, '0')}/final/image")
        for i_run in range(25376, 25378)
    ]
    files += [
        # if you want to add more files
    ]

    if args.lazy_load:
        merged = merge_files_lazy(files)
    else:
        merged = merge_files_at_once(files)

    # Save the merged file
    imwrite("merged.tiff", merged)

Outputs

Single tiff file that has image stack. It should have the same shape as the single input image.

Which interfaces are required?

Integrated into reduction workflow, Python module / function, Python script, Jupyter notebook

Test cases

Any multi-frame tiff images.

Comments

No response

nvaytet commented 2 weeks ago

Should this be handled in a similar way to merging events from multiple runs in SANS? https://github.com/scipp/esssans/pull/135/files#diff-04944519a0e85d971f5052bfc0b4d1dc46d18210a2a58fee8d1ddfc2e2a10fe6R26