FlatStruct is extremely inefficient

Cybis320 commented 1 month ago

I'm testing use_flat: true on a 1080p capture. Flux is unable to proceed because it runs out of memory (All 32 GB of RAM and 18 GB of swap!). Looking into it, the FlatStruct class is creating multiple Float64 copies of the image array from the flat.bmp. It takes a 2MB images and produces a 33MB object. This breaks the pickling pipeline. How widely used is flat? Is Float64 necessary anywhere? Is it worth rewriting the FlatStruct class?

dvida commented 1 month ago

Agreed, it should be rewritten.

The flat should note be operationally used with RMS cameras. The vignetting coefficient takes care of everything needed for these wide-field lenses, and making a truly good flat file is very tricky anyway (it's not the automatically generated flat.bmp).

Cybis320 commented 1 month ago

Right, I was confused by that. A true flat should be produced in a controlled environment with an extremely uniform surface. What is the auto generated flat then? Should use_flat even be in config?

dvida commented 1 month ago

The terminology has been adopted from another professional system, where making a flat this way really results in a good flat. The GMN "flats" are mostly used to understand the sky brightness and locate obstructions.

If used with very narrow lenses and under dark skies, good flats would be produced by the current method.

Cybis320 commented 1 month ago

That makes sense on long lenses. I probably can't get to refactor it real soon. Opinion on float64? Necessary?

dvida commented 1 month ago

Float is necessary for the computations because the levels are multiplied and divided. However, outside the places where calculations are done, it can be stored as an integer.

However, it is very important to check everywhere in the codebase what is happening to the flat and if there are any assumptions anywhere about its type.

Cybis320 commented 1 month ago

Float is necessary for the computations because the levels are multiplied and divided. However, outside the places where calculations are done, it can be stored as an integer.

However, it is very important to check everywhere in the codebase what is happening to the flat and if there are any assumptions anywhere about its type.

Thanks Denis, I'll keep looking into it. The FlatStruct object currently stores two float64 image arrays: flat_img and flat_img_raw. However, only flat_img is ever needed outside the class definition and it's only used by the applyFlat function and in skyfit but just to check the flat shape. So, right there it seems it's already twice the size it needs to be.

The flat and dark images are stored as bmp (only in 8-bit I think? but the code is meant to handle higher bit as needed), so we are working of quantized data to begin with. I understand the desire to keep the precision after operating on 8-bit data but 64-bit is probably overkill.

It looks like the only thing using FlatStruct.flat_img is applyFlat. So the FlatStruct object is applied to a low bit img and the applyFlat returns a low bit img.

At a minimum, only a single array should be stored. And that single array should probably only be 16 or 32 bit.

I updated the PR #393 to make all the calculations in float32 if dealing with 8-bit data, and float64 if dealing with 16-bit data. The FlatStruct object now only store one array (either 32 or 64 bit) and the array is the inverse of the flat with the dark applied for faster calculation in applyFlat.

CroatianMeteorNetwork / RMS

FlatStruct is extremely inefficient #392