Running ALS with thin resources

fuherb commented 1 year ago

This is spun off from #169.

This evening I tested in the field with the old netbook (with swapfile) and version 0ebcfa6. I stacked a total of 50 subs of 5 seconds each on M42. All were stacked successfully and the netbook survived without hanging or crashing. So this is a success.

Nevertheless the speed is extremely slow: the cycle time to stack 1 frame is about 55s, so a 5s sub takes 55s to process. The image is 2028x1520 in size. This is impractical, though certainly not a fault of ALS. Currently I am running ASTAP on this netbook and its cycle time is about 15s. The quality of the stacked and stretched image coming out of ALS is much better, so the comparison is not entirely fair.

It appears that the bottleneck is in "stack", as that is where the queue builds up. Anyhow I would still like to explore possibilities for speeding things up. The question is: if I disable the level and RGB processes (uncheck the boxes), leaving only the stretch box checked, will this speed up the process? For EAA, color balance is not that much of a concern.

Another observation in the field is that I have no way to zoom in/out of the image. The doc says to use the mouse wheel, but on a netbook or laptop there is no mouse wheel. Could we add some keystrokes to enable zoom in/out (e.g. PgUp/PgDn or arrows)?

deufrai commented 1 year ago

Hi

nice to hear we output better images than ASTAP :)

Regarding Levels & color processes : even switched ON, if they are set to their default values, no processing is done, so no time uselessly spent there.

as for global speed-ups, this is our main goal and we are still searching for ways to improve.

Stacking in itself is done in 3 steps:

1. compare new image with stack, find needed similarities and decide what transformations are to be applied for aligning (translation + rotation + scale)
2. apply the aligning to new image with transformations devised in step 1
3. perform the mean or sum

Step 1 is quite fast and I think we already reached our best, for example by not searching for similarities on the whole image at first.
Step 2 is CPU hungry, and we try to gain time by processing the 3 channels of a color image in parallel on 3 CPU cores (at least when there is a gain to it; on Windows, for example, multiprocessing is much slower right now).
Step 3 is blazing fast.
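To illustrate why step 3 costs so little compared to step 2, here is a minimal sketch of a running mean and a running sum. It is not ALS code: the function names are hypothetical, and images are represented as flat Python lists of pixel values rather than the arrays a real implementation would use. Each new frame needs a single pass over the pixels, and no earlier frame has to be kept in memory.

```python
def add_to_mean_stack(stack, frame, frames_stacked):
    """Fold one already-aligned frame into a running mean.

    stack and frame are flat lists of pixel values of equal length
    (hypothetical layout chosen for this sketch).
    frames_stacked is the number of frames already in the stack.
    Returns the updated stack and the new frame count.
    """
    if len(stack) != len(frame):
        raise ValueError("frame and stack must have the same size")
    n = frames_stacked + 1
    # Incremental mean: new_mean = old_mean + (value - old_mean) / n
    new_stack = [s + (f - s) / n for s, f in zip(stack, frame)]
    return new_stack, n


def add_to_sum_stack(stack, frame):
    """Fold one already-aligned frame into a running sum."""
    if len(stack) != len(frame):
        raise ValueError("frame and stack must have the same size")
    return [s + f for s, f in zip(stack, frame)]


# Usage: three tiny two-pixel "frames"
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mean_stack, count = list(frames[0]), 1
for frame in frames[1:]:
    mean_stack, count = add_to_mean_stack(mean_stack, frame, count)
print(mean_stack, count)
```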

I'd like to see the logs for this session of yours, so I can tell where your netbook spends most of its time.

Regarding zooming, for machines without a mouse and with a trackpad that cannot emulate mousewheel, I can surely add keyboard shortcuts for zooming. I'll create an issue for that

fuherb commented 1 year ago

Please see the log file for this session. Hopefully there is some way to speed it up.

The stretching function in ASTAP is very simple but difficult to adjust on a slow netbook, so it is not easy to get a nice view. Hence I say the quality of the EAA output from ALS is better. In terms of alignment etc. I think you guys are on par (or I do not have the ability to judge further).

als_20230307.log

fuherb commented 1 year ago

Another question: is step 2 in stacking a pixel-by-pixel activity? Right now I am doing 2x2 binning with a theoretical sampling rate of 2 arcsec/pixel. Going to 3x3 binning should not hurt much, and the pixel count drops by about another 50%. Maybe I can gain 50% in speed?
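The pixel-count side of this question can be checked with a little arithmetic. The sketch below assumes that the 2028x1520 frames are the 2x2-binned output of a 4056x3040 sensor (inferred by doubling, not stated in this thread) and that binning uses integer division of each dimension. Whether processing time scales linearly with pixel count is a separate assumption that only a real test can confirm.

```python
def binned_size(width, height, factor):
    """Image size after factor x factor binning (integer division)."""
    return width // factor, height // factor


# Assumed full-sensor size: twice the 2028x1520 frames mentioned above.
FULL_W, FULL_H = 4056, 3040

w2, h2 = binned_size(FULL_W, FULL_H, 2)
w3, h3 = binned_size(FULL_W, FULL_H, 3)

pixels_2x2 = w2 * h2
pixels_3x3 = w3 * h3
ratio = pixels_3x3 / pixels_2x2

print(f"2x2: {w2}x{h2} = {pixels_2x2} pixels")
print(f"3x3: {w3}x{h3} = {pixels_3x3} pixels")
print(f"3x3 keeps {ratio:.0%} of the 2x2 pixel count")
```

Under these assumptions 3x3 binning keeps roughly 4/9 of the pixels of 2x2 binning, so the drop is a little more than half.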

deufrai commented 1 year ago

pretty sure that going to 3x3 will improve processing times. how much so ? only you can tell :)

fuherb commented 1 year ago

Hi, I dug into the log files a bit, comparing the case where I simulate by dumping files one by one into the folder with the real field situation. I note that in the simulation case the stack finished in about 15s, while in the field it took 45s. The file sizes in both cases are the same. That brings me to think that the major difference between the 2 cases is this: in the simulation I dump the files one by one, waiting for the last file to complete stacking (i.e. I see the number rise by 1) before dumping the next one. In the field the camera just dumps each file when it finishes capturing, so a queue piles up. Very likely this forces the computer to spend time handling and pre-processing these new files, which slows down the whole process.

I am just guessing. But is there a way to force the computer to focus on stacking while new files come in, and only return to handle them after finishing the stack in progress?
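How quickly such a queue piles up can be estimated with a rough model. The sketch below is a simplification, not a measurement: it assumes one sub arrives every 5s, a single worker, and a constant per-frame time (45s as seen in the field, 15s as seen in simulation). It ignores the slowdown caused by the queue itself, which is exactly the effect suspected above.

```python
def backlog_at_last_arrival(n_subs, sub_interval_s, process_time_s):
    """Frames not yet fully stacked when the last sub arrives.

    Simplified model: subs arrive every sub_interval_s seconds, one
    worker handles them in order, each takes process_time_s seconds.
    """
    # The worker starts when the first sub arrives and has run for
    # (n_subs - 1) intervals by the time the last sub arrives.
    working_time = (n_subs - 1) * sub_interval_s
    completed = min(n_subs, int(working_time // process_time_s))
    return n_subs - completed


for per_frame in (45, 15):
    waiting = backlog_at_last_arrival(50, 5, per_frame)
    print(f"{per_frame}s per frame: {waiting} of 50 subs still pending")
```

In this model any per-frame time longer than the sub interval leaves a growing queue, so the backlog shrinks but does not disappear at 15s per frame.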

fuherb commented 1 year ago

Sorry to jump in again. Using outdated hardware to run your program is my problem; it is not fair to ask you to make very specific changes just for this. It is better for you to concentrate on making the software better for the majority with appropriate hardware. So if there is not much you can do, please just ignore it. Eventually I should upgrade my computer when I get the money. Thanks a lot for this nice program and your support.

deufrai commented 1 year ago

What you said reminds me of this commit https://github.com/gehelem/als/commit/37e2d23 where we may have made a poor choice by giving top priority to post-processing (from stretch to color balance).

But in an EAA context, what we want and love is seeing the picture getting better and better. And whatever process you end up waiting for, you're still waiting :) But the experiment deserves a shot for sure.

I'm currently working on a log extraction tool so each of us can check where her own specific setup can gain time. I ran it on the last log file you provided : Have a look here => https://als-app.org/assets/support/issue_170/als_20230307_report/

With enough data, we may be able to make a better priority system that suits everyone or, in the end, make those priorities user-defined :)
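As a thought experiment, user-defined priorities could look like the sketch below. It is not how ALS schedules its work: the class, the stage names ("pre_process", "stack", "post_process") and the priority values are all hypothetical. It only shows that a small priority queue from the standard library is enough to let the user decide which stage is served first.

```python
import heapq
import itertools


class StageScheduler:
    """Serve pending tasks by user-defined stage priority (hypothetical)."""

    def __init__(self, priorities):
        # Lower number = served first.
        self._priorities = dict(priorities)
        self._heap = []
        # Tie-breaker that keeps arrival order within one stage.
        self._order = itertools.count()

    def submit(self, stage, payload):
        entry = (self._priorities[stage], next(self._order), stage, payload)
        heapq.heappush(self._heap, entry)

    def next_task(self):
        """Return (stage, payload) of the most urgent task, or None."""
        if not self._heap:
            return None
        _, _, stage, payload = heapq.heappop(self._heap)
        return stage, payload


# Usage: a setup that prefers stacking over everything else.
scheduler = StageScheduler({"stack": 0, "pre_process": 1, "post_process": 2})
scheduler.submit("post_process", "stack result 12")
scheduler.submit("pre_process", "frame_0014")
scheduler.submit("stack", "frame_0013")

served = []
task = scheduler.next_task()
while task is not None:
    served.append(task[0])
    task = scheduler.next_task()
print(served)
```

Swapping the numbers in the priority mapping is all it would take to put post-processing back on top for users who prefer to see every intermediate image.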

fuherb commented 1 year ago

Very good info from the log extraction tool. I cannot fully agree with your view and would like to suggest some alternatives.

I agree that in EAA we would like to see the evolution of the image in a close to real time fashion. There are however different schools of thought. I tend to use very short subs like 5s, partly due to the weakness of my tracking mounts. Every additional stacked sub certainly improves the S/N ratio of the image, but the incremental effect is very small, especially when the stack count is already large. In this context, it might not be important to post-process every stacked image. If we only post-process every 2nd or 3rd stacked image, we save a lot of resources. The evolution of the image can still be rather amazing.

There are of course other folks who use long subs even for EAA, maybe more than 30s. In such cases there can be a noticeable improvement at every stacking step.
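The diminishing return can be put into numbers under the usual assumption that noise is uncorrelated between subs, so that S/N grows with the square root of the number of stacked frames. The sketch below also shows the "post-process every Nth stack" rule as a one-line check. Both function names are hypothetical and neither is an existing ALS option.

```python
import math


def marginal_snr_gain(frames_stacked):
    """Relative S/N gain from adding one more frame.

    Assumes uncorrelated noise, i.e. S/N proportional to sqrt(N).
    """
    return math.sqrt((frames_stacked + 1) / frames_stacked) - 1.0


def should_post_process(stack_count, every_n):
    """True when this stack result is one of every `every_n` to process."""
    return stack_count % every_n == 0


for n in (1, 10, 50):
    print(f"frame {n + 1}: +{marginal_snr_gain(n):.1%} S/N")

# With every_n = 3, only stack results 3, 6, 9, ... are post-processed.
processed = [c for c in range(1, 10) if should_post_process(c, 3)]
print(processed)
```

Under this assumption the second frame adds about 41% to the S/N, while the 51st adds about 1%, which is the effect described above.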

As the developer you might opt to "freeze" the workflow so that you can serve the normal users, which should be the biggest group. Or you can provide the options to the user to plan their workflow to cater for non-standardized situations. These are interesting but difficult decisions.

deufrai commented 1 year ago

one of the difficulties of this project : user base is split in 2 large groups with different needs :

EAA guys who drop 1 frame every couple of seconds or even less
Astrophotographers who want to monitor their ongoing shooting session, with 1 frame dropped every minute or even far less often

Having a simple tool that suits both groups is an illusion. We are still in the process of really understanding everyone's needs, while trying to add features or changes that will allow everyone to get the best. The information and details you gave are valuable.

fuherb commented 1 year ago

A quick update on the status, using your latest build fa0f54c. 3x3 binning is not available for the Raspberry Pi HQ camera. Instead I opted for a 1330x990 mode with 10-bit color depth. The file size is 1/3 of the previous one. The ROI is reduced, so I am not using the whole sensor. Nevertheless the FOV captured is still reasonable.

With these settings the netbook manages about 11s per frame, compared to 55s before. The 11s was obtained in simulation by manual dumping. In the field, with WiFi file transfer etc., I guess I will end up around 15s. This is a great improvement, so I can now declare ALS my principal EAA software. Thanks for the good work.

deufrai commented 1 year ago

good to hear. keyboard shortcuts for zooming are in that build too. Did you get a chance to use them ?

fuherb commented 1 year ago

Oh, I only see zoom in/out in the manual. Which keystroke can I use?

deufrai commented 1 year ago

shortcuts are displayed in the new entries of the Image menu:

y = zoom in
h = zoom out
b = fit to view

deufrai commented 1 year ago

this 2679b2f3dc04fd2afa5c3ce18356cb5ce50dfa9d has just landed on release/0.7 and might interest you

it will be built in a few hours

fuherb commented 1 year ago

Conducted a quick test of 2679b2f, only via simulation by manually dumping files into the watch folder. Of course I selected the "visual" mode.

Using the reduced file size (1330x990), the overall feel is that the stacking runs smoother. Looking at the process time displayed, it still stays at about 10-11s per frame.

Using the original file size (2028x1520), it is still slow but the stack seems to come a little bit faster.

Will test again in the field when the sky is clear. Nevertheless I think this is a good move.

gehelem / als

Running ALS with thin resources #170