ultramango / gear360pano

Simple script to create equirectangular panorama by stitching images from Samsung Gear 360
MIT License

Speed improvements #24

Open neuhaus opened 7 years ago

neuhaus commented 7 years ago

Can we keep an issue open on the topic of speed improvements? Currently stitching takes around 12 seconds per image, which seems ... less than ideal. 🐌

ultramango commented 7 years ago

This might be tricky as the speed depends solely on the underlying tools (Hugin).

From what I can see the majority of time is taken by enblend and anything that speeds things up will decrease quality (still to be checked).

neuhaus commented 7 years ago

Could a ramdisk for the temporary files provide a substantial benefit?
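
One low-effort way to try the ramdisk idea on Linux is to put the temporary files under /dev/shm, which is a RAM-backed tmpfs mount on most distributions. The following is only a sketch: the helper name make_temp_dir and the gear360 prefix are made up for illustration, and how the script would pick up the directory is not shown.

```shell
# Hypothetical helper: create a scratch directory on a RAM-backed
# filesystem when one is available, otherwise fall back to the normal
# temporary directory.
make_temp_dir() {
  local base="${1:-/dev/shm}"   # /dev/shm is tmpfs on most Linux systems
  if [ -d "$base" ] && [ -w "$base" ]; then
    mktemp -d "$base/gear360.XXXXXX"
  else
    mktemp -d "${TMPDIR:-/tmp}/gear360.XXXXXX"
  fi
}

# Usage: create the directory, use it, clean up afterwards.
# tmpdir="$(make_temp_dir)"
# ... run the stitching with temporary files under "$tmpdir" ...
# rm -rf "$tmpdir"
```

Timing the same set of photos with and without the RAM-backed directory would show whether disk I/O matters at all here.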

ultramango commented 7 years ago

My gut feeling (1) tells me: not really.

I see some possible improvements using OpenMP/CUDA/whatever and/or some custom blending mask(s) or something similar, but then, in the latter case, you might lose quality (to be checked).

Playing with options requires time and understanding what you're doing (I lack knowledge in this area), plus it'd be good to repeat those tests for various photos.

Perhaps using different software (for blending) would be a solution.

BTW: try adding "--no-optimize" to enblend :) - now it's fast.

(1) As far as I know enblend already makes good use of memory, so we're talking about reading the photo files which are "nothing".
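
To measure what the "--no-optimize" tip above buys, the two modes can be timed against each other. This is a rough sketch: the wrapper name blend and the file names are placeholders, not part of the script. It only uses enblend's -o and --no-optimize options.

```shell
# Hypothetical wrapper: blend remapped images with or without enblend's
# seam-line optimization, so the two modes can be compared.
blend() {
  local mode="$1" out="$2"
  shift 2
  if [ "$mode" = "fast" ]; then
    enblend --no-optimize -o "$out" "$@"   # skip seam-line optimization
  else
    enblend -o "$out" "$@"                 # default: optimize the seam
  fi
}

# Usage (file names are placeholders):
# time blend full out_full.tif remapped0000.tif remapped0001.tif
# time blend fast out_fast.tif remapped0000.tif remapped0001.tif
```

Comparing the two output files side by side is the part that answers the quality question; the timing alone does not.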

neuhaus commented 7 years ago

Have you tried "nona -g" to perform the image remapping on the GPU?

ultramango commented 7 years ago

At some point yes, but it occasionally (always?) crashed (Arch Linux, NVIDIA). And it is not supported on all machines.
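
Since "nona -g" may crash or be unsupported, one cautious option is to try the GPU first and fall back to the CPU when it fails. A minimal sketch, assuming a hypothetical wrapper named remap; the output prefix and project file are whatever the caller passes in.

```shell
# Hypothetical wrapper: try GPU remapping first and retry on the CPU
# if nona exits with an error (a crash also yields a non-zero status).
remap() {
  local prefix="$1" pto="$2"
  shift 2
  if nona -g -o "$prefix" "$pto" "$@"; then
    return 0
  fi
  echo "nona -g failed, retrying on the CPU" >&2
  nona -o "$prefix" "$pto" "$@"
}

# Usage (names are placeholders):
# remap "$TEMPDIR/remapped" template.pto input.jpg
```

A failed GPU run may leave partial output files behind; the CPU run writes to the same prefix and should overwrite them, but that is worth verifying.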

neuhaus commented 7 years ago

What about multiblend? It's a "faster drop-in replacement for enblend" (that lacks some features). It doesn't seem to optimize seams. I guess we need that feature, right?

ultramango commented 7 years ago

At some point I will have to explore those possibilities :) .

Using any other software would be OK if the speed gain would be substantial, otherwise I prefer not to add another dependency.

ftoledo commented 7 years ago

I think the best solution would be to use hardware acceleration. Unfortunately this does not work well with Hugin/enblend on GNU/Linux =(

evertvorster commented 7 years ago

I have had some success with hardware acceleration in nona. Right now I am going to test multiblend. enblend's feature of picking the best seam makes for terribly jumpy video. I am also trying several different things with the scripts. They are only being tested on Arch Linux, so once I have a solution I am happy with, I will gladly share it back. How do I contact ultramango? My mail is evorster at the usual gmail servers.

neuhaus commented 7 years ago

@evertvorster It's best if you fork the repository, push a new branch with your changes there and then do a pull request in this repository.

evertvorster commented 7 years ago

Hi neuhaus. Currently I am behind a very restrictive firewall with no vanilla git access. I'll keep on testing, and once I am home I'll educate myself on how to follow your suggestion. -Evert-

ultramango commented 7 years ago

ultramango123 at gmail, but yes, a pull request would be better (no worries, I'm still learning the ins and outs of git, usually I end up with screwed up repos :) ).

A few points for changes:

As for enblend: I still plan to look for options that would speed up enblend and make it stitch the images in a static way (no fancy computing), just as @evertvorster mentioned in #30. Simply using multiblend might be an easier solution (for the moment).

kwahoo2 commented 6 years ago

Another idea is to use GNU Parallel on Linux (not sure if there is a Windows equivalent?).

Replacing the stitching code with:

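# Export the shell functions so the jobs started by parallel can call them
export -f print_debug
export -f run_command
# Stitch one frame per job; {} is replaced by each frame's file name
ls -1 "$FRAMESTEMPDIR"/*.jpg | parallel --bar run_command "$DIR/gear360pano.cmd" -m -o "$OUTTEMPDIR" {} "$PTOTMPL"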

I got 0.18 s per frame on a Ryzen 1700.

ultramango commented 6 years ago

This is impressive! I knew about GNU Parallel but somehow didn't expect (1) it to speed up the process so much.

Thanks for a ready-to-use recipe. I added support in the latest commit. I throttled "parallelism" to 80% as 100% killed my machine (with a slightly longer video). And I still think it might be "risky". There's a PARALLELTHROTTLE variable at the beginning of the script if you want to get back the extra 20%.

The usual boilerplate (for anyone else): your mileage may vary. Core count, SSD vs HD, and memory will all affect how much you gain.

(1) In the back of my head I assumed the whole process makes good use of the CPU already, but then it came to me that multiblend is single-threaded.
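
For reference, GNU Parallel accepts a percentage of the CPU count directly (for example "-j 80%"). The sketch below shows the same arithmetic done by hand. It is an illustration only: the helper name throttled_jobs is made up, and how the script actually uses PARALLELTHROTTLE is not shown in this thread.

```shell
# Hypothetical helper: turn a percentage of the CPU count into a job
# count for "parallel -j", never going below one job.
throttled_jobs() {
  local percent="$1"
  local cores="${2:-$(getconf _NPROCESSORS_ONLN)}"   # default: online CPUs
  local jobs=$(( cores * percent / 100 ))            # integer division rounds down
  if [ "$jobs" -lt 1 ]; then
    jobs=1
  fi
  echo "$jobs"
}

# Usage (assumes PARALLELTHROTTLE holds a plain number such as 80):
# ls -1 "$FRAMESTEMPDIR"/*.jpg | parallel -j "$(throttled_jobs "$PARALLELTHROTTLE")" --bar run_command "$DIR/gear360pano.cmd" -m -o "$OUTTEMPDIR" {} "$PTOTMPL"
```

With 16 CPUs and a throttle of 80, this yields 12 jobs.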

kwahoo2 commented 6 years ago

Parallel has some clever options for limiting resources: https://www.gnu.org/software/parallel/parallel_tutorial.html#Limiting-the-resources
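
As a rough illustration of the options that tutorial section covers, the sketch below collects --load, --memfree and --nice in one place. The helper name parallel_limits and the default values are assumptions chosen for the example, not what the script ended up using.

```shell
# Hypothetical helper: print GNU Parallel options that hold back new
# jobs when the machine is busy or short on memory.
parallel_limits() {
  # --load:    do not start new jobs while the load is above this level
  # --memfree: do not start new jobs unless this much memory is free
  # --nice:    run the jobs at this niceness
  echo "--load ${1:-90%} --memfree ${2:-1G} --nice ${3:-10}"
}

# Usage (word splitting of the printed options is intentional):
# ls -1 "$FRAMESTEMPDIR"/*.jpg | parallel --bar $(parallel_limits) run_command "$DIR/gear360pano.cmd" -m -o "$OUTTEMPDIR" {} "$PTOTMPL"
```

Unlike a fixed job count, these limits react to what the machine is doing at the moment, which suits long videos where memory use grows over time.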

ultramango commented 6 years ago

Thanks, this is definitely better than the initial throttling (committed).

ole-tange commented 6 years ago

"(not sure if there is a Windows equivalent?)."

GNU Parallel is tested now and then on Cygwin.

ultramango commented 6 years ago

For now I'd prefer to avoid Cygwin as much as possible (despite its benefits). My objection is that it is not user-friendly.

Edit: on second thought, maintaining the Windows part (in batch) is pretty painful (at the beginning it was a nice challenge) and perhaps moving to Cygwin would solve that problem.

davclark commented 6 years ago

parallel is also available via the Windows Subsystem for Linux (Bash on Windows), at least on Ubuntu. I imagine on other distros as well.