Tw1ddle / geometrize-haxe

:triangular_ruler: Geometrize is a Haxe port of primitive that geometrizes images into geometric primitives
https://www.geometrize.co.uk/

Performances improvement proposal #17

Open Cerdic opened 5 years ago

Cerdic commented 5 years ago

Hi, thanks for all your work on Geometrize!

I'm working on a PHP version of geometrize, based on the Haxe-generated PHP code (https://github.com/Cerdic/geometrize-php), with a focus on performance, which is a real issue in my case: I plan to compute lightweight SVG previews on the fly, as placeholders for images in web pages.

As feedback, I can point out some optimizations I found that are, IMHO, applicable to this version:

The last optimization is a bit tricky and maybe not so interesting for general use:

I added a rescale() method to shapes (https://github.com/Cerdic/geometrize-php/commit/995e206534166960c652809040ded0962233322a), which allows starting the rendering with a very small thumbnail (e.g. 64px) and increasing it progressively to 128px, 256px… depending on the final precision I want.

In my case this is a good optimisation, as the first steps generate large shapes that are slower to compute.
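The progressive-resolution idea can be sketched as follows. This is a minimal illustration in TypeScript, not the code from the linked commit: the `Triangle` type, `rescaleTriangle` and `progressiveSizes` are hypothetical names, and the doubling schedule is only assumed from the 64px, 128px, 256px example above.

```typescript
// Hypothetical shape type: a triangle stored as three pixel coordinates.
interface Triangle {
  x1: number; y1: number;
  x2: number; y2: number;
  x3: number; y3: number;
}

// Map a triangle found at one working resolution onto another one.
// factor = newSize / oldSize (e.g. 2 when going from 64px to 128px).
function rescaleTriangle(t: Triangle, factor: number): Triangle {
  const s = (v: number): number => Math.round(v * factor);
  return { x1: s(t.x1), y1: s(t.y1), x2: s(t.x2), y2: s(t.y2), x3: s(t.x3), y3: s(t.y3) };
}

// Working sizes to step through: start, 2 * start, ... capped at finalSize.
function progressiveSizes(startSize: number, finalSize: number): number[] {
  if (startSize <= 0) throw new Error("startSize must be positive");
  const sizes: number[] = [];
  for (let size = startSize; size < finalSize; size *= 2) {
    sizes.push(size);
  }
  sizes.push(finalSize);
  return sizes;
}
```

The shapes already accepted at the small size would be rescaled each time the working bitmap grows, so the large early shapes are only ever rasterized on the cheap thumbnail.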

cancerberoSgx commented 5 years ago

Hey! Glad to know there's another library port from the Haxe code. I'm the author of this JavaScript library, generated from the Haxe code like yours: https://www.npmjs.com/package/geometrizejs

I'm also concerned about performance and have been giving it some thought lately.

I would like to share our experience with our libraries and see what we decided or did differently. I'll describe mine for each aspect, and it would be awesome if you could comment on each about yours!

generating the library

Before this, Haxe was totally unknown to me. But, just in case, I asked the author whether it would be possible to generate JavaScript that could be consumed as a library, since he had a web demo but nothing usable as a library.

@Tw1ddle kindly pointed out the classes that should be exported, and it was just a matter of declaring that with annotations and running the compiler to generate a .js file ready to be used from Node.js or from the browser, as an API, using the same names as in the Haxe project.

how hard / automatic it is to re-generate new library versions

In the case of https://www.npmjs.com/package/geometrizejs I focus particularly on

Both things were accomplished basically just by providing types (TypeScript) for the library API, without implementing anything (the library actually does not have any implementation of its own). The API docs were copied and pasted from the Haxe code, which could be something that requires manual work, although there are some tools that could help here too. The library has some tutorials and working getting-started examples for Node and the browser (using JS libraries to load the bitmap). It also has tests that basically verify that the happy path and the exporters don't break.

extra code

I wonder if you needed to hack the generated code to make it work, or to use any special compiler flag? Or did you change something because of performance? If you did, I wonder how you make the changes: are they changes to the Haxe code or to the generated code? Or do you use Haxe macros to generate different output?

performance

I can think of a couple of performance improvements, and I'm glad to see that none of them is among your proposals, which seem to be related to the primitive algorithm.

It would be awesome to have a PR with yours implemented, or better, one PR for each, so we can measure the impact.

One thing I think can be improved is how Bitmap data is handled. For example:

performance measurement in different language targets

What worries me, now that there is a PHP sister, is how to verify that changes that are good for JavaScript don't hurt the PHP or C++ targets... Since I'm still new to Haxe, I don't want to start hacking/micro-optimizing the code without a tool to measure the impact on other targets.

So I wrote this simple application, https://github.com/Tw1ddle/geometrize-haxe-unit-tests/tree/performance-lime/tests-performance, which is a working Haxe program that uses geometrize as a library: it loads/parses an image into a bitmap, iterates and exports. I didn't have time to continue, but my objectives are:

target language custom code

If it's really important for tuning performance, we will probably need to change the target-language code that the Haxe compiler generates. For example, in JavaScript, use an ArrayBuffer or typed array for bitmaps instead of an array, or iterate differently in JS than in other targets, or use a native library that performs some algorithm optimally in JS. Or do some processing on the GPU! Etc.
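As a rough illustration of the typed-array idea, here is a minimal TypeScript sketch of a bitmap backed by a `Uint32Array`, with one packed RGBA value per pixel. The `TypedBitmap` class and `packRgba` helper are hypothetical names and are not part of geometrize-haxe or geometrizejs; the byte order of the packed value is likewise an assumption.

```typescript
// Pack four 0-255 channels into one unsigned 32-bit value (assumed RGBA order).
function packRgba(r: number, g: number, b: number, a: number): number {
  return ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
}

// Hypothetical bitmap backed by a typed array instead of a plain JS array.
class TypedBitmap {
  readonly width: number;
  readonly height: number;
  readonly data: Uint32Array;

  constructor(width: number, height: number, data?: Uint32Array) {
    this.width = width;
    this.height = height;
    // A fresh Uint32Array is zero-filled, i.e. transparent black.
    this.data = data ?? new Uint32Array(width * height);
  }

  getPixel(x: number, y: number): number {
    return this.data[y * this.width + x];
  }

  setPixel(x: number, y: number, color: number): void {
    this.data[y * this.width + x] = color;
  }

  // slice() copies the whole buffer in one native call.
  clone(): TypedBitmap {
    return new TypedBitmap(this.width, this.height, this.data.slice());
  }
}
```

A typed array keeps the pixels in one contiguous block of fixed-size integers, which is the property the comment above is after; whether it is faster in practice would still need to be measured per target.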

I'm just curious about how, or whether, Haxe solves this problem, and about any recommendations. What would the code look like with macros/injections for several languages in arbitrary places?

Opinion about the API

The library is simple, focuses on what it should, and has no dependencies. But it can be hard to use, since users are responsible for parsing/loading images and implementing the step iterations. IMO this is OK, and these features should not be supported.

I do think there could be other libraries on top of it that solve these problems, but independently. In my case I made a command-line tool that exposes the API, supports some image formats, runs the number of iterations given by the user and exports to a file in the format the user wants.

Now I'm working on migrating these to an -extra project that provides an API for this and adds more APIs besides those mentioned, such as being able to control the iteration loop.

One problem I see is that it lacks getting-started docs, minimal working examples and guidelines. I would also like to work on this area once I feel more comfortable with Haxe.

Again, I would love to hear your experience / thoughts on these. Sorry for the long comment. Thanks!

Tw1ddle commented 5 years ago

Hi @cerdic - low-weight preview SVGs should be a good use case for this! In case it's a useful reference, I believe this library can do something similar for previews: https://github.com/axe312ger/sqip

Jose M Perez also wrote a good article about placeholder SVGs a few years back if you haven't seen that: https://www.freecodecamp.org/news/using-svg-as-placeholders-more-image-loading-techniques-bed1b810ab2c/

As @cancerberoSgx pointed out, it'd be useful to have a benchmark that compares performance across different targets.
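One possible shape for such a benchmark is a small per-phase timer whose output has the same columns on every target, so that results from JS, PHP and C++ builds can be compared side by side. The sketch below is in TypeScript and only measures the JS side; `timePhase`, `formatReport` and the phase names are hypothetical, and an equivalent harness would have to exist for each other target.

```typescript
// Timing result for one phase (e.g. "load", "iterate", "export").
interface PhaseTiming {
  phase: string;
  runs: number;
  meanMs: number;
}

// Run fn `runs` times and report the mean wall-clock time per run.
function timePhase(phase: string, runs: number, fn: () => void): PhaseTiming {
  if (runs <= 0) throw new Error("runs must be positive");
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  const totalMs = Date.now() - start;
  return { phase, runs, meanMs: totalMs / runs };
}

// One tab-separated line per phase: target, phase, runs, mean milliseconds.
function formatReport(target: string, timings: PhaseTiming[]): string {
  return timings
    .map((t) => `${target}\t${t.phase}\t${t.runs}\t${t.meanMs.toFixed(2)}`)
    .join("\n");
}
```

Timing the phases separately makes it easier to see whether a change helps the shape search itself or only the loading and exporting around it.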

Some people took interest in using this library with the php target a few years ago, but they didn't like the generated code and performance. I gather it's harder to use platform-specific defines/#if/#else blocks to get around that, compared to other targets like js. So I think your approach of editing the generated code directly is probably the best way to optimize.

I'll take a look at the improvements you made, appreciate the feedback. Let us know how you apply the project; it would be nice to add a link in the readme if you create a site or app using such previews :smile:

Cerdic commented 5 years ago

@cancerberoSgx just like you, I had never heard of Haxe before digging into this project, and I'm just discovering the entire ecosystem.

In fact, at the beginning my idea was also to work on the Haxe version and use the code generation to get a working PHP version.

But, even if things are not fully decided yet, as I have only made a small start on the PHP refactoring, I'm thinking more and more of really forking into a full, native PHP lib.

This is really annoying for long-term maintenance because, of course, it means that future improvements in geometrize-haxe will have to be ported manually. But I can't see how I would be able to keep the link with the source geometrize-haxe, as well as

Performance is really an issue for my use case, and I think this will also need fine tuning, which here is totally dependent on the language.

But sharing some unit tests would be great, I can agree with that! :)

Cerdic commented 5 years ago

@Tw1ddle thank you for the feedback. I know the sqip tool and the article by Jose M Perez, which is my source of inspiration. SQIP is not a solution for me: for security reasons I cannot rely on an external binary tool, and I need a full PHP implementation.

So I was really happy to be able to use the geometrize-haxe PHP generated code as a first version that allows me to test the concept and check that it is possible to reach a fast enough generation speed to get something usable.

At the moment, generating a 75-triangle SVG from a real-life JPEG image takes around 15 to 18 s. This is still a lot, but manageable:

And I have good hope of improving this computing time with some more optimisations.

At the moment this is a work in progress, on a development website not yet visible in real life, but I will of course point you to it as soon as it is visible somewhere.

cancerberoSgx commented 5 years ago

> At the moment a 75 triangles SVG generation from a real life JPEG image is consuming something around 15 to 18s.

This sounds like too much. How big is the input image? What values are you using for the runner options? Could you share the step-iteration code? Maybe you are doing something unnecessary there. In general you just need to store each step result and only build the SVG after the iteration finishes.

I was looking at your library, and it seems the generated code did take care to use some native PHP structures (unlike in JS, where the translation from the Haxe code is almost straightforward), so my guess was that it should be faster than the JS version. But those numbers are much slower.
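The "store each step result, build the SVG once" pattern might look like the sketch below. It is written in TypeScript with hypothetical names: `ShapeResult` here is just an SVG fragment, and `step` stands in for whatever call the library exposes to produce the shapes of one iteration.

```typescript
// Hypothetical result of one step: the SVG fragment for one accepted shape.
interface ShapeResult {
  svg: string;
}

// Run all iterations first, collecting results, and serialize only once at the end.
function buildSvgAfterIterations(
  step: () => ShapeResult[],
  iterations: number,
  width: number,
  height: number
): string {
  const fragments: string[] = [];
  for (let i = 0; i < iterations; i++) {
    // No string building or file output inside the loop, only collection.
    for (const result of step()) {
      fragments.push(result.svg);
    }
  }
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    fragments.join("") +
    "</svg>"
  );
}
```

Rebuilding or re-exporting the whole document on every step would repeat work proportional to the number of shapes found so far, which is the kind of unnecessary cost the comment above is asking about.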

Cerdic commented 5 years ago

@cancerberoSgx some more details

About measurement:

About settings:

cancerberoSgx commented 5 years ago

I'd try candidateShapesPerStep == 50 and experiment to find an acceptable trade-off. Perhaps in PHP you could do the resize and all the extra things inside the loop in a different thread? (Something that is not possible or easy in JS.) I wonder how big the input bitmap is. It'd be interesting to compare these numbers with a C++ version too... My two cents.

Cerdic commented 5 years ago

FYI, I was able to cut the computing time by about half by replacing all Haxe structures like _hx_array() with native PHP arrays and by using native PHP cloning instead of the Haxe hclone() methods.

This now makes me sure that the PHP version is definitely a fork :(

With some other slight improvements, refactoring and an interesting algorithm improvement on random shapes (#22), I can now shape 100 triangles in 1 to 1.5 s on the production server, which is much better.

cancerberoSgx commented 5 years ago

I'm playing with my own project, https://lib.haxe.org/p/bitmap/, to learn Haxe, taking some concepts from here. I've noticed that it is faster to clone bitmaps using bytes.sub() / bytes.blit() (which should use the native array copy() method). I'm also making Sure.sure() statements optional for critical operations like get/set pixel, and for this project I would do the same in Util. I'm taking notes to contribute to this project later.
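Both ideas, bulk copies and optional sanity checks on hot paths, can be sketched in TypeScript as below. This is an analogy rather than the Haxe code: `cloneBytes`, `blitBytes`, `getByte` and the `checks` switch are hypothetical names, with typed-array `slice()` and `set()` playing the role of the bulk byte operations.

```typescript
// Bulk clone: one native copy instead of a per-element loop.
function cloneBytes(src: Uint8Array): Uint8Array {
  return src.slice();
}

// Copy `len` bytes from src[srcPos..] into dst[dstPos..] with one native call.
function blitBytes(
  dst: Uint8Array,
  dstPos: number,
  src: Uint8Array,
  srcPos: number,
  len: number
): void {
  dst.set(src.subarray(srcPos, srcPos + len), dstPos);
}

// Hypothetical switch for sanity checks on critical operations.
const checks = { enabled: true };

// Read one byte; the bounds check only runs while checks are enabled.
// With checks disabled, an out-of-range index is not detected here.
function getByte(data: Uint8Array, index: number): number {
  if (checks.enabled && (index < 0 || index >= data.length)) {
    throw new RangeError(`index ${index} out of bounds`);
  }
  return data[index];
}
```

A runtime flag like this still costs one branch per call; a compile-time switch would remove the check from release builds entirely, at the price of losing the diagnostics there.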

francescoagati commented 5 years ago

Inlining code should help performance, as should the forced inline in Haxe 4.

francescoagati commented 5 years ago

Also, convert typedefs into classes and use @:structInit so that PHP array maps become real classes.