GreycLab / CImg

The CImg Library is a small and open-source C++ toolkit for image processing
http://cimg.eu
Other
1.51k stars 287 forks source link

compilation could be way improved by splitting CImg in several files #273

Open matthieu-nesme opened 4 years ago

matthieu-nesme commented 4 years ago

I well understood you like the single header file distribution, but it has some severe drawbacks.

The file is huge (more than 3Mo for a header file is ridiculous). Each time it is included, it creates huge compilation units, struggling the compilation.

Some suggestions for improvements:

Let me try to give you a simple example that should be easier to understand :

the lib

usage

pre-instantiation (needs to be linked with cimg lib)

working with CImg<char> (needs to be linked with the pre-instantiation)

I hope this is useful and that I do not wrote mistakes.

dtschump commented 4 years ago

The file is huge (more than 3Mo for a header file is ridiculous).

What is a non-ridiculous size for a header file then ? Is there some "official" threshold we shouldn't exceed ? ("640K ought to be enough for anybody.", as it seems Bill Gates said ?).

Each time it is included, it creates huge compilation units, struggling the compilation.

Indeed, the compilation is a bit longer, but not to a point that becomes unbearable (unless you use a bad compiler, or an old version of a compiler). I personally wrote hundreds of small or complex codes based on CImg, these last 20 years, without feeling it's really a problem (and I'm not patient by nature). That's my personal impression BTW (this overhead is actually noticed for each use of a template-based library, not only CImg).

a common way to optimize the template compilation is to pre-compile (pre-instantiation) for certain fixed types extern template in a header, and real template instantiation in a cpp file, but still you have to include the whole huge file. A way to alleviate that is to split the function headers and their implementations in several files, even for templates. So the cpp file that needs to instantiate the template classes with a new type needs to include both (it does not change anything for it), but the files who simply needs to use these pre-instantiated type simply have to include the super light header file containing no code (not relevant here).

Actually, this thing has already been tested by a colleague of mine, with mixed results. It's in fact almost impossible to pre-compile the template library classes only for a few useful template types T (as unsigned char, int, float, …) mainly because a lot of the methods of such classes actually take one or several arguments with types CImg<t> or CImgList<t> themselves, where t is usually different from T (genericity principle that we want to keep, of course). At the end, either the number of methods you decide to pre-compile is not that big (those without template arguments, and basically the lightest ones that doesn't really need to be pre-compiled), or if you decide to compile all the combinations of methods needed with common types, even with a few of them, you end up with a huge pre-compiled library file (containing not even 1% of functions you'll actually use in your code). This would be what I call a bloatware :)

every static variables that are independent from template instantiation (mainly the enormous font variables) could be included and compiled once for all in separated cpp files. So each compilation units does not have to handle them at each include.

The whole size of these static variable is ridiculously low compared to what the other things the compiler has to do when compiling a CImg-based program. Putting these few Kb in a separate file that can be pre-compiled won't probably change anything in the compilation time. You may try and tell us anyway. Most of the compilation time is actually spent by instanciating the template methods (and optimizing the resulting code, enabling optimization flags do really make a big difference both in terms of compilation speed and memory usage), and as I explained above, this can be hardly optimized from the compiler point of view, without losing genericity or ending up with a huge library file. Both options are not valid from my point of view.

Anyway, if you really feel there is something different to experiment, I really encourage you, and I'd be interested by what you get.

matthieu-nesme commented 4 years ago

a lot of the methods of such classes actually take one or several arguments with types CImg or CImgList themselves, where t is usually different from T

I agree the worst case scenario is as bad.

It's in fact almost impossible to pre-compile the template library classes only for a few useful template types T

Not necessarily in the lib itself, but in the using application it is. So hopefully the common (best?) case scenario could be well improved.

if you decide to compile all the combinations of methods needed with common types, even with a few of them, you end up with a huge pre-compiled library file

Indeed, that is another problem to consider.

Putting these few Kb in a separate file that can be pre-compiled won't probably change anything in the compilation time.

I find it a strange way to think. If everyone thinks the same, what is the consequence? It does not require huge effort, it cannot be worse but it is necessarily better (how better will depend on you project). We'll try to take time to test it.

dtschump commented 4 years ago

Not necessarily in the lib itself, but in the using application it is.

In that case, the application can of course create a pre-compiled module where the used methods are compiled in a separate .o file. If you include CImg.h in another source file that does not use these template methods, they won't actually be recompiled, so I don't see any issue there.

It does not require huge effort, it cannot be worse but it is necessarily better.

I don't see why it would be better. Linker usually knows how to keep a single version of the static variables in the final code. Having several files to install for CImg is a worst case, IMHO.

matthieu-nesme commented 4 years ago

I am not concerned about the binary code efficiency but about the compilation+link stage. The less is included, the less the compiler has to manage.

If the idea is to let the compiler manage everything itself, let's push the idea to its absurdity. Let's forget about forward declaration and let's include everything everywhere, it will work, it will give the same binary code but compilation will be horribly not efficient.

CImg is one step in that direction, maybe not significant enough but for sure not as efficient as it could be.

Having several files to install for CImg is a worst case

Really?! If you really want, you can always add a top file that includes everything to obtain the same behavior (that would be a terrible idea), but splitting could offer improvements for the non-uncommon cases for a cost that seems pretty low to me.

Anyway, I am just one more voice to suggest you to reconsider your position. If we find time to make proper testing, we'll let you know. BTW thanks for the awesome job that is well used ;)

dtschump commented 4 years ago

If the idea is to let the compiler manage everything itself, let's push the idea to its absurdity. Let's forget about forward declaration and let's include everything everywhere, it will work, it will give the same binary code but compilation will be horribly not efficient.

You're extrapolating my words to lend me absurd intentions, which isn't entirely honest. Modifying a code structure, that has proved to be simple and convenient to use, and making it more complex, just to help the compiler doing his job faster is IMHO a very bad idea. Maybe improving the compiler to better manage this case would be something more desirable (that has actually happen with g++, from a few bug reports made by us and others using CImg).

Really?! If you really want, you can always add a top file that includes everything to obtain the same behavior (that would be a terrible idea), but splitting could offer improvements for the non-uncommon cases for a cost that seems pretty low to me.

My guess is that you will anyway forced to do this at the end, because there are almost no cases where splitting the CImg file into several part can be more efficient (except maybe for toy cases).

If we find time to make proper testing, we'll let you know. BTW thanks for the awesome job that is well used ;)

As I said, I'm interested by your outcome. I obviously won't try it by myself. It's something I've thought about for 20 years without finding a better alternative, so let me just encourage you and wish you good luck. You seem to be a better developer than I am.

jboulanger commented 4 years ago

Hi,

Here is my take on the issue. This is a recurring question from people new to the library and had been many time debated. Indeed, the design can surprise people and just editing the file can become difficult with slow editors. Luckily we're not supposed to edit it too much when using it.

The compilation time of CImg can slow down development when you're trying out things as you can be used to when using interpreted languages (matlab, python, etc).

That said I totally agree with David's suggestion and use it with my own projects. Split your code in bits that read, visualize or do processing and different/non overlapping parts of the library will be used. Use the define_cimg_use_xx to discard display if the code doesn't need it. You'll notice compiling img.load() or img.display() takes a big chunk of the compilation time. Then these bits can be compiled in parallel (-j 8 anybody?). It is however more difficult and probably useless to predict which function would be worth being compiled in advance in a general case to create a super duper binary. Using optimization flag (-O)properly also helps developing faster.

Now think of the advantages, a single header file makes it really portable and easy to include in projects. You basically don't care about dependency or lib versioning as it is too easy to just have it in the source tree. The design of the lib makes it almost impossible to disentangle the different part of the code by type or function.