MafiaHub / OpenMF-Archived

Abandoned C++ version. Contains useful format utils and parsers.
GNU General Public License v3.0
128 stars 11 forks source link

Rethink/rework single compilation unit #105

Closed drummyfish closed 6 years ago

drummyfish commented 6 years ago

As for this.

volca02 commented 6 years ago

Author of the comment here. I think it is a good time to do this now before the project grows larger. If you like the code structure as is, a compromise would be to guard the implementations in a ifdef section and prepare an artificial .cc file that includes the headers for a component with the appropriate define.

I think a better way would be to use CMake's ability to do LTO, as discussed here: https://stackoverflow.com/a/47370726

drummyfish commented 6 years ago

This is going to change the structure of the whole codebase so we should now all focus on this issue before continuing further work. Give me a sec to study what you linked.

drummyfish commented 6 years ago

Sooo, I think we don't need any half-way solutions, the typical one-header-one-cpp structure would be okay. I.e. I'd say just keep the headers we have and move the implementations into .cpp files.

About the LTO - it's nothing that would affect the repo structure, right? Don't know if I understand it correctly, but it's just an optimization technique you can turn on, off, right? Besides, does VS support it?

volca02 commented 6 years ago

Yes, LTO does not influence the source code structure, it just postpones most of the compilation steps into a single compilation unit, started by linker. It really is just meant to improve optimization - and thus can be implemented any time.

From what I read on wikipedia, LTO should be supported by MS C compiler. Even if it weren't, I don't think it would be that important optimization step. There is a lot of headroom to implement the engine, the original ran on potatoes compared to modern CPUs and GPUs.

drummyfish commented 6 years ago

Great great, let's see if others have anything to add, then we can decide who's gonna tackle this. We'll need to test all platforms plus update development docs.

inlife commented 6 years ago

i think its better to stick with single compilation unit one-header-one-cpp structure will bring even more problems and complexity to the project

what is the reason you want to do this ? cuz i've read the reddit part, and i didnt fully understand the reasons

volca02 commented 6 years ago

The main reason is quite simple - the current structure will lead to vastly increasing incremental compilation times as the project grows. Every single file edit will cascade into the rebuild of the whole project.

Even now, the incremental build times are already quite long, and the project is in it's early stages. There really is no difference between incremental and full builds now.

drummyfish commented 6 years ago

Well, with SCU if anything changes, everything has to be recompiled again, plus it suffers from having multiple executables - for each executable (i.e. every format util, viewer, tests, game) everything is being recompiled again and that takes a lot of time now. That significantly slows down the development.

If we switch to header-cpp structure, only the modules that are changed get recompiled, which is much faster. Other game projects do it this way, so I guess it really is better in the long run.

inlife commented 6 years ago

oh i see, so the main concern is actual build time

so I guess it really is better in the long run.

nope

the real root of the problem of those building times lays in the code we are using/writing i would rather recommend revisiting the dependency tree

if done right, SCU will result in max time of 1-2 sec for the whole codebase furthermore it can be combined with mutli-object SCU with even with LTO support, making a hybrid-like structure, however keeping the ideas of SCU for most of the stuff

drummyfish commented 6 years ago

if done right

Okay, but why is it taking so much time now? Are we not doing it right?

Also, from the reddit comment:

It leads to problems with multiple definitions, among other things.

I don't really get this - can anyone explain?

volca02 commented 6 years ago

Well you can still do Unity builds with split .h/.cpp codebase - by including individual cpp files into unified file(s) and compiling that instead of the individual ones. I find this technique flimsy though - it can lead to differences between incremental and unity builds - probably a bit hard to maintain.

Bruce Dawson has a writeup on this https://randomascii.wordpress.com/2014/03/22/make-vc-compiles-fast-through-parallel-compilation/

volca02 commented 6 years ago

@drummyfish What I mean by that is if you'd introduce an intermedate step, you have to control what symbols the intermediate files contain - header contained routines will get multiple exposures (will be in multiple object files), and linker will get confused.

drummyfish commented 6 years ago

@volca02 I see, thanks.

TBH I don't have experience with this. We might write down advanatages/disadvantages of each approach, see how each one applies to our project and then if we don't aggree maybe vote? Dunno.

zpl-zak commented 6 years ago

I believe we can practise approach like this:

our components/ folder currently represents many different components that can be separated onto its their own respective libraries. Shall we say we have our renderer, while we keep all renderer-related files header-only, we also introduce a file called package.cpp, that will include all these files and compile to a static library. This can be performed per each component we have.

Now you might ask, how is compiling each individual component as a static library worth it, when we still include those header-only libraries that contain the definition itself (the supposedly .cpp code), that's where we can get smart and make use of macro directive, which, when defined, will include the .cpp respective part when compiling the component itself, while omiting it when the header gets included in our target executable.

Example:

comp.hpp


class TestFoo {
   void bar();
};

#ifdef PACKAGE_IMPLEMENTATION

void TestFoo::bar() {

}

#endif

where package.cpp does this:

#define PACKAGE_IMPLEMENTATION
#include "comp.hpp"
#include ..
volca02 commented 6 years ago

@drummyfish It's really up to project lead, and the current state is sort of okay, but it does not get any benefit of incremental build times.

Where possible, I tend to use PImpl to hide implementation in the .cc files and lessen the include complexity. You can also forward declare a lot of stuff if you pass by reference or pointer.

@zaklaus Yes, I believe that's what I described as the compromise approach.

drummyfish commented 6 years ago

@zaklaus yeah @volca02 suggested that - what's the advantage compared to header-cpp style here? That we only have half the files (headers only, plus the additional packages ofc)?

inlife commented 6 years ago

Okay, but why is it taking so much time now? Are we not doing it right?

mainly because of tons of heavily templated code coming from osg for example, math lib implementation etc huge dep trees of include files, slowing down the compilation by forcing compiler to open, read, parse each and every file from every dependency, especially that relates to, again, osg

it can be fixed tho, by reducing templatization, eleminiating deps, abstractization, and maybe even defining minimal-style header files used for particular library, to export and later link only things that are actually needed

drummyfish commented 6 years ago

@inlife We have our own templates too, e.g. the cache loader. Also I wouldn't want to be forced to not use templates later on by the build system, we shouldn't get limited by that. The solutions you describe seem like more hassle than simply adding one extra file for each module to me.

drummyfish commented 6 years ago

I'm kinda interested in what @zaklaus has written, but I can't see why go just half the way? Having .o files does the same job as building static libraries, doesn't it?

inlife commented 6 years ago

@zaklaus's proposition is exactly the thing that needs to be done

drummyfish commented 6 years ago

I'm in, as soon as anyone enlightens me about what I have asked.

Also, anyone feels like implementing? :)

volca02 commented 6 years ago

@inlife I reckon the deep and complex dependencies are the good reason to hide them in some low frequency (change wise) .cc file(s) that just abstract things out and simplifies the compilation for other units.

inlife commented 6 years ago

the cache loader

this template code is quite minimal, it shouldnt take as much compile time as other encounters

forced to not use templates later on

its not forced behavior, its more like recommendation. it's just the fact that templates quite heavily increase build time

but I can't see why go just half the way?

the idea is to have easier code to maintain. simplicity that you have now, will pretty much still be there with @zaklaus suggested approach. as he suggested the hybrid approach, which means NOT ALL components will be compiled and linked. i'd even say most of them will be still used as header only files. its just about the ones that are slowing the whole thing down, they should be optimized.

Having .o files does the same job as building static libraries, doesn't it?

don't forget that linker also quite slow thing, i would say even the slowest in the build process. im sure LTO can optimize this, but we still need to keep amount of those badboys as low as possible.

Also, anyone feels like implementing? :)

i believe this whole stuff can be made over time, naturally. it doesn't require any rapid changes from anyone. we (me or @zaklaus) will try to show an example of such approach soon.

so pretty much i don't consider this issue as blocker

drummyfish commented 6 years ago

Thank you @inlife - so you mean only apply this to some modules? I'd like to keep consistency at least, i.e. rewrite the current code so that every header has the guard (so that we can treat them all the same), and group them all to a few "cpp modules". Shouldn't be difficult. @zaklaus what do you think?

zpl-zak commented 6 years ago

Yeah, why not. That could work fine.

drummyfish commented 6 years ago

I mean, apply to all modules or just some of them? Mixing the old style with the new one would be confusing I think.

inlife commented 6 years ago

yes, to package.cpp thing, only to some of them. guards can be added to all of them tho.

drummyfish commented 6 years ago

Actually, adding the guards to the headers that do not have the package file would require to unlock the implementation when including them, which would also unlock the implementations in other included files, unless we use different define for each header (bad), and even then you'd have to know which headers have packages and which don't when including them (so that you know whether to unlock them or not). That's bad.

So the guards should only be in the headers that do have the packages.

EDIT: Am I right?

Also, which files should we package then?

zpl-zak commented 6 years ago

Yes, only files that are in let's say 4ds folder would basically have packages. Files that are considered as an util can stay read-only actually. There is possibility that format parsers/loaders could be also considered as a single package on its own, but that depends on whether that's intuitive to manage or not. These packages can then be used in our apps/ easily, so changes made in model_viewer won't trigger the whole compilation process anymore.

drummyfish commented 6 years ago

Don't know if this whole package thing won't just add more complexity though, we'll have to basically manage additional thing: packages, i.e. think about what to group together etc. Plus new devs might find this confusing. I'd personally find it simpler to have a straightforward approach of the header-cpp style, which would basically only add more files, but no additional mental burden.

Anyway, we may try this package approach to test it at least, it could still be changed anyway. I'd try to implement it, but I won't have time today, so if you want, you can give it a try and make a PR and I'll take a look at it tomorrow (don't forget to update the dev docs etc.). If there's no PR tomorrow, I'll make one.

volca02 commented 6 years ago

@drummyfish I agree with you here. It is a straightforward and a usual approach to do header and implementation parts separately. You could after all turn the logic around and do .hpp/.cc pairs, and include the implementation .cc in the header file if a certain macro is defined - allowing to optionally do a single unit compile.

drummyfish commented 6 years ago

Interesting point, I'll go over all this tomorrow again, we can even make two different PRs and compare some concrete code for readability, speed etc.

RoadTrain commented 6 years ago

Hi @drummyfish, if it's ready, won't you make a PR for .hpp/.cpp split?

drummyfish commented 6 years ago

Dunno, I thought we abandoned it - what's the current plan then?

volca02 commented 6 years ago

It is merged (the @drummyfish rework into .hpp and .cpp files) in the optimization branch, and from my tests it compiles faster in both full (~1/2 the time of the master build) and incremental builds (understandable), and even with LTO it full-rebuilds faster by 7 seconds.

RoadTrain commented 6 years ago

I think we shouldn't throw all changes in one branch. Better do it incrementally and in separate PRs. If splitting provides better compilation times, we should merge it. Other optimizations should be evaluated separately, on their own. So I'm in favor of merging changes related to spliting. My 2 cents.

volca02 commented 6 years ago

I agree.

If this would be merged please also cherry-pick b292906d79a1bbf5ad1a7ac76a4f05f068cabc1f, 797afd14655b53c89954874f0ffe41c0bd44bd66 and 5e575d6b9c27b8bf38e37c0365e91f1f705b4923

The first one speeds up the builds of this rework significantly (it shares object files between targets).

The second one is not mandatory, but it is an example of how to further improve the build times by hiding header dependencies into cpp files.

The last one introduces an LTO switch. From my tests this builds a bit faster than master, with comparable optimization benefits.

RoadTrain commented 6 years ago

@volca02 No sure if any of these optimizations are still applicable?

zpl-zak commented 6 years ago

b292906 and 5e575d6 was integrated into the main branch. Is there anything else we could do in the current revision or should we close this issue? @volca02 @RoadTrain