MRtrix3 / mrtrix3

MRtrix3 provides a set of tools to perform various advanced diffusion MRI analyses, including constrained spherical deconvolution (CSD), probabilistic tractography, track-density imaging, and apparent fibre density
http://www.mrtrix.org
Mozilla Public License 2.0
294 stars 181 forks source link

New folder structure for C++ code #3012

Closed daljit46 closed 1 week ago

daljit46 commented 1 month ago

This PR aims to partially address #2824 and provide a new folder structure for building C++ code. In particular:

This is still WIP so many things don't work yet, but I'm just posting the changes now to get early feedback on CI.

daljit46 commented 1 month ago

@Lestropie tests for dirrotate still fail at times https://github.com/MRtrix3/mrtrix3/actions/runs/11053081048/job/30706613851.

daljit46 commented 1 month ago

@MRtrix3/mrtrix3-devs This is now ready for feedback (I'd especially appreciate any input on the new version matching logic). I wouldn't be against merging this soon even if it will most likely break the status of other PRs. Other than the relocation of C++ files, the changes are fairly minimal so it shouldn't be difficult to merge them into existing PRs.

Lestropie commented 1 month ago

So it looks like as far as filesystem structure is concerned, you've essentially done the following moves:

cmd/  -> cpp/cmd/
core/ -> cpp/lib/core/
src/  -> cpp/lib/

?

Whereas my thinking had been that it would be:

cmd/  -> cpp/cmd/
core/ -> cpp/
src/  -> cpp/

The disadvantage here is that it would no longer be trivial to take a single directory and say "compile everything in this directory into a single shared object". I don't recall the outcome of the most recent discussion here, whether we wanted to remove such a large shared object in favour of many smaller ones or whether to store within a CMakeLists.txt an explicit list of files to be included in such an object.

I'd especially appreciate any input on the new version matching logic

Are you able to give a concise description of how it differs? I can see some renaming and some moving of code, but change to the logic isn't jumping out at me.

daljit46 commented 1 month ago

So the idea of using cpp/lib and cpp/cmd was to separate executable code from library code. The distinction between core and src was only useful to the extent that if you modified a core header/source file, the commands would only need relinking against mrtrix::core. I think one good test to assess whether the new structure is significantly unfavourable in this regard is to test whether relinking of the mrtrix::headless library takes a long enough time to be a hindrance (e.g. by making a small change in cpp/lib). I guess that other than on MSYS2, it shouldn't be.

The disadvantage here is that it would no longer be trivial to take a single directory and say "compile everything in this directory into a single shared object".

I'm not sure if I understand what you mean here. Isn't that what is being done with this proposal? All code in cpp/lib is built as a single shared object. Are you perhaps referring to the ability to create a folder inside cpp and have it automatically built as a separate shared library? If yes, I believe that should also be possible with the current proposal.

I don't recall the outcome of the most recent discussion here, whether we wanted to remove such a large shared object in favour of many smaller ones or whether to store within a CMakeLists.txt an explicit list of files to be included in such an object.

Yes, I think we discussed this a couple of weeks ago with @jdtournier. If I recall correctly, our conclusion was unless relinking the commands was long this should ok. However, I do agree that to get the fastest possible build workflow, splitting the code in `cpp/lib ' into separate libraries would be ideal. To do that I think the best thing to do would be to manually specify the libraries needed by individual commands. If needed, we could provide that ability in a future PR?

Are you able to give a concise description of how it differs? I can see some renaming and some moving of code, but change to the logic isn't jumping out at me.

The logic is roughly the same, but I just wanted a sanity check to be sure that things work as expected. The main change I've done is that now the version mismatch checking is done in command.h (instead of app.cpp) and that mrtrix::executable-version is only linked against executables (previously it was linked against mrtrix-headless as it was distinct from mrtrix::core). One thing I was doubtful about was that previously we had a function set_executable_uses_mrtrix_version that was called at run time inside command.h. On the other hand, in core/version.h we were using extern variable without any functions. I wasn't too sure why things were done that way, but I switched to using extern variable for both version header files.

Lestropie commented 1 month ago

So the idea of using cpp/lib and cpp/cmd was to separate executable code from library code.

I had expected that this would just be done based on cpp/cmd/ vs. cpp/(?!cmd/).

The disadvantage here is that it would no longer be trivial to take a single directory and say "compile everything in this directory into a single shared object".

I'm not sure if I understand what you mean here.

I think maybe you're expecting a greater degree of understanding of the proposal on my part than what I currently have, and maybe my comment wasn't precise enough.

Previously, we had cmd/, core/ and src/. Anything that was in core/ would contribute to the shared object file, whereas anything in src/ went into its own object. If, however, as per my proposal / expectation above, the contents of core/ and src/ were to be merged at the root level of cpp/, yet one also wanted for a file that used to be in core/ to go into a shared object but a file that used to be in src/ to not go into the shared object, that would no longer be possible based on filesystem location alone, you'd need an explicit list of which files go into the shared object and which don't. But that's predicated on persisting with the desire to compile only a subset of non-command-specific source files into a shared object.

If instead it is the case that all non-command-specific code is to go into a shared library file, as indicated by:

All code in cpp/lib is built as a single shared object.

, then I see no purpose in persisting with a cpp/lib/core/ sub-directory that separates itself from the rest of cpp/lib/. Historically the defining attribute of the core/ vs src/ filesystem sub-division was inclusion vs. exclusion from the shared library, which is precisely what's being deprecated.

Personally, I would over and above that remove the lib/ sub-directory. The headless library would just encapsulate everything in cpp/ not in cpp/cmd/ or cpp/gui/. I find having the root of cpp/ containing just two sub-directories a bit clunky. But it's not up to me exclusively.

daljit46 commented 1 month ago

In the meeting this morning with @jdtournier and @bjeurissen, it was discussed that since the separation between core and src is no longer a benefit, all files inside core should be moved to cpp/lib. However, as suggested by @jdtournier, I do believe that it's a good idea to keep a separation between library code and executables and that .cpp command files should not reside in the same directory as library files. Additionally, we discussed the idea of moving all GUI code up one level so that there would be three folders inside cpp: cmd, lib and gui (or libgui). @Lestropie any thoughts are welcome.

jdtournier commented 1 month ago

Thanks @daljit46 - just one minor correction: this is the structure that I think is most logical (though not necessarily the most convenient):

└── cpp/
    ├── cmd/
    └── lib/
        ├── core/
        └── gui/

Both folders in cpp/lib/ would produce libraries (whether dynamic or static, that's another debate), while everything in cpp/cmd/ produces executables. I think that filesystem layout most closely matches how I would envisage the code hanging together.

We also had a debate about what to name the folder for the main library, as there was a suggestion to name it headless/ or cli/, but @bjeurissen made the point that this might actually give the impression that this contains code that is exclusively for terminal-based apps, which clearly isn't the case: the functionality in the main library will clearly be part of any GUI apps as well. So we thought sticking with core/ was a closer reflection of its role than these other alternatives.

daljit46 commented 1 month ago

Thanks @daljit46 - just one minor correction: this is the structure that I think is most logical (though not necessarily the most convenient):

└── cpp/
    ├── cmd/
    └── lib/
        ├── core/
        └── gui/

Just rebased to implement this structure.

Lestropie commented 1 month ago

I find that clunky, having two sequential filesystem hierarchy levels that only discriminate between two sub-directories. I personally wouldn't choose that over the intermediate:

└── cpp/
    ├── cmd/
    ├── core/
    └── gui/

This requirement:

Both folders in cpp/lib/ would produce libraries (whether dynamic or static, that's another debate), while everything in cpp/cmd/ produces executables

takes very minimal CMake code; I don't see why that distinction has to be made at the filesystem level.

jdtournier commented 1 month ago

Like I said, most logical, though not necessarily most convenient...

The structure you're suggesting is actually what I had in mind initially, and what I'd prefer in practical terms. The lib/ folder is only there to explicitly signal the intention, but I agree it's otherwise entirely redundant. I'm quite happy to ditch the lib/ bit, personally.

daljit46 commented 1 month ago

I think the distinction between cmd and lib is mostly on philosophical grounds. The idea is to mark a clear delineation between library code and executables. Ideally, our CMake build files would be "in harmony" with how the files are laid out on disk. Having that said, I'm also not opposed to giving this up. Perhaps, an alternative naming convention could be:

└── cpp/
    ├── cmd/
    ├── libcore/
    └── libgui/
daljit46 commented 3 weeks ago

As discussed in the meeting earlier, the structure is now:

└── cpp/
    ├── cmd/
    ├── core/
    └──gui/

Additionally, gui/opengl/gl.h has been renamed to gui/opengl/glutils.h to avoid conflicts with system headers.

daljit46 commented 3 weeks ago

@Lestropie @jdtournier ready for review and merge.