Closed yarikoptic closed 4 years ago
Ouch. This might take some finessing...
First off, I'm surprised this hadn't turned up as an issue before, that code hasn't changed in years. I would have expected the conflict with glext.h
to have shown up a lot earlier than this. Where does this file come from? Is this the version supplied by mesa
?
So there two reasons we might not have had a conflict before:
typedef
isn't normally defined in glext.h
typedef
is defined, but matches ours, so doesn't raise any issuesI'm not sure which might be the problem. I have to admit I've not compiled a 32 bit version for many years now, so there's a chance ours isn't right. It has to match the standard typedefs, which in this case means a 32 bit signed int. There's a distinct chance that the ptrdiff_t
we use might not be 32 bit (hard to see how it could store differences like (2^32-1)-0 as a signed quantity otherwise). In this case, we could modify it to be ssize_t
(but I'm not sure it's guaranteed to always be 32 bit on a 32 bit system), or in extremis int32_t
(but we'd need an #ifdef
in there as well).
Happy to investigate, but I don't have a 32 bit Debian system handy... What's the simplest way forward?
from KHR/khrplatform.h:
typedef signed long int khronos_intptr_t;
typedef signed long int khronos_ssize_t;
(it's a long long int
on Win64, for some reason). But the main thing is it looks like this is also how ptrdiff_t
is defined by the GCC headers - assuming it hasn't already been set by some other system header. So we'd need to figure out where ptrdiff_t
is actually defined, and see whether it matches (there's a chance it might be defined as a regular int
, which would cause a type mismatch even if it's actually equivalent on 32 bit systems). That, or we replace our definitions to match the standard exactly...
Happy to investigate, but I don't have a 32 bit Debian system handy... What's the simplest way forward?
I think the best is what I have recommended awhile back to vowpal wabbit is to (auto)generate Singularity images on singularity hub. You would just need to add them to this repo and enable building on singularity hub. See https://github.com/VowpalWabbit/vowpal_wabbit/tree/master/singularity for samples/starting point. Since we have mrtrix3 in debian already, just a matter of changing the line build-dep vowpal-wabbit
to build-dep mrtrix3
and may be a following up with the tools you would like to have available in addition. You can try to do yourself or let me know and I will send a PR.
Ok, I've had a brief look into this, but I'm not familiar with this, I'm going to need some hand-holding... Do I need to install singularity on my system for this to work? Unfortunately, I'm going to be snowed under with other commitments for a while, I'm not sure how much time I can dedicate to this - hence anything that can get me up and running quickly would be hugely appreciated...
Yes, install singularity locally. I will take care about PR and giving you an image to try
If on Debian based system - install/configure NeuroDebian and apt install singularity-container
@yarikoptic: Thanks for #1522 - that really helped. I've just committed changes to our GL headers that seem to avoid the issue nicely on the 32-bit build (although I still need to test them very thoroughly on all other platforms) - see the simplify_gl_core_header
branch.
However, we now get an unrelated issue at the very final link stage:
ERROR: (503/505) [LB] bin/tckgen
clang++ tmp/src/dwi/tractography/roi.o tmp/src/dwi/tractography/seeding/dynamic.o tmp/src/dwi/shells.o tmp/src/dwi/tractography/ACT/gmwmi.o tmp/src/dwi/directions/mask.o tmp/src/dwi/tractography/tracking/method.o tmp/src/dwi/tractography/mapping/mapper_plugins.o tmp/src/dwi/tractography/tracking/shared.o tmp/src/dwi/fmls.o tmp/src/dwi/tractography/SIFT/proc_mask.o tmp/src/dwi/tractography/resampling/resampling.o tmp/src/dwi/tractography/seeding/gmwmi.o tmp/src/dwi/tractography/resampling/endpoints.o tmp/src/dwi/gradient.o tmp/src/dwi/tractography/properties.o tmp/src/dwi/tractography/tracking/tractography.o tmp/src/dwi/tractography/ACT/act.o tmp/src/dwi/tractography/rng.o tmp/cmd/tckgen.o tmp/src/dwi/tractography/resampling/downsampler.o tmp/src/dwi/tractography/resampling/upsampler.o tmp/src/dwi/tractography/mapping/mapping.o tmp/src/dwi/tractography/tracking/early_exit.o tmp/src/dwi/tractography/mapping/mapper.o tmp/src/dwi/tractography/resampling/arc.o tmp/src/dwi/directions/set.o tmp/src/dwi/directions/predefined.o tmp/src/dwi/tractography/algorithms/iFOD2.o tmp/src/dwi/tractography/resampling/fixed_num_points.o tmp/src/dwi/tractography/resampling/fixed_step_size.o tmp/src/dwi/tractography/mapping/fixel_td_map.o tmp/src/dwi/tractography/seeding/seeding.o tmp/src/dwi/tractography/seeding/basic.o tmp/src/dwi/tractography/tracking/write_kernel.o tmp/src/dwi/tractography/seeding/list.o tmp/src/dwi/tractography/mapping/twi_stats.o tmp/src/dwi/tractography/file_base.o tmp/src/dwi/tractography/mapping/voxel.o tmp/src/exec_version.o -lmrtrix -Wl,--sort-common,--as-needed -pthread -lz -lfftw3 -Wl,-rpath,$ORIGIN/../lib -L./lib -o bin/tckgen
failed with output
/usr/bin/ld: tmp/src/dwi/tractography/seeding/dynamic.o: in function `std::atomic<double>::load(std::memory_order) const':
dynamic.cpp:(.text._ZNKSt6atomicIdE4loadESt12memory_order[_ZNKSt6atomicIdE4loadESt12memory_order]+0x4c): undefined reference to `__atomic_load_8'
/usr/bin/ld: tmp/src/dwi/tractography/seeding/dynamic.o: in function `std::atomic<double>::compare_exchange_weak(double&, double, std::memory_order, std::memory_order)':
dynamic.cpp:(.text._ZNSt6atomicIdE21compare_exchange_weakERddSt12memory_orderS2_[_ZNSt6atomicIdE21compare_exchange_weakERddSt12memory_orderS2_]+0x97): undefined reference to `__atomic_compare_exchange_8'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Looks like a 32-bit architecture doesn't support 64-bit atomic operations...?
Everything else links fine, this is presumably the only part of the code where we need a 64-bit atomic operation. I'd need to get @Lestropie's input as to what we can do here. The simplest is to use a float
for the 32-bit version, but this might be an unacceptable loss of precision. Otherwise, we'd need to think very hard about how we can get around this limitation, and how much effort it's really worth given that I don't think we would ever recommend running these types of applications on 32-bit system...
In the meantime, I'll run this again using g++
rather than clang++
, see if that works any better...
OK, turns out the build completes without error using g++
- must be a glitch in that version of clang
...?
For reference, command is:
CXX=g++ ./configure
./build
Everything else links fine, this is presumably the only part of the code where we need a 64-bit atomic operation. I'd need to get @Lestropie's input as to what we can do here. The simplest is to use a float for the 32-bit version, but this might be an unacceptable loss of precision.
Should be fine. I pretty much use double out of principle whenever summing a reasonable number of floating-point numbers. But as long as the streamline count in any particular fixel is << 1e6 it shouldn't really matter. Even then it'd only be throwing out dynamic seeding probabilities slightly, which are imperfect at best.
Everything else links fine, this is presumably the only part of the code where we need a 64-bit atomic operation. I'd need to get @Lestropie's input as to what we can do here. The simplest is to use a float for the 32-bit version, but this might be an unacceptable loss of precision.
Should be fine. I pretty much use double out of principle whenever summing a reasonable number of floating-point numbers. But as long as the streamline count in any particular fixel is << 1e6 it shouldn't really matter. Even then it'd only be throwing out dynamic seeding probabilities slightly, which are imperfect at best.
No worries. Thankfully, I don't think we'll need it, this builds just fine with g++
. I reckon we leave at that...
Thank you @jdtournier ! Do you think I should pick up that simplify_gl_core_header branch and package from it for a "Test build"?
Do you think I should pick up that simplify_gl_core_header branch and package from it for a "Test build"?
I'm not sure what that means exactly...? It sounds like it would be wise to do so, in case there's any further issues, but I'm not sure how much is involved in the process. Any reason you wouldn't want to try it out...?
From our point of view, I'd like to check that it doesn't screw anything up on macOS or Windows - but that would be caught by the CI testing anyway. I'm checking the Windows build now and can report back shortly. But I don't anticipate any issues, the changes I'd made are relatively minor, so shouldn't affect anything particularly. Once that's done and we're happy we can merge to master
, at which point you can check whether that builds OK. But from your point of view, I guess there's little difference between a test build with that branch or the master
branch, right? It's still not a tagged version as such...
Oops... (1c1d33047)
OK, seems to work on Windows too now.
With those two commits build now passes on i386 on recent Debian but still fails on older ones and Ubuntu:
neurodebian@smaug ..ds/mrtrix3/3.0~rc3+git135-g2b8e7d0c2-2 % cat summary.build
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2_amd64.build OK 17:57.18 real, 7455.25 user, 428.70 sys, 43765856 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd70+1_i386.build FAILED 2:36.98 real, 105.15 user, 57.56 sys, 1258752 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd70+1_amd64.build FAILED 0:00.02 real, 0.01 user, 0.00 sys, 0 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd80+1_i386.build FAILED 4:13.32 real, 402.32 user, 87.36 sys, 2685592 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd80+1_amd64.build OK 14:19.13 real, 5409.71 user, 358.05 sys, 23544368 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd90+1_i386.build FAILED 4:08.64 real, 364.21 user, 72.22 sys, 2843296 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd90+1_amd64.build OK 14:43.31 real, 6226.68 user, 367.72 sys, 29348280 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd100+1_i386.build OK 18:10.81 real, 7624.30 user, 332.99 sys, 38428048 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd100+1_amd64.build OK 17:30.84 real, 7446.94 user, 409.76 sys, 43583208 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd+1_i386.build OK 18:10.15 real, 7610.01 user, 326.89 sys, 38438288 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd+1_amd64.build OK 17:36.07 real, 7361.72 user, 418.83 sys, 43643504 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd12.04+1_i386.build FAILED 1:24.74 real, 61.94 user, 28.93 sys, 189768 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd12.04+1_amd64.build FAILED 1:26.16 real, 62.92 user, 29.58 sys, 247056 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd14.04+1_i386.build FAILED 2:10.76 real, 89.75 user, 47.40 sys, 1538928 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd14.04+1_amd64.build FAILED 2:12.49 real, 89.78 user, 49.08 sys, 1638208 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd16.04+1_i386.build FAILED 3:19.40 real, 309.62 user, 47.54 sys, 2760928 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd16.04+1_amd64.build OK 10:47.82 real, 5139.80 user, 309.29 sys, 24601208 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd17.10+1_i386.build FAILED 3:08.88 real, 331.73 user, 47.65 sys, 2927448 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd17.10+1_amd64.build OK 13:31.82 real, 6424.30 user, 310.02 sys, 29811136 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd18.04+1_i386.build FAILED 3:35.25 real, 355.89 user, 54.11 sys, 3009584 out
mrtrix3_3.0~rc3+git135-g2b8e7d0c2-2~nd18.04+1_amd64.build OK 13:50.66 real, 6471.14 user, 333.96 sys, 30037016 out
@yarikoptic: we've been discussing the packaging for the proper 3.0 release (which we're hoping will come soon...). Given that you've been putting all this effort into getting MRtrix3 to build on Debian, can we assume that the intention is to make MRtrix3 available direct from the Debian repos? If so, that would solve a major headache for us...
Yes More precisely, I will be uploading to Debian unstable, so new versions eventually become part of Debian release. And I will upload backport builds to NeuroDebian http://neuro.debian.net/pkgs/mrtrix3.html?highlight=mrtrix3
@yarikoptic: are you still maintaining the Debian / neurodebian packages? We've just released 3.0.0, it would be great for that version to be packaged and made available!
I do but not alone any more - it is under the wing of debian med so more than myself attend to it: https://metadata.ftp-master.debian.org/changelogs//main/m/mrtrix3/mrtrix3_3.0~rc3+git135-g2b8e7d0c2-5_changelog . If nobody from the team reacts/updates -- I will do eventually.
or screw it -- will look into it now, will report if anything prevents it and I postpone
btw -- what about that simplify_gl_core_header
branch mentioned above in https://github.com/MRtrix3/mrtrix3/issues/1517#issuecomment-451151076 . We had it in our packaging, it was never merged into mrtrix3 master, and no longer applies without conflicts. Is it expected to build ok on i386 without patching now?
oh, if before we packaged straight from git -- now tried from distributed sources: apparently test data is not shipped at all within the releases here on github... any plans to include it? (not sure if easily possible through some .gitattributes instructions to include submodules into releases... do not remember investigating) I had pushed my changes to packaging to https://salsa.debian.org/med-team/mrtrix3.git , and will get back to it later or someone else from the team might beat me to it -- need to switch for now.
oh, actually some testing data is respectfully large! I guess we didn't even try to swallow the scripts
one:
$> du -scm testing/scripts/data testing/binaries/data
656 testing/scripts/data
61 testing/binaries/data
for debian it wouldn't be kosher, but for neurodebian backports I might just allow fetching it from the network... I wish I came up with an efficient caching mechanism for that to not refetch across different backport builds. BTW with a size of 656MB you might be interested to start considering git-annex/datalad to manage it ;)
OK, I'll look into that simplify_gl_core_header
branch now - though personally I don't think it's a good idea to ship 32-bit versions of MRtrix, we'd never really recommend it be used for anything serious. But if it's a simple fix, we can merge it. You might need to apply the patch yourself though, it's too late for inclusion in 3.0.0 unfortunately...
And the testing data is indeed a fair chunk, which is why we keep it separate. The scripts testing data is much bigger, and on top of that requires that 3rd party packages be installed (FSL, ANTs, potentially others). It also takes a much longer time to complete, and we don't use it in CI ourselves - far too onerous. It only gets run manually every so often.
I'm not sure you need to run the testing of the binaries either, given that version of the code will have been tested already on our side. But it depends what the policies are for Debian...
OK, I've cherry-picked the two commits from the simplify_gl_core_header
branch and pushed on top of 3.0.0
, and force-pushed that back to simplify_gl_core_header
. It seems to work fine on my system, but that's 64 bit Arch. I've create pull request #2060 for it to be included in master
so that it makes it to the next bug-fix release, and to allow the CI to at least try to compile it on other platforms. But we'll need to merge master
back into it before merging since there's been a couple of commits made to master
already, so it's moved a bit beyond 3.0.0
already.
For now, I reckon your best bet is to checkout 41e0146, which is the current tip of simplify_gl_core_header
. That one at least won't be affected if we update simplify_gl_core_header
with the latest changes on master
. Unless you actually want those changes, of course - but then the versions won't match exactly what we've released ourselves... Up to you.
Regarding testing, I'm not sure what the policy is, but if you really need to run at least the binaries tests, you could grab the test data by direct URL, but you'll still need to modify the run_tests scripts to avoid failing at the git submodule stage... Other option I suppose is to modify that bit of the script to fetch the relevant test data itself, but the problem then is that the commit hash of the submodule is not recorded anywhere in the direct download, so there's no easy way to figure out the correct package to download.
A much easier option is to stick to just the unit tests, which don't require any additional test data. Problem there is that they don't exercise anywhere near as much of the code as the binaries tests do...
Thank you! Re tests - Debian doesn't mandate testing. But I consider it crucial, in particular for scientific software. Building across wide range of Debian and Ubuntu releases using those system libraries and tools often brings unexpected fruits complimentary to what CI setups unravel. I will take care about including the smaller data for binaries, as we did before. We already had a patch to avoid git submodule call.
No problem, good to hear you already know how to handle this.
Just in case it helps though: I've always taken great care to keep the number of dependencies in MRtrix3 to an absolute minimum. The only core dependencies are a compliant C++11 compiler (which is why builds fail for anything older than 16.04), and Eigen (a header-only library). Also important: zlib (really should be pretty reliable across systems), and Qt (but it's only required for display, not for any actual processing). Optional libraries include TIFF, PNG (both very rarely used - only to import/export MR data to image formats) , and FFTW (also rarely used, mostly for mrdegibbs
). All this to say that I don't expect the binaries tests to perform differently across system versions than they did on our CI, which runs under ubuntu-latest
(18.04 currently) - though there may be differences between 32 & 64 bit versions.
However, this is less likely to be the case for the scripts. They'll need to run under different versions of the interpreter (e.g. python 2.7 or 3+), and importantly with different module versions. Also, I think the most likely points of failure are going to be in the interactions with other neuroimaging packages (FSL, ANTs, etc.). So I would expect if you really want to be sure that the packages all work together properly, you probably should run the scripts tests... But they are very heavy, both in terms of data and computation. @Lestropie may have some insights to share on that front.
For the scripts I have tried to restrict the precision of the tests to account for acceptable variability. E.g. In one script if the internally executed commands come from FSL 5 vs. 6 the results vary, so I include test data for both; for others there is stochastic behaviour within the external command, and so I don't test the output image intensities at all, the test just makes sure that the script completes successfully and the basic image header properties match those expected. There are some outstanding script tests that won't pass on master
/ 3.0.0
, but these are known (#2043). Success with both Python 2 & 3 should be guaranteed.
We already had a patch to avoid
git submodule
call.
It would be preferable if, instead of necessitating a dedicated patch, the run_tests
script were updated within the MRtrix3 code such that, if executed outside of a git
repository but valid test data are nevertheless present, the script will continue. Somewhat related to #1897.
It would be preferable if, instead of necessitating a dedicated patch, the run_tests script were updated within the MRtrix3 code such that, if executed outside of a git repository but valid test data are nevertheless present, the script will continue.
This would be good, but the issue here is how to verify that the data are valid - which presumably means they are the exact version recorded in the submodule. There is no way to validate the data from the downloaded ZIP file. I guess we could simply run git submodule
, and in the event of failure, issue a warning, but carry on testing regardless...?
the issue here is how to verify that the data are valid
I wasn't even thinking of going that far: just proceed if not within a git
module but the contents of testing/binaries/data/
or testing/scripts/data/
are non-zero. Version matching between pre-compiled code and whoever populated the data directory is the responsibility of whoever put the data there. Include a warning on stdout
to say that no version match is guaranteed.
content will match desired submodule state for the release, no worries.
Cool, happy enough with that. :+1:
The issue with OpenGL compilation on 32-bit should be addressed in #2060, and the utilisation of the run_tests
script when MRtrix3 is not installed via git
(i.e. by manually downloading the requisite test data) should be addressed in #2100. If these changes do not resolve all issues at your end following the release of version 3.0.1
, please let us know by posting a new issue (you can provide a link to this existing issue if relevant). :+1:
any immediate clues?