ComputationalRadiationPhysics / imresh

Shrink-Wrap Phase Reconstruction Algorithm
MIT License

Enabling Optimization breaks algorithm code linking #14

Closed. Ferruck closed this issue 8 years ago

Ferruck commented 8 years ago

In the current CMakeLists.txt on the refactoring branch (currently the most up-to-date one), no compiler optimization flags are set. This defaults to no optimization (at least on my machine).

When turning on optimization with anything higher than -O0 (e.g. -O1, -O2 and so on), linking the example binary fails with:

[100%] Linking CXX executable examples
CMakeFiles/examples.dir/examples/createAtomCluster.cpp.o: In function `examples::createAtomCluster(std::vector<unsigned int, std::allocator<unsigned int> > const&)':
createAtomCluster.cpp:(.text+0xf2): undefined reference to `void imresh::libs::gaussianBlur<float>(float* const&, unsigned int const&, unsigned int const&, double const&)'
createAtomCluster.cpp:(.text+0xd4f): undefined reference to `void imresh::libs::gaussianBlur<float>(float* const&, unsigned int const&, unsigned int const&, double const&)'
libimresh.so: undefined reference to `float imresh::algorithms::vectorMax<float>(float* const&, unsigned int const&)'
libimresh.so: undefined reference to `void imresh::algorithms::cuda::cudaGaussianBlur<float>(float* const&, unsigned int const&, unsigned int const&, double const&)'
libimresh.so: undefined reference to `void imresh::algorithms::applyComplexModulus<float [2], float>(float (* const&) [2], float const (* const&) [2], float const* const&, unsigned int const&)'
libimresh.so: undefined reference to `void imresh::algorithms::complexNormElementwise<float, float [2]>(float* const&, float const (* const&) [2], unsigned int const&)'
collect2: error: ld returned 1 exit status

Right now I have no idea where to start investigating this. As this is not game-breaking at the moment and this is your code, I'd like to ask you, @mxmlnkn, to look into this when you have some spare time. But as this library is intended to be fast, compiler optimizations should be possible at least right before the final release.

Maybe it's just an error in the CMakeLists.txt that I don't see right now, so a first step could be to try your old Makefiles with optimizations enabled.

mxmlnkn commented 8 years ago

Compiler optimizations leading to undefined references? Oo I don't even know where to start with that... very weird problem. My guess is that higher optimization levels optimize my "__instantiateAllTempaltes" function away, meaning it would be necessary to write out the template instantiations explicitly instead of letting the compiler derive them implicitly. Will take a look at this soon.
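For illustration, here is a minimal sketch of what writing out the instantiations explicitly could look like. Only the signature of vectorMax is taken from the linker error above; the namespace `sketch` and the function body are assumptions made for this example and are not the actual imresh code. The idea is that an explicit instantiation definition asks the compiler to emit the symbol in this translation unit regardless of whether any call to it was inlined or discarded.

```cpp
#include <cassert>

namespace sketch {  // hypothetical namespace, not imresh's

// Template definition as it might live in a .cpp file rather than a header.
// The body is an assumption; only the signature mirrors the linker error.
template<class T>
T vectorMax(T* const& data, unsigned int const& n)
{
    assert(n > 0);  // an empty array has no maximum
    T maximum = data[0];
    for (unsigned int i = 1; i < n; ++i)
        if (data[i] > maximum)
            maximum = data[i];
    return maximum;
}

// Explicit instantiation definitions: the compiler must generate these
// two functions in this translation unit, so other object files can link
// against them without seeing the template body.
template float  vectorMax<float >(float*  const&, unsigned int const&);
template double vectorMax<double>(double* const&, unsigned int const&);

}  // namespace sketch
```

Other translation units would then only need a declaration of the template (optionally together with `extern template` declarations) and could call `sketch::vectorMax<float>(ptr, n)` as usual.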

Ferruck commented 8 years ago

My first thought was template problems, too.

mxmlnkn commented 8 years ago

Don't know how to reproduce this. When adding "-O3" to

list(APPEND CMAKE_CXX_FLAGS "-O3 -std=c++11 -fPIC ${OpenMP_CXX_FLAGS}")

it still works

Ferruck commented 8 years ago

Strange, doesn't work for me.

Ferruck commented 8 years ago

Works now with your changes from #12. Closing.

mxmlnkn commented 8 years ago

Another possible solution to this might have been adding

extern "C"
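A note on how this could look in practice: C++ does not allow a template itself to have C linkage, so one possible reading of this suggestion is a non-template wrapper declared `extern "C"` that calls the template. The sketch below assumes exactly that; the names `sketch_c` and `sketch_vectorMaxFloat` are hypothetical, and the template signature is simplified compared to the one in the linker error.

```cpp
#include <cassert>

namespace sketch_c {  // hypothetical namespace, not imresh's

// Simplified stand-in for a library template (body is an assumption).
template<class T>
T vectorMax(T const* data, unsigned int n)
{
    assert(n > 0);  // an empty array has no maximum
    T maximum = data[0];
    for (unsigned int i = 1; i < n; ++i)
        if (data[i] > maximum)
            maximum = data[i];
    return maximum;
}

}  // namespace sketch_c

// Non-template wrapper with C linkage. It is an ordinary function with
// external linkage, so it is emitted under an unmangled name, and its
// call instantiates vectorMax<float> implicitly in this translation unit.
extern "C" float sketch_vectorMaxFloat(float const* data, unsigned int n)
{
    return sketch_c::vectorMax<float>(data, n);
}
```

The trade-off is one wrapper per element type, since C linkage rules out overloading on the wrapper name.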

This issue becomes relevant again because alpaka doesn't like explicit template instantiations (see https://github.com/ComputationalRadiationPhysics/alpaka/issues/183#issuecomment-190737087). A solution to the problem in that comment is to use a dummy function with CUPLA_KERNEL calls, which implicitly instantiates only the device version of a kernel rather than both the host and device versions.