stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
2.56k stars 365 forks source link

Header functions not defined inline | multiple definition linker errors #1149

Closed mcobzarenco closed 8 years ago

mcobzarenco commented 9 years ago

I am working on a library which uses the auto differentiation component of stan. In src/stan/agrad all functions seem to be defined in header files, but are not marked inline. This means they can only have 1 definition across all translation units. I.e. one can only include agrad/autodiff.hpp in only one cpp file or otherwise get a lot of multiple definiton errors. E.g.

multiple definition of stan::math::gradRegIncBeta(double&, double&, double, double, double, double, double, double, double)' etc. etc. etc.

It may be that I don't understand how stan is supposed to be built, but function definitions in header files not marked inline or in an anonymous namespace are a big NO-NO. Does someone know why they are not marked inline or are defined in their own cpp files? How I can use agrad as a separate lib if I can only include agrad/autodiff in one .cpp file?

Marius

syclik commented 9 years ago

Marius, our apologies. We still haven't really cleaned up the automatic differentiation library to be completely standalone. Good news, though... it's easy enough to call.

Instead of including whatever header files you're including, include this one: src/stan/model/model_header.hpp

So, just this line: #include <stan/model/model_header.hpp>

It brings in a little too much at the moment, but that's been stable.

Regarding function definitions not marked inline, do you have a reference that explains what you wrote? Unfortunately, you're walking into heavily templated C++, so some of the traditional ways of compiling translation units doesn't really hold. If there are better/correct ways to do things, we're happy to do it if we can get a decent explanation.

mbrubake commented 9 years ago

Based on the function mentioned I suspect the issue is not with templating, but rather with some of the non-templated functions used as helpers. For instance, see stan/prob/internal_math.hpp which defines a number of non-templated functions which really should be marked as inline or compiled into an object file.

Marius, the issue is that Stan only ever has a single translation unit (a single model) so these issues have never really come up for us. Fixing it shouldn't be too hard and a pull request which simply added inline to a bunch of functions would likely be quickly reviewed.

On Mon, Nov 24, 2014 at 1:57 PM, Daniel Lee notifications@github.com wrote:

Marius, our apologies. We still haven't really cleaned up the automatic differentiation library to be completely standalone. Good news, though... it's easy enough to call.

Instead of including whatever header files you're including, include this one: src/stan/model/model_header.hpp

So, just this line:

include <stan/model/model_header.hpp>

It brings in a little too much at the moment, but that's been stable.

Regarding function definitions not marked inline, do you have a reference that explains what you wrote? Unfortunately, you're walking into heavily templated C++, so some of the traditional ways of compiling translation units doesn't really hold. If there are better/correct ways to do things, we're happy to do it if we can get a decent explanation.

— Reply to this email directly or view it on GitHub https://github.com/stan-dev/stan/issues/1149#issuecomment-64244144.

mcobzarenco commented 9 years ago

@syclik Thanks for the pointers! I managed to use agard sucessfully, my only problem was when linking multiple object files together. I get a lot of "multiple definition" errors, one for every function I use. stan::math::gradRegIncBeta was just an example. The non-templated functions are the problem as @mbrubake correctly pointed out. It is essentially a consequence of the one definition rule of C++:

https://en.wikipedia.org/wiki/One_Definition_Rule https://stackoverflow.com/questions/453372/writing-function-definition-in-header-files-in-c https://stackoverflow.com/questions/5057021/why-are-c-inline-functions-in-the-header

It is easy to reproduce the problem in general with 4 files. f.hpp defines a function f (this would be the stan header):

/* f.hpp */
#pragma once

int f(int x) {
    return x + 1;
}

g.hpp and g.cpp use f.hpp to define a function g:

/* g.hpp */
#pragma once

int g(int x);
/* g.cpp */
#include "f.hpp"
#include "g.hpp"

int g(int x) {
  return f(x) * f(x);
}

The main file:

#include "f.hpp"
#include "g.hpp"

int main() {
  return f(1) + g(1);
}

g.cpp and main.cpp will be compiled into 2 objects linked together to create the executable:

➜  g++ -o g.o -c g.cpp
➜  g++ -o main.o -c main.cpp
➜  g++ -o main main.o g.o
g.o: In function `f(int)':
g.cpp:(.text+0x0): multiple definition of `f(int)'
main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status

If f is defined inline in f.hpp then the problem goes away. inline tells the compiler to allow for a symbol to have multiple definitions in different translation units as long as the definitions are identical.

@mbrubake I will submit a PR to fix it. There is also whitespace at the end of the lines in some of the files, I hope no one minds if I trim it whilst at it?

bob-carpenter commented 9 years ago

On Nov 24, 2014, at 5:30 PM, Marius Cobzarenco notifications@github.com wrote:

@syclik Thanks for the pointers! I managed to use agard sucessfully, my only problem was when linking multiple object files together. I get a lot of "multiple definition" errors, one for every function I use. stan::math::gradRegIncBeta was just an example. The non-templated functions are the problem as @mbrubake correctly pointed out. It is essentially a consequence of the one definition rule of C++:

https://en.wikipedia.org/wiki/One_Definition_Rule https://stackoverflow.com/questions/453372/writing-function-definition-in-header-files-in-c https://stackoverflow.com/questions/5057021/why-are-c-inline-functions-in-the-header

It is easy to reproduce the problem in general with 4 files. f.hpp defines a function f (this would be the stan header):

/* f.hpp */

pragma once

int f(int x) { return x + 1; }

g.hpp and g.cpp use f.hpp to define a function g:

/* g.hpp */

pragma once

int g(int x);

/* g.cpp */

include "f.hpp"

include "g.hpp"

int g(int x) { return f(x) * f(x); }

The main file:

include "f.hpp"

include "g.hpp"

int main() { return f(1) + g(1); }

g.cpp and main.cpp will be compiled into 2 objects linked together to create the executable:

➜ g++ -o g.o -c g.cpp ➜ g++ -o main.o -c main.cpp ➜ g++ -o main main.o g.o g.o: In function f(int)': g.cpp:(.text+0x0): multiple definition off(int)' main.o:main.cpp:(.text+0x0): first defined here collect2: error: ld returned 1 exit status

If f is defined inline in f.hpp then the problem goes away. inline tells the compiler to allow for a symbol to have multiple definitions in different translation units as long as the definitions are identical.

Exactly.

The alternative is to just go header-only. Then there's only one translation unit. It seems to be the favored approach of both the Boost and Eigen libs.

@mbrubake I will submit PR to fix it. There is also whitespace at the end of the lines in some of the files, I hope no one minds if I trim it whilst at it?

Please do. We write cleanup scripts from time to time to get rid of formatting issues programatically (usually new programmers introducing tabs!).

mcobzarenco commented 9 years ago

The alternative is to just go header-only. Then there's only one translation unit. It seems to be the favored approach of both the Boost and Eigen libs.

I am not debating that, you should be header only, but Boost and Eigen define all functions inline, so users of the library are not forced to have only 1 translation unit. There's no cost either, as the compiler decides by its own when to inline functions. It's really just an unfortunate quirk of how C++ builds

Marius

syclik commented 9 years ago

@mcobzarenco, thanks! That was definitely helpful. (it wasn't defined until the 2003 spec?)

So... were you able to get the autodiff working?

mcobzarenco commented 9 years ago

@mcobzarenco, thanks! That was definitely helpful. (it wasn't defined until the 2003 spec?)

That's when they put it in the standard.

So... were you able to get the autodiff working?

I did get autodiff working, it's really cool. I am using to write a library for deep neural nets which should support training with SGD and Hessian-free optimisation. It's just few days old.

Marius

bob-carpenter commented 9 years ago

Great. Let us know if you have something we can share.

If your deep neural nets have repeated subgraphs in the graphical model sense (or repeated computations in the imperative probabilistic programming sense), it can be a huge speed boost to implement analytic gradients for the subgraph as a function.

We're going to be doing some generalized linear model ones soon, including logistic regression, which might be helpful to speed up neural net implementations in Stan.

On Nov 24, 2014, at 6:16 PM, Marius Cobzarenco notifications@github.com wrote:

@mcobzarenco, thanks! That was definitely helpful. (it wasn't defined until the 2003 spec?)

That's when they put it in the standard.

So... were you able to get the autodiff working?

I did get autodiff working, it's really cool. I am using to write a library for deep neural nets which should support training with SGD and Hessian-free optimisation. It's a few days old, I am writing it for a research project.

Marius

— Reply to this email directly or view it on GitHub.

bgoodri commented 8 years ago

I think I ran into this problem compiling rstanarm. If I made each .stan program into a separate .o file and then combined them together into a shared object, there were linker errors involving log_sum_exp IIRC. So, I just glued all the .cpp files into one .cpp file and the linker errors went away. In other words, it doesn't affect me really, but the bug is probably still there.

syclik commented 8 years ago

Please let me know if you see that again. Or a way to test it.

On Mar 4, 2016, at 6:25 PM, bgoodri notifications@github.com wrote:

I think I ran into this problem compiling rstanarm. If I made each .stan program into a separate .o file and then combined them together into a shared object, there were linker errors involving log_sum_exp IIRC. So, I just glued all the .cpp files into one .cpp file and the linker errors went away. In other words, it doesn't affect me really, but the bug is probably still there.

— Reply to this email directly or view it on GitHub.

bgoodri commented 8 years ago

Just compile any two Stan programs.

library(rstan) funnel <- stan_demo("funnel", chains = 0) funnel_reparam <- stan_demo("funnel_reparam", chains = 0)

Then, in a shell, change directories to whatever

tempdir()

evaluates to and try

R CMD SHLIB --output=test *.o