Open bgoodri opened 8 years ago
@bob-carpenter Do you have any idea about this? The link above reports a "bus error" on 32-bit SPARC, which is described here: http://stackoverflow.com/questions/1892566/c-bus-error-in-sparc-arcitecture It doesn't seem to happen with sampling, only with optimizing and ADVI, so it is likely not anything in Eigen or Stan Math. I haven't been able to get any sort of test failure by adding -fno-strict-aliasing to the compiler flags.
I'm afraid not.
That stackoverflow comment points to alignment, and we're allocating memory and aligning it ourselves for all the gradient calcs, but that's going to be the same operations in all of those systems.
Is there any way to run tests on this without it being on the critical path for a CRAN submission? The alignment code is all in stan/math/memory
In particular, here:
// FIXME: enforce alignment
// big fun to inline, but only called twice
inline char* eight_byte_aligned_malloc(size_t size) {
  char* ptr = static_cast<char*>(malloc(size));
  if (!ptr) return ptr;  // malloc failed to alloc
  if (!is_aligned(ptr, 8U)) {
    std::stringstream s;
    s << "invalid alignment to 8 bytes, ptr="
      << reinterpret_cast<uintptr_t>(ptr)
      << std::endl;
    throw std::runtime_error(s.str());
  }
  return ptr;
}
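For what the FIXME asks, here is a minimal sketch of enforcing alignment rather than only checking it. It is not Stan Math code: `aligned_malloc_sketch` and `aligned_free_sketch` are hypothetical names, and the approach (over-allocate, round the address up, stash the original pointer just before the aligned region) is one common technique, assumed here only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper, not part of Stan Math. Returns memory aligned to
// `alignment` bytes (must be a power of 2), or a null pointer if malloc fails.
inline char* aligned_malloc_sketch(std::size_t size, std::size_t alignment = 8U) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  // Reserve room for the alignment slack plus one stored pointer.
  std::size_t total = size + alignment - 1 + sizeof(void*);
  char* raw = static_cast<char*>(std::malloc(total));
  if (!raw) return 0;
  std::uintptr_t base = reinterpret_cast<std::uintptr_t>(raw);
  std::uintptr_t start = base + sizeof(void*);
  // Round up to the next multiple of `alignment`.
  std::uintptr_t aligned
      = (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
  char* ptr = raw + (aligned - base);
  // Remember the original block; memcpy avoids any alignment assumption.
  std::memcpy(ptr - sizeof(void*), &raw, sizeof(void*));
  return ptr;
}

// Must be used instead of free() for pointers from aligned_malloc_sketch.
inline void aligned_free_sketch(char* ptr) {
  if (!ptr) return;
  char* raw;
  std::memcpy(&raw, ptr - sizeof(void*), sizeof(void*));
  std::free(raw);
}
```

The cost is a few extra bytes per allocation and a matching free function, which is why a plain check may still be preferable if malloc already returns 8-byte-aligned memory on every supported platform.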
And here's the memory alignment test:
/**
 * Return <code>true</code> if the specified pointer is aligned
 * on the number of bytes.
 *
 * This doesn't really make sense other than for powers of 2.
 *
 * @param ptr Pointer to test.
 * @param bytes_aligned Number of bytes of alignment required.
 * @return <code>true</code> if pointer is aligned.
 * @tparam Type of object to which pointer points.
 */
template <typename T>
bool is_aligned(T* ptr, unsigned int bytes_aligned) {
  return (reinterpret_cast<uintptr_t>(ptr) % bytes_aligned) == 0U;
}
And for that, see:
So I'm not even sure I did that right, though, because I'm passing in char* rather than void*. So maybe remove the template param and replace it with void*. I also have no idea what "restrict" does in the upvoted answer to that stackoverflow.
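The non-template variant floated above could look roughly like the following. This is a sketch under assumptions, not the Stan Math implementation: `is_aligned_void` is a hypothetical name, and the zero check is an addition that the original function does not have.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: non-template variant of is_aligned. Any object pointer
// converts implicitly to const void*, so callers passing char* keep working.
inline bool is_aligned_void(const void* ptr, unsigned int bytes_aligned) {
  assert(bytes_aligned != 0U);  // added guard: avoid modulo by zero
  return (reinterpret_cast<std::uintptr_t>(ptr) % bytes_aligned) == 0U;
}
```

The result depends only on the address, so the template parameter in the original should not change the answer for char*. As for `restrict`: it is a C99 qualifier that promises the compiler that the object is accessed only through that pointer for the pointer's lifetime, which enables optimizations. It is not part of standard C++ (some compilers offer `__restrict__` as an extension) and it does not affect alignment.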
I think we would need to emulate a SPARC environment with QEMU or something. But is there any conceivable way that the memory alignment thing for autodiff could cause a bus error with LBFGS and ADVI but not MCMC? The fact that MCMC works on SPARC makes me think it is in the writer or something else that behaves differently depending on the algorithm.
In which case it'd probably be a library issue. These pointers are the only place we directly mess with memory ourselves.
Of course, Eigen and the standard template library have to, as well.
Could it have to do with an empty stream for printing? I know that's given us problems in the past.
I asked Daniel and he clarified he meant a null (0) pointer by "empty stream".
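To illustrate the kind of guard a null stream pointer calls for, here is a minimal sketch. `write_if_present` is a hypothetical helper, not an rstan or Stan API, and nothing here shows that the optimizing or ADVI code paths actually lack such a check.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes msg only when the caller supplied a stream.
// Returns false instead of dereferencing a null pointer.
inline bool write_if_present(std::ostream* out, const std::string& msg) {
  if (!out) return false;  // null (0) pointer: nothing to write to
  *out << msg << '\n';
  return true;
}

// Small usage example covering both the normal and the null case.
inline void demo_null_stream_guard() {
  std::stringstream buffer;
  bool wrote = write_if_present(&buffer, "iteration 1");
  assert(wrote);
  assert(buffer.str() == "iteration 1\n");

  std::ostream* missing = 0;
  bool wrote_null = write_if_present(missing, "iteration 2");
  assert(!wrote_null);
}
```

If an algorithm-specific code path skipped a check like this, it could fail only for those algorithms, which would fit the observation that sampling works while optimizing and ADVI do not. That remains a hypothesis.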
That is a possibility. Not that I can rule out anything as a possibility. Hopefully CRAN will allow us to link to the SegFault library on SPARC so that more information is provided in the backtrace.
https://www.r-project.org/nosvn/R.check/r-patched-solaris-sparc/rstanarm-00check.html