FALCONN-LIB / FALCONN

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)
http://falconn-lib.org/
MIT License
1.14k stars 194 forks source link

Segmentation fault inside Python wrapper under cygwin #30

Closed ilyaraz closed 8 years ago

ilyaraz commented 8 years ago

http://pastebin.com/pMLaBwsg

At the same time, random_benchmark.py works well.

ludwigschmidt commented 8 years ago

Is the dtype incorrect (see the error message before the segfault)? I guess np.random.randn returns a double-precision array by default.

Edit: I should have read the pastebin more carefully :-) What happens when you store the query into a variable and then use it as a parameter?

ilyaraz commented 8 years ago

It's more interesting than that. It seems to fail in REPL and within py.test but seems to work when run as a simple script.

For instance: http://pastebin.com/PVmsLrhz fails when run via py.test and works when run via python.

ludwigschmidt commented 8 years ago

I guess in some cases the temporary float32 array gets garbage-collected?

ilyaraz commented 8 years ago

Does not seem to be the case. But segfault disappears if I try to print anything withing the C++ code...

ludwigschmidt commented 8 years ago

That sounds like there could be some memory corruption going on. Does valgrind work under cygwin?

ilyaraz commented 8 years ago

No, but I'll run it under Linux tomorrow.

ilyaraz commented 8 years ago

The bug is now somewhat better localized:

ilyaraz commented 8 years ago

The bug disappears if one removes -march=native!

ilyaraz commented 8 years ago

More specifically, -mavx seems to trigger it. Depending on whether we have -O0 or not, it either fails in the preprocessing, or in the query.

ilyaraz commented 8 years ago

FFHT does not seem to responsible, even if we run a version without AVX, the code still fails.

ilyaraz commented 8 years ago

Ah, with -O0 it fails in gdb! Victory!!!!

Error: http://pastebin.com/AbeNXeT0 Backtrace: http://pastebin.com/PbC8q0R0

ilyaraz commented 8 years ago

This is Eigen vs. Cygwin, in the end. The following code works without -mavx and fails with -mavx (in both cases compiled with -O0):

#include <Eigen/Dense>
#include <iostream>

typedef Eigen::VectorXf vec;

int main() {
  vec a(8);
  vec b(8);
  a = a.cwiseProduct(b);
  std::cout << a[0] << std::endl;
  return 0;
}
ilyaraz commented 8 years ago

Filed a bug report: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1146

ilyaraz commented 8 years ago

Looks like it's a Cygwin's bug. The following code fails:

#include "immintrin.h"

__m256 ff(float *buf) {
  return _mm256_load_ps(buf);
}

int main() {
  float *buf = new float[8];
  __m256 tt = ff(buf);
  return 0;
}
ludwigschmidt commented 8 years ago

Is the code supposed to work? The array might not be 32-byte aligned.

Also, does the issue go away with the stable Eigen 3.2?

Thanks for looking into this!

On Thursday, January 14, 2016, Ilya Razenshteyn notifications@github.com wrote:

Looks like it's a Cygwin's bug. The following code fails: '''c++

include

include "immintrin.h"

using namespace std;

__m256 ff(const float *buf) { return _mm256_load_ps(buf); }

int main() { float *buf = new float[8]; cout << "before" << endl; __m256 tt = ff(buf); cout << "after" << endl; delete[] buf; return 0; } '''

— Reply to this email directly or view it on GitHub https://github.com/FALCONN-LIB/FALCONN/issues/30#issuecomment-171654394.

ilyaraz commented 8 years ago

What will you say about this then?

#include "immintrin.h"

__m256 routine(void) {
  __m256 aux;
  return aux;
}

int main(void) {
  void *buf = malloc(1);
  __m256 res = routine();
  return 0;
}
ludwigschmidt commented 8 years ago

I don't know about the semantics of returning intrinsic locations. I guess the 1-byte *buf is crucial to make things fail?

ilyaraz commented 8 years ago

Yes it is. Moreover, it's crucial that the pointer buf is allocated on the stack (rather than as a global variable).

Not sure what's happening there.

ludwigschmidt commented 8 years ago

I guess buf somehow messes up the alignment of res. Maybe the stack frame is 32-byte aligned but buf is put there before res, which then leads to the alignment error.

Just to avoid confusion: this works on Linux, correct? I just tested it on my Mac and it doesn't seem to give segmentation faults.

ilyaraz commented 8 years ago

It works on Linux (on towhee, at least). It could have to do with https://cygwin.com/ml/cygwin-apps/2015-11/msg00073.html .

ludwigschmidt commented 8 years ago

This sounds like AVX is partially broken on Cygwin?

ilyaraz commented 8 years ago

Could be. Let's confirm this, and, if this is the case, declare that we don't support FALCONN for Cygwin. We should also give MinGW (http://www.mingw.org/) a try.

ludwigschmidt commented 8 years ago

If AVX on Cygwin is indeed not working, I'm also OK with dropping support for Cygwin.

For MinGW, the 64-bit fork might also be worth a look: http://mingw-w64.org/doku.php

I guess the most native compiler for Windows is the Visual Studio compiler. But I don't know how easy it would be for us to support it.

ludwigschmidt commented 8 years ago

Is MinGW (not the 64-bit fork) actively being developed? http://www.mingw.org/wiki/GCCStatus says "Latest official release of GCC from the MinGW project is GCC 3.4.5."

ilyaraz commented 8 years ago

Looks like this is what happens: https://stackoverflow.com/questions/5983389/how-to-align-stack-at-32-byte-boundary-in-gcc

I guess we have no choice other than to drop the Windows support...

ilyaraz commented 8 years ago

Though, I guess, one can disable AVX within Eigen if one detects Windows (we don't want to disable AVX overall, since FFHT should still work, and using AVX there is crucial). This can probably be done by defining a couple of macroses within falconn_global.h or something like this.

ludwigschmidt commented 8 years ago

Yes, we could give that a try. Alternatively, we could try supporting VC++ to get AVX support. I'm not sure what people mainly use for C++ development on Windows these days.

ilyaraz commented 8 years ago

Resolved in 6283f9cbb014555a2eef910f8aa91a2f08e3ff4c