Open giovanni-rosotti opened 7 years ago
I also had that problem recently (in the last week or so), but it happened while I was debugging the meshless, so I assumed the problem was on that end. If it also happens with gradh or gravity-only, then it's an issue. It's not a critical assertion, in the sense that particles that fail it are not necessarily wrong. It's just that the tree-walk should calculate smoothed neighbours and direct-sum gravity neighbours differently, so a failure means there is perhaps something wrong there. However, the fact that there are differences between serial and parallel is definitely a worry (again). We'll have to investigate to see if the assertion is only just failing (due to floating-point round-off) or properly failing due to wrong sorting or even NaNs.
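For the round-off vs. NaN vs. genuine failure question, something like the sketch below could be dropped in next to the assertion. It assumes the assertion compares two floating-point quantities in the form `lhs >= rhs`, which may not match the actual check. `CheckResult`, `classifyCheck` and the `relTol` threshold are hypothetical names and values made up for illustration, not anything from our code.

```cpp
#include <cmath>
#include <limits>

// Hypothetical outcome categories for a check of the form lhs >= rhs.
enum class CheckResult { Pass, NaN, RoundOff, GenuineFailure };

// Classify the check `lhs >= rhs`.
// relTol is an assumed tolerance, in multiples of machine epsilon, below
// which a violation is treated as floating-point round-off.
inline CheckResult classifyCheck(double lhs, double rhs, double relTol = 16.0) {
  // Any NaN makes every comparison false, so test for it first.
  if (std::isnan(lhs) || std::isnan(rhs)) return CheckResult::NaN;
  if (lhs >= rhs) return CheckResult::Pass;

  // Compare the size of the violation with the magnitude of the operands.
  const double scale = std::fmax(std::fabs(lhs), std::fabs(rhs));
  const double eps = std::numeric_limits<double>::epsilon();
  if (rhs - lhs <= relTol * eps * scale) return CheckResult::RoundOff;

  return CheckResult::GenuineFailure;
}
```

Printing the result together with the particle ids just before the assertion fires, for both the 1-thread and the multi-thread run, should show which of the three cases we are in.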
I think it is a critical assertion, because the particles are unsoftened in this function. Giovanni checked yesterday that it's due to neither NaNs nor round-off, so it really is an issue.
Ahh ok, yes. Sorry, I was thinking of it the other way around (not reading the code properly, oops), where we were in the smoothed gravity function but computing for direct-gravity particles. But it's the opposite, so yes, it is a critical assertion!!
Yes, that's right, it's a serious issue. Which compiler were you using when you found the problem in the meshless? Was it Intel 11.1 or another version?
No, just g++ (5.4) on my Mac. But like I said, I was debugging the meshless so was assuming it was a problem with that part. However, if I get that error again I'll post it here with more info if it helps narrow down the issue.
Very serious bug. If I just take the master branch (with some very simple modifications to make it compile, see 005df4c in branch intel_bug), compile with Intel (v11.1) and run a gravity test (e.g. freefall or bossbodenheimer), I get a crash straight away due to this assertion failing. I tried to debug it without success. It seems to be a parallelization error, because if I run with only 1 thread it doesn't crash. Thankfully the problem doesn't come up with a newer version of the Intel compiler (v15) or with gcc, but it makes me wonder whether it's a compiler problem or a problem in our code.