Closed zhuanhao-wu closed 6 years ago
I can reproduce the second one, but not the first one.
It seems like this line alone:
y = NMatrix.new([3, 3],[[1, 2, 3], [4, 1, 0], [5, 5, 2]], dtype: :float64)
is enough to cause a segfault on quit.
So, when we run the line above, it gets up to here: https://github.com/SciRuby/nmatrix/blob/80765a529b78e3da8e918e9e1e8938d52c9f9e20/ext/nmatrix/ruby_nmatrix.c#L2801 at which point a Ruby exception is raised. This causes it to jump back up to the nearest catch block, which in this case is all the way up in pry. So it skips all the remainder of the C code, including the cleanup bits with all the nmunregister* stuff. I suspect, but I haven't confirmed, that this is what causes the crash on quit. The problem is that it's pretty non-trivial to catch a Ruby exception in C code. The other option is that we manually check all the types in the array before trying to convert to floats (or whatever). This has the downside of introducing an extra check that just duplicates the check that would happen anyway (the one that raises the exception in the first place).
Could it be because the array being passed in is not flat? On Tue, Feb 2, 2016 at 4:19 PM Will Levine notifications@github.com wrote:
So, when we run the line above, it gets up to here: https://github.com/SciRuby/nmatrix/blob/80765a529b78e3da8e918e9e1e8938d52c9f9e20/ext/nmatrix/ruby_nmatrix.c#L2801 at which point a Ruby exception is raised. This causes it to jump back up to the nearest catch block, which in this case is all the way up in pry. So it skips all the remainder of the C code, including the cleanup bits with all the nmunregister* stuff. I suspect, but I haven't confirmed, that this is what causes the crash on quit. The problem is that it's pretty non-trivial to catch a Ruby exception in C code. The other option is that we manually check all the types in the array before trying to convert to floats (or whatever). This has the downside of introducing an extra check that just duplicates the check that would happen anyway (the one that raises the exception in the first place).
— Reply to this email directly or view it on GitHub https://github.com/SciRuby/nmatrix/issues/452#issuecomment-178858558.
Well, that's why the exception gets raised in the first place. I suppose we could support this usecase, but the larger issue is how to clean up properly when an Ruby exception gets raised deep in C code. I suspect that the same issue could also turn up elsewhere in the codebase.
Oy. Yeah, when we designed it, we didn't have the registration and unregistration. I suppose we just have to check argument sanity as soon as stuff comes in, as much as possible.
Just for the record, I think my diagnosis in the last couple of comments is not entirely correct. For one thing, NM_CONSERVATIVE(nm_register_value(&arg));
should have no effect in the standard build, since this macro resolves to nothing unless NM_GC_CONSERVATIVE
is defined. However, I still think that the problem is due to the Ruby exception causing a jump across C code that does some necessary cleanup.
Closed because I'm not sure it still applies. Someone can open a new issue if they like.
Hi there, I was testing how dtype performed a while ago and I found that on my computer, the input below may cause the pry to crash. (You may paste these input directly twice into pry: for the first time, it executes without error, but for the second time, pry will crash)
After some more try, I found that, if you input the following code into pry:
It executes without errors. But if you then
quit
pry, pry will crash. (pry crashes with similar behavior as #449 , maybe a segfault happened..)Not sure this the problem with nmatrix or pry. Hopefully you can reproduce the errors on your machine.