mrkn / pycall.rb

Calling Python functions from the Ruby language
MIT License

How to debug failing calls? #79

Closed: xxx closed this issue 5 years ago

xxx commented 5 years ago

Hi,

I really like this gem, but I'm running into an issue where a Python call to spaCy hangs my entire app. This happens only when I make the call through a Sinatra endpoint while the app is running. The tests (including the integration tests that go through the endpoint) all pass, and making this call in an irb/pry console also works correctly. It's literally only when it's called via the endpoint. This is clearly a huge wtf.

I've already tried switching out Sinatra, since I'm at the point of trying anything. Didn't help.

This app has a second endpoint, also using pycall to call into fastText, which works fine in all cases, so I'm guessing this is related to spaCy. I want to be able to debug it, but I don't know how. I lose the trail at return self.__send__(name, *args) in PyCall::PyObjectWrapper#method_missing.

Is there any way at all to get more info about what's happening here?

mrkn commented 5 years ago

Thank you for using pycall.

I cannot understand the issue you encountered without reproducible code. Please show the code, the environment in which it was running, and the outcome you got, such as error messages.

xxx commented 5 years ago

I've put up a small app that recreates the hang at https://github.com/xxx/pycall_hang

I'm just looking for pointers on how to debug the hang, since it only happens when I go through the web endpoint.

xxx commented 5 years ago

I've done some additional work on this, trying different Python versions (both 2.x and 3.x), different PyCall versions, different spaCy versions, replacing Sinatra with Roda, and replacing Puma with WEBrick and Thin, all to no effect. This may be GVL/ref count related, but I'm not sure. There's a dump of the thread stacks at https://gist.github.com/xxx/ac2ae0b8c97359e5ca2b39a528393a82

The one solution I've found so far that resolves it is to extract this functionality into a separate script and shell out to it from the web app, which is not incredible, but might be the only way forward at the moment.
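For readers wanting to try the shell-out workaround described above, here is a minimal sketch. It assumes the Python side is a standalone script (the name spacy_worker.py is hypothetical) that reads text on stdin and prints JSON on stdout; the helper name run_worker is likewise made up for illustration and is not part of pycall.

```ruby
require "json"
require "open3"

# Runs an external worker process, writes `text` to its stdin, and parses
# whatever it prints on stdout as JSON.
# `command` is an array such as ["python3", "spacy_worker.py"]; that script
# name is hypothetical and stands in for the extracted spaCy code.
def run_worker(command, text)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: text)

  # Surface failures instead of hanging or returning garbage.
  raise "worker failed (#{status.exitstatus}): #{stderr}" unless status.success?

  JSON.parse(stdout)
end
```

From an endpoint this could be called as run_worker(["python3", "spacy_worker.py"], params["text"]). Each request pays the cost of starting a Python process and loading the model, which is the trade-off for keeping Python out of the multi-threaded Ruby process.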

xxx commented 5 years ago

For whatever reason, I had never tried Unicorn for the app server. It now works as expected, so maybe there is some issue in multi-threaded environments. I'll close this out.

WRT my original question - using gdb on the hanging Ruby process will get you where you want, since the Python calls are all within the same process. Thanks again for this gem!

mrkn commented 5 years ago

@xxx I'm sorry for the late response. Your understanding is correct. Currently, pycall doesn't support multithreading.

ricardovj commented 4 years ago

I'm having the exact same problem with puma!

andreaslillebo commented 4 years ago

> I'm having the exact same problem with puma!

@ricardovj, the problem seems to be related to multithreading, which as previously mentioned is not yet supported.

For me, setting max_threads_count = 1 in the puma initializer solves the problem with the server freezing, though this might affect the server's performance when handling concurrent requests.
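As a rough illustration of the setting above, a Puma config file limited to a single thread might look like the sketch below. The file location config/puma.rb and the variable name follow the Rails default template and are assumptions about the setup; the commented-out workers line is an untested suggestion, not something confirmed in this thread.

```ruby
# config/puma.rb (sketch; adapt to your own setup)

# Run each Puma process with exactly one thread, so pycall is never
# entered from two request threads at once.
max_threads_count = 1
threads max_threads_count, max_threads_count # min threads, max threads

# Assumption: since the process-based Unicorn server worked earlier in this
# thread, adding worker processes may restore some concurrency.
# workers 2
```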