Closed tmm1 closed 12 years ago
+1 for Pygments::Socket.
Love this so much. :heart:
Way to not create a separate HTTP service for this btw. It's the cool way to solve these problems right now in case you didn't know.
Just to clarify, are you saying that you'd rather see the subprocess + pipe approach than external service + socket? I'm leaning towards that too.
@tanoku Yeah definitely.
@tanoku and @rtomayko Do you think an external service over a unix socket or TCP would give more flexibility for deployment and load balancing? GitHub may have more features that depend on Pygments in the future. Homogeneous workers are more diligent, aren't they?
tearing out the ffi usage of python should also fix github/github#4187
Full version of this has been implemented.
Present
Running a python VM inside the current ruby process continues to be problematic. There are reports of segfaults, problems with FFI, problems `rubypython` has finding `libpython`, and open bugs with multi-vm signal handling while inside python code.

Some of these issues are specific to `Pygments::FFI` and `rubypython`. But the alternative, `Pygments::C`, is too immature to use in production and would at a minimum require added exception handling code.

Past
`pygments.rb`'s predecessors, `albino` and `multipygmentize`, suffered from a limited API and poor performance. `multipygmentize` somewhat improved performance, but required additional work by the caller to make batch calls.

The benchmark isolates this performance problem to python startup and pygments library loading cost. In `pygments.rb`, we pay this startup cost only once in `Pygments.start`.

Future
The ideal implementation of `pygments.rb` then, is an API compatible interface that only pays startup cost once, but also provides isolation from the python code. Thus, `Pygments::Popen`.

Instead of a new process per invocation (like with `albino`), we keep a long-running python child and communicate with it over a pipe. To maintain the existing API and allow for future expansion, the protocol over the pipe can be simple bert-style RPC.

Alternatively, we could add `Pygments::Socket` and talk to a single pygments service over a tcp or unix socket. The advantages of this approach are limited, however, compared to the added complexity of packaging, scaling and deployment.
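
To make the long-running child and pipe idea concrete, here is a rough sketch of what the caller side could look like. It is not the actual `Pygments::Popen` code. Assumptions: `PipeClient`, `CHILD_SOURCE` and the `"highlight"` method name are hypothetical; a small Ruby script stands in for the python child so the example runs without python installed; and JSON behind a 4-byte length header stands in for real BERT encoding.

```ruby
require "json"
require "rbconfig"

# Hypothetical stand-in for the long-running python child. It reads
# length-prefixed JSON requests on stdin and writes length-prefixed JSON
# replies on stdout until stdin is closed.
CHILD_SOURCE = <<~'RUBY'
  require "json"
  $stdin.binmode
  $stdout.binmode
  $stdout.sync = true
  boot_pid = Process.pid # stays the same for every request this child serves
  while (header = $stdin.read(4))
    size = header.unpack1("N")
    request = JSON.parse($stdin.read(size))
    # A real child would call into pygments here; this one just wraps the input.
    reply = JSON.generate(
      "pid" => boot_pid,
      "result" => "<pre>#{request["args"].first}</pre>"
    )
    $stdout.write([reply.bytesize].pack("N") + reply)
  end
RUBY

# Hypothetical client: starts the child once, then reuses the same pipe
# for every call, so startup cost is paid a single time.
class PipeClient
  def initialize
    @io = IO.popen([RbConfig.ruby, "-e", CHILD_SOURCE], "r+")
    @io.binmode
  end

  # Send one request frame and block until the matching reply frame arrives.
  def call(method, *args)
    payload = JSON.generate("method" => method, "args" => args)
    @io.write([payload.bytesize].pack("N") + payload)
    @io.flush
    size = @io.read(4).unpack1("N")
    JSON.parse(@io.read(size))
  end

  # Closing the pipe gives the child EOF on stdin, which ends its loop.
  def close
    @io.close
  end
end

client = PipeClient.new
first  = client.call("highlight", "puts 1")
second = client.call("highlight", "puts 2")
same_child = first["pid"] == second["pid"] # both calls hit the same process
client.close
```

Because requests and replies are framed by length rather than by newlines, the payload can contain arbitrary source text, and new methods can be added later without changing how the pipe is read.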