jlong / radius

A small, but powerful tag-based template language for Ruby modeled after the ones used in MovableType and TextPattern. It has tags similar to XML, but can be used to generate any form of plain text (HTML, e-mail, etc...).
MIT License
101 stars 21 forks source link

The new Ragel parser is VERY slow #9

Closed jfahrenkrug closed 13 years ago

jfahrenkrug commented 13 years ago

I've updated my performance project to add a ruby-prof test to it and the ruby-prof output for 0.8.1 and 0.9.1 as html files: https://github.com/jfahrenkrug/Radiant-Performance-Decline

It clearly shows that the new Ragel parser in Raduis 0.6.1 is horribly slow: it looks at every single byte.

See this ticket: https://github.com/radiant/radiant/issues/247

jfahrenkrug commented 13 years ago

I've written an isolated test that uses the Radius Ragel state machine on 460k of the letter 'a'. It takes about 9 seconds on my 2007 MBP: https://gist.github.com/1084817

Run it with Ragel like so:

ragel -R radius_test.rl && ruby radius_test.rb

Honestly, looking at the crazy amount of ugly code that Ragel generates, I think the best option is to revert back to normal Regexes to find the radius tags. It seemed to work fine before, I don't know why the switch to Ragel was even necessary. I highly doubt that it's even possible to bring the crazy Ragel auto-generated code to perform at a decent speed at all.

jfahrenkrug commented 13 years ago

Ok, I've found something else, a presentation about Ragel by Ryan Dahl: http://www.slideshare.net/adorepump/ragel-talk

He's saying that Ragel-generated Ruby code is SLOW. He's right :) He suggest to build a Ruby C-extension for Ragel stuff. That sounds like a great idea!

jocubeit commented 13 years ago

Don't you love the silence?

Papipo commented 13 years ago

Why don't you use temple to compile to ERB?

jfahrenkrug commented 13 years ago

@MeetDom: yes :) but it's free and opensource, so I won't complain.

@Papipo: what's temple and how would it help with radius tags?

Papipo commented 13 years ago

Temple: https://github.com/judofyr/temple

I have a proof of concept template engine using it: https://github.com/Papipo/tongo

bkerley commented 13 years ago

He's saying that Ragel-generated Ruby code is SLOW. He's right :) He suggest to build a Ruby C-extension for Ragel stuff. That sounds like a great idea!

The grammar is there, but we (including everyone in this thread) should probably refactor it to allow for pure-Ruby, Java-backed, and C-backed scanners.

As it is right now, I'm no longer paid to work on Radius, but I'm completely willing to review code for it.

saturnflyer commented 13 years ago

I appreciate all the feedback here. I'm completely new to Ragel, so I'm leaning on direction from others. And I've been wondering about alternative implementations. Not sure of @jlong's opinion

jlong commented 13 years ago

I'll defer to @bkerley. I'm not familiar enough with Ragel to weigh in here. Honestly Radius is in need of a superhero, so if one of you wants to jump in and take the lead, be my guest.

jfahrenkrug commented 13 years ago

Just a short status update: Thomas Schönemann and I have been able to port the core parser to C, but it's still very much a work in progress. A lot remains to be done and tested and then the whole thing has to be "wrapped" in a Ruby C extension. But progress is being made :)

ehaselwanter commented 13 years ago

do u see speed improvements? is it just the scanner which is slow?

jlong commented 13 years ago

@jfahrenkrug that's really cool. Let me give you access to the repo and you can push to a branch.

jlong commented 13 years ago

@jfahrenkrug You have access now. Can you put this in a branch?

jgarber commented 13 years ago

Great job @jfahenkrug! I don't mean to discourage you, but if I can weigh in: I've been down the Ragel C road with RedCloth and it sucked. That's why I'm rewriting RedCloth in Parslet. Tying yourself to C locks out a number of surprisingly loud people. ;-) Thankfully, Ola Bini did all the work of writing the Java code. Still, I get all sorts of bug reports just related to compiling and platform.

If you do stick with the C extension, make it a separate gem so that different Radius gems can be dependencies depending on the platform. I think that's possible.

Parslet is really pretty fast and pleasant to work in. You should check it out!

jlong commented 13 years ago

Parselet looks really nice!

bkerley commented 13 years ago

Make it like the relationship between the bson and bson-ext gems :)

On Jul 28, 2011, at 10:50, jgarber wrote:

Great job @jfahenkrug! I don't mean to discourage you, but if I can weigh in: I've been down the Ragel C road with RedCloth and it sucked. That's why I'm rewriting RedCloth in Parslet. Tying yourself to C locks out a number of surprisingly loud people. ;-) Thankfully, Ola Bini did all the work of writing the Java code. Still, I get all sorts of bug reports.

If you do stick with the C extension, make it a separate gem so that different Radius gems can be dependencies depending on the platform. I think that's possible.

Parslet is really pretty fast and pleasant to work in. You should check it out!

Reply to this email directly or view it on GitHub: https://github.com/jlong/radius/issues/9#issuecomment-1674287

jfahrenkrug commented 13 years ago

Parslet does look nice, but I'm not sure if it will solve our problem:

> + Any performance benchmarks?

Official ones, no. Working with it: I just might have gotten the
impression that there is room for improvement. (slow!) This is certainly
the next thing on the menu. For example, it uses exceptions heavily
which isn't the fastest thing to do. Can be and will be changed.

Source: http://www.ruby-forum.com/topic/760037

Does anyone have any benchmarks?

jfahrenkrug commented 13 years ago

What about treetop? http://treetop.rubyforge.org/

jfahrenkrug commented 13 years ago

Ok, parslet is not an option. This super simple parser took over 15 seconds to parse 46.000 'a's:

require 'rubygems'
require 'parslet' 

class Mini < Parslet::Parser
  rule(:integer) { match('[a-z]').repeat(1) }
  root(:integer)
end

t = Time.now
p Mini.new.parse('a' * 46000)
t2 = Time.now - t
puts t2

As a comparison, it takes the current ragel parser about 6-9 seconds to parse ten times that much.

jfahrenkrug commented 13 years ago

Citrus takes about 7 seconds for 460k of "a"s. So much better than parslet, but not better than the current ragel ruby parser.

jfahrenkrug commented 13 years ago

Treetop seems to be the best of the bunch: 3.2 seconds for 460k of "a"s. Those tests are not representative, though. These tests were a lot simpler than the current radius parser.

jfahrenkrug commented 13 years ago

Before I continue doing any work on the Ragel C port: Why is Radius using Ragel is the first place? Why didn't it stick to using the fast Regex parser of the 1.5 series?

bkerley commented 13 years ago

The regex parser was O(2n) for the number of attributes on a tag, which meant that one tag the application I worked on at the time would take several months to parse.

On Jul 29, 2011, at 6:26, jfahrenkrug wrote:

Before I continue doing any work on the Ragel C port: Why is Radius using Ragel is the first place? Why didn't it stick to using the fast Regex parser of the 1.5 series?

Reply to this email directly or view it on GitHub: https://github.com/jlong/radius/issues/9#issuecomment-1680788

jfahrenkrug commented 13 years ago

@bkerley are you serious?

jfahrenkrug commented 13 years ago

Ok, I've dropped the Ragel to C path. I've converted the core parser back to regex:

https://github.com/jlong/radius/commit/33470a02daf2a0e630aa41f30fc351f02b9324ec and https://github.com/jlong/radius/commit/a16c473370f8bc23f1a6316a74f3a05b6d5a9bb2

All the tests pass and it's WAAAYYY faster. I have not touched the JavaScanner, though. IMHO Ragel is overkill to find some simple tags in a piece of text. Please let me know what you think.

jocubeit commented 13 years ago

@jfahrenkrug you were echoing my thoughts. Great work. I'll give it a test run.

jlong commented 13 years ago

@bkerley Did you have tests in there that replicate the performance problems you were having?

John Long http://wiseheartdesign.com

bkerley commented 13 years ago

I do, and it seems to pass just fine. A big win!

On Jul 29, 2011, at 14:44, jlongreply@reply.github.com wrote:

@bkerley Did you have tests in there that replicate the performance problems you were having?

John Long http://wiseheartdesign.com

Reply to this email directly or view it on GitHub: https://github.com/jlong/radius/issues/9#issuecomment-1684864

saturnflyer commented 13 years ago

@jfahrenkrug thank you! is this complete and ready to be merged?

jfahrenkrug commented 13 years ago

Yes, it is :)

On 02.08.2011, at 06:16, saturnflyerreply@reply.github.com wrote:

@jfahrenkrug thank you! is this complete and ready to be merged?

Reply to this email directly or view it on GitHub: https://github.com/jlong/radius/issues/9#issuecomment-1706430

jfahrenkrug commented 13 years ago

@saturnflyer Should I merge it or do you or @jlong want to do it?

saturnflyer commented 13 years ago

done! Thanks everyone for your input, especially @jfahrenkrug