robinst / autolink-java

Java library to extract links (URLs, email addresses) from plain text; fast, small and smart
MIT License
207 stars 40 forks

Adapt autolink-java to replace rinku in JRuby #20

Open headius opened 6 years ago

headius commented 6 years ago

Hello! I am working on getting the Discourse app to run in JRuby, and need to replace its dependency on rinku.

There are two ways we typically do this:

The latter would be preferable, since all we'd need to write is a bit of Ruby to wrap your library.

However, there are a few things that would make this integrate better with JRuby:

Here's a quick-and-dirty rinku-like wrapper based on the example code from your README. It can serve as a starting point for discussion: https://github.com/headius/jruby-autolink

Discourse on JRuby work: https://meta.discourse.org/t/getting-discourse-running-on-jruby/81273/14

Issue to make a JRuby port of rinku: https://github.com/vmg/rinku/issues/75

robinst commented 6 years ago

Hey Charles! Thanks for getting in contact. I've used JRuby myself before, so I'm eager to help make it work better there if it makes sense.

> CharSequence is great, but the API produces String eventually. This means JRuby's byte[]-based Ruby strings need an extra conversion step, which will obviously slow down the rendering of a large document.

Can you clarify what this means? Would you want to have another API that works on UTF-8 byte[] instead?
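For readers following along, here is a minimal sketch of the extra conversion step being discussed, assuming a String-producing renderer sits between two UTF-8 byte[] buffers. The class `ByteRoundTrip` and the methods `render` and `renderUtf8` are hypothetical names made up for illustration; they are not part of autolink-java or JRuby.

```java
import java.nio.charset.StandardCharsets;

class ByteRoundTrip {
    // Hypothetical stand-in for any renderer that takes a CharSequence
    // and returns a String. It is NOT autolink-java's API; it returns
    // the input unchanged so the conversion cost is the only thing shown.
    static String render(CharSequence input) {
        return input.toString();
    }

    // What a byte[]-based caller has to do around a String-based API.
    static byte[] renderUtf8(byte[] utf8) {
        // Copy 1: decode UTF-8 bytes into a String (a CharSequence).
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        // String-based processing happens here.
        String rendered = render(decoded);
        // Copy 2: encode the resulting String back into UTF-8 bytes.
        return rendered.getBytes(StandardCharsets.UTF_8);
    }
}
```

Both copies are proportional to the document size, which is why the cost shows up on large inputs. An API that reads and writes UTF-8 byte[] directly would avoid them, at the price of a second code path to maintain.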

> Compatibility with rinku. I'm not sure how to map the features of rinku to autolink-java and will need some tips here.

Yeah. To be clear, this library started as a port but the logic has since been tweaked and does not match rinku's logic 100%. Some other important differences: