Anyolite / anyolite

Embedded mruby/Ruby for Crystal
https://anyolite.github.io/anyolite
MIT License

Using Anyolite breaks regexes on UTF-8 strings #17

Closed willhbr closed 2 years ago

willhbr commented 2 years ago

If you require "anyolite" and create an RbInterpreter, Crystal regexes stop working. I assume this is because mruby links in a version of PCRE that has been compiled without UTF-8 support, and Crystal strings are UTF-8.

This example program:

require "anyolite"
puts "foobar".match /f/
Anyolite::RbInterpreter.create do |rb|
end

Fails with:

Unhandled exception: this version of PCRE is compiled without UTF support at 0 (ArgumentError)
  from /usr/share/crystal/src/regex.cr:262:5 in 'initialize'
  from /usr/share/crystal/src/regex.cr:256:3 in 'new'
  from test.cr:3:21 in '~$Regex:0:init'
  from /usr/share/crystal/src/crystal/once.cr:25:54 in 'once'
  from /usr/share/crystal/src/crystal/once.cr:50:3 in '__crystal_once'
  from ?? in '~$Regex:0:read'
  from test.cr:3:21 in '__crystal_main'
  from /usr/share/crystal/src/crystal/main.cr:115:5 in 'main_user_code'
  from /usr/share/crystal/src/crystal/main.cr:101:7 in 'main'
  from /usr/share/crystal/src/crystal/main.cr:127:3 in 'main'
  from /lib/x86_64-linux-gnu/libc.so.6 in '??'
  from /lib/x86_64-linux-gnu/libc.so.6 in '__libc_start_main'
  from /home/will/.cache/crystal/crystal-run-test.tmp in '_start'
  from ???

Is it possible to link in PCRE with UTF support?

Thanks :)

$ crystal --version; uname -a
Crystal 1.5.0 [994c70b10] (2022-07-06)

LLVM: 10.0.0
Default target: x86_64-unknown-linux-gnu
Linux jared 5.15.0-43-generic #46-Ubuntu SMP Tue Jul 12 10:30:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Hadeweka commented 2 years ago

That is an interesting bug, but it might explain some issues I had with regular expressions on the mruby side at one point.

I will look into this problem, thank you for reporting it!

Hadeweka commented 2 years ago

Temporary workaround:

I added a no-regex branch to Anyolite, which excludes the problematic Regex gem from the mruby configuration (passing a custom mruby configuration file with the respective line removed does the same).

Your code snippet works perfectly fine with that branch. However, you can obviously only use Regexes in Crystal then, not in mruby.

The gem is relatively unmaintained, so it might be best practice to remove it from Anyolite anyway (at least until I find a better replacement). I will keep this issue open until there's a way to have Regexes in both languages.

existXFx commented 2 years ago

  1. Remove the line conf.gem :mgem => 'regexp-pcre' from the file lib/anyolite/utility/mruby_build_config.rb
  2. Remove the Anyolite build directory
  3. Run rake build_shard

This should also work.
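For anyone who would rather script step 1 than edit the file by hand, here is a minimal Ruby sketch. The helper name strip_regexp_pcre and the sample config text are hypothetical and only illustrate the idea; the real mruby_build_config.rb may contain different lines, and the pattern assumes the gem is declared with the hash-rocket syntax quoted in step 1.

```ruby
# Hypothetical helper (not part of Anyolite): returns the given build
# config text with any line declaring the regexp-pcre mgem removed.
# Assumes the gem is declared as: conf.gem :mgem => 'regexp-pcre'
def strip_regexp_pcre(config_text)
  pattern = /^\s*conf\.gem\s+:mgem\s*=>\s*['"]regexp-pcre['"]/
  config_text.lines.reject { |line| line.match?(pattern) }.join
end

# Illustrative sample only; this is NOT the actual content of
# lib/anyolite/utility/mruby_build_config.rb.
sample_config = <<~CONFIG
  conf.gembox 'default'
  conf.gem :mgem => 'regexp-pcre'
CONFIG

# Prints the sample with the regexp-pcre line dropped.
puts strip_regexp_pcre(sample_config)
```

To apply this to the real file, one could read it with File.read, pass the text through the helper, and write the result back with File.write, then continue with steps 2 and 3 as described.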

willhbr commented 2 years ago

Perfect, thanks! I've checked and this works on my project.

Hadeweka commented 2 years ago

I've now also removed the regex gem from the main branch, since it doesn't work at all with Crystal 1.5.1.

I will close this issue for now, but I will try to find a replacement for the gem.

Hadeweka commented 2 years ago

Update: mruby has Regex support again.

I really don't know how I didn't just wrap the Crystal Regex class before... 😄