d-unsed / ruru

Native Ruby extensions written in Rust
MIT License

Why is the `RString` API using Rust's `String`/`str` instead of `Vec<u8>`/`[u8]`? #97

Open CodesInChaos opened 6 years ago

CodesInChaos commented 6 years ago

My understanding is that Ruby's strings are sequences of arbitrary bytes, even if the associated encoding is UTF-8. So the natural mapping to Rust would be `[u8]` and `Vec<u8>` rather than `str` and `String` for most purposes. A couple of helpers using `str` should be fine (e.g. the `from_utf8` function).

Functions that convert a Ruby string to `&str` or `String` without verifying UTF-8 validity must be marked `unsafe`; otherwise they are unsound.
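To illustrate the distinction, here is a minimal sketch of the three kinds of accessor. `RubyBytes` and its methods are hypothetical stand-ins for illustration only, not ruru's actual API; only `std::str::from_utf8` and `std::str::from_utf8_unchecked` are real standard-library functions.

```rust
use std::str::Utf8Error;

/// Hypothetical stand-in for the raw bytes held by a Ruby string.
/// This is NOT ruru's `RString`; it only models the byte buffer.
pub struct RubyBytes(Vec<u8>);

impl RubyBytes {
    pub fn new(bytes: Vec<u8>) -> Self {
        RubyBytes(bytes)
    }

    /// Always sound: any byte sequence is a valid `[u8]`.
    pub fn as_bytes(&self) -> &[u8] {
        &self.0
    }

    /// Sound: the bytes are validated before a `&str` is handed out.
    pub fn to_str(&self) -> Result<&str, Utf8Error> {
        std::str::from_utf8(&self.0)
    }

    /// Skips validation, so it has to be an `unsafe fn`.
    ///
    /// # Safety
    /// The caller must guarantee that the bytes are valid UTF-8.
    pub unsafe fn to_str_unchecked(&self) -> &str {
        unsafe { std::str::from_utf8_unchecked(&self.0) }
    }
}
```

With this shape, a Ruby string holding bytes such as `0xff 0xfe` is still reachable through `as_bytes`, is rejected with an `Err` by `to_str`, and can only be turned into a `&str` without a check by a caller who writes `unsafe` and takes responsibility for validity.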

danielpclark commented 6 years ago

Thanks for opening this issue. I'm going to be investigating encoding support through ruru this month, so this info may come into play. I'll be sure to post any useful information I find here.

I've been waiting for @d-unseductable to reappear on the scene of his project here, hoping to get this project moving forward. I don't know how long I'm willing to wait, but I'm pretty close to making an official fork of this project and re-implementing a few key components to be safer, such as you've suggested. (I would be much happier as a co-maintainer of this project, after discussing with Dmitry what his vision for the future of ruru is.) One such change is that new class instantiation should return a `Result<AnyObject, AnyException>` rather than `AnyObject`, as Ruby is fully capable of raising exceptions in class instantiation. I've opened an issue for discussion on that specific point here: https://github.com/d-unseductable/ruru/issues/91
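A rough sketch of what a `Result`-returning instantiation could look like. The `AnyObject` and `AnyException` structs below are simplified hypothetical stand-ins, not ruru's real types, and `new_instance` is an invented function that only imitates a Ruby constructor that may raise.

```rust
/// Simplified hypothetical stand-in for a Ruby object handle.
#[derive(Debug, PartialEq)]
pub struct AnyObject(pub String);

/// Simplified hypothetical stand-in for a raised Ruby exception.
#[derive(Debug, PartialEq)]
pub struct AnyException(pub String);

/// Hypothetical instantiation helper: instead of assuming the constructor
/// succeeds, a raised exception is surfaced as `Err`.
pub fn new_instance(class_name: &str, arg: i64) -> Result<AnyObject, AnyException> {
    // Imitates a Ruby `initialize` that raises on invalid input.
    if arg < 0 {
        return Err(AnyException(format!(
            "ArgumentError: {} expects a non-negative argument",
            class_name
        )));
    }
    Ok(AnyObject(format!("#<{} {}>", class_name, arg)))
}
```

The caller is then forced to handle the failure case, e.g. with `match` or `?`, rather than receiving an object that may never have been constructed.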

But my hope is really to see this project thrive and do well, so I'm still holding out hope that Dmitry shows up to revive it.

It looks like Helix just took this issue on for their project here: https://github.com/tildeio/helix/commit/faaa6b1b263bb3a388169f459042cd81a3937f58