danielpclark / rutie

“The Tie Between Ruby and Rust.”
MIT License

ArgumentError: unknown encoding name: ascii-8bit #94

Open anna-is-cute opened 5 years ago

anna-is-cute commented 5 years ago

I have a bizarre issue and was hoping you might be able to shed some light on it. I have a project that I'd like to use the rouge syntax highlighter with, so I'm using rutie to interop with ruby.

In all my tests before integrating the code into my main project, rutie + rouge was working fine. I call VM::init() and VM::init_loadpath(), then eval a small ruby file I wrote and use the class inside of it to interact with rouge, and that all works.

However, when I moved this code into my main project, a most peculiar error occurs: the title of this issue. So, in my main project, I added the below to the very start of main().

rutie::VM::init();
rutie::VM::init_loadpath();
eprintln!("{:#?}", rutie::Encoding::find("utf-8"));
eprintln!("{:#?}", rutie::Encoding::find("ascii-8bit"));
eprintln!("{:#?}", rutie::Encoding::find("us-ascii"));

Surprisingly, utf-8 and us-ascii both exist, but ascii-8bit does not.

This appears to be an issue specifically with the way my binary is compiled, as different ruby versions and different environments (this is being run in a docker container, but building and running the binary on my actual computer results in the same) produce the same error. When I run that snippet above in a plain project with just a rutie dependency and no other code, it works fine.

Note that running the binary in the docker container will exhibit the error, but I can also start irb in the same container and verify that the ascii-8bit encoding works fine, so I'm pretty baffled.
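
For comparison, the same three lookups can be run from plain Ruby, for example in irb inside the container. This is only a minimal sketch: `probe_encodings` is a hypothetical helper name, not part of rutie or rouge, and it relies on nothing beyond `Encoding.find`, which raises `ArgumentError` for a name it cannot resolve.

```ruby
# Hypothetical helper: maps each encoding name to the Encoding object
# Ruby resolves it to, or to nil when the lookup raises ArgumentError.
def probe_encodings(names)
  names.each_with_object({}) do |name, results|
    results[name] = begin
      Encoding.find(name)
    rescue ArgumentError
      nil
    end
  end
end

# Same names as the Rust snippet at the top of the issue.
probe_encodings(%w[utf-8 ascii-8bit us-ascii]).each do |name, encoding|
  puts "#{name}: #{encoding ? encoding.inspect : 'not found'}"
end
```

If all three names resolve here but not in the embedded VM, the difference lies in how the VM was set up rather than in the Ruby installation itself.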

Is there something that could be happening during compilation or linking that could be screwing this up? That's all I can think at this point, considering it's just this binary that is afflicted (and across platforms). Maybe it's something completely different! Hopefully you have some input I haven't thought of.

danielpclark commented 5 years ago

Have you tried:

VM::require("enc/encdb");
VM::require("enc/trans/transdb");

after VM::init_loadpath ?

Examples of these are shown in RString.codepoints and RString::from_bytes.

anna-is-cute commented 5 years ago

I have not, but I will do that tonight! It strikes me as odd that everything works as expected in a simple project but in a more complex project, the encoding goes missing. Either way, I'll post again with results when I test it tonight.

danielpclark commented 5 years ago

Cool. I have found that each operating system has a few minimum codecs available by default, depending on the OS, and to access the rest of them you need to do those additional requires.

anna-is-cute commented 5 years ago

Okay, I didn't get around to testing this until now.

backend_1   | /usr/lib/ruby/2.5.0/rubygems.rb:9: warning: failed to load encoding (ascii-8bit); use ASCII-8BIT instead
backend_1   | thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Ok(#<NoMethodError: undefined method `empty?' for nil:NilClass>)', src/libcore/result.rs:999:5

Assuming the NilClass here is the encoding it didn't find. All my code works as intended elsewhere, so it's not me.

So that's interesting. I was also getting stack level too deep, which was fun, but even if I got rid of that, I'd end up with the error above. So apparently it still can't find the encoding, although now it doesn't crash trying to find it. I'm not sure how it can not exist, to be honest.

danielpclark commented 5 years ago

If you read the warning you just posted it says “failed to load encoding (ascii-8bit); use ASCII-8BIT instead”. So I think uppercase is what you should try.

anna-is-cute commented 5 years ago

They both work; the encoding is missing. That error comes from rubygems.rb, so I have no control over this.
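
A quick way to confirm that letter case is not the variable here, assuming a stock Ruby interpreter: `Encoding.find` matches names case-insensitively, so both spellings resolve to the same encoding. `same_encoding?` is a hypothetical helper name used only for this sketch.

```ruby
# Hypothetical helper: true when both names resolve to the same encoding.
def same_encoding?(first_name, second_name)
  Encoding.find(first_name) == Encoding.find(second_name)
end

puts same_encoding?("ascii-8bit", "ASCII-8BIT")
```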

danielpclark commented 5 years ago

Looking into it (https://idiosyncratic-ruby.com/56-us-ascii-8bit.html): if it's simply bytes, you can use the method RString.to_bytes_unchecked. Before doing that you can grab the current encoding, and afterwards convert it back with RString::from_bytes.

If you want to work with normal encodings with multi-bytes then RString.codepoints is what you want.
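
The round trip described above can be sketched on the Ruby side. This is an analogy in plain Ruby, not rutie's RString API: `round_trip_bytes` is a hypothetical helper, and `String#bytes`, `Array#pack` and `String#force_encoding` stand in for the Rust calls.

```ruby
# Hypothetical helper: remembers the encoding, takes the raw bytes,
# then rebuilds a string and tags it with the original encoding again.
def round_trip_bytes(text)
  original_encoding = text.encoding
  bytes = text.bytes
  restored = bytes.pack("C*").force_encoding(original_encoding)
  [bytes, restored]
end

text = "h\u00e9llo"                 # contains one two-byte UTF-8 character
bytes, restored = round_trip_bytes(text)
puts bytes.length                   # byte count, not character count
puts restored == text
puts text.codepoints.inspect        # one integer per character
```

`pack("C*")` produces a binary (ASCII-8BIT) string, which is why the saved encoding has to be reapplied before the result compares equal to the input.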

anna-is-cute commented 5 years ago

If it was me, I may do that, but this error is coming from the actual rubygems gem. The code of that gem is not under my control. The issue is not the code, though: again, the code works perfectly fine in a minified crate. Once I try to use my helper crate inside of another, bigger crate, something goes wrong and it can't find the encoding.

danielpclark commented 5 years ago

:thinking: hmmm, I wonder if it's related to the Rust C linker regression… https://github.com/rust-lang/cargo/issues/4044 ? Setting the environment variable LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on Mac to the output of ruby -e "puts RbConfig::CONFIG['libdir']" gets past that particular issue and may help. :man_shrugging: You can see my comment towards the end of that issue.
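
A small sketch of the environment-variable suggestion above, assuming the Ruby on the PATH is the one the binary links against. `libruby_search_path` is a hypothetical helper; it picks the variable name from `RbConfig::CONFIG["host_os"]` and pairs it with the same `libdir` value the one-liner prints.

```ruby
require "rbconfig"

# Hypothetical helper: returns [variable_name, libdir] for the running Ruby.
# macOS uses DYLD_LIBRARY_PATH; everything else is treated as Linux-like here.
def libruby_search_path(host_os = RbConfig::CONFIG["host_os"])
  variable = host_os.match?(/darwin/) ? "DYLD_LIBRARY_PATH" : "LD_LIBRARY_PATH"
  [variable, RbConfig::CONFIG["libdir"]]
end

variable, libdir = libruby_search_path
puts "export #{variable}=#{libdir}"
```

The printed line can be pasted into the shell, or the Dockerfile, before starting the binary.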

anna-is-cute commented 5 years ago

That would be interesting, considering it appears to find libruby.so fine, but I'll try it nonetheless. Will update with results soon.

danielpclark commented 5 years ago

I just had an idea... Try:

VM::require("enc/ascii");

as well. Judging by the directory structure the other requires use, that file exists: https://github.com/ruby/ruby/tree/master/enc

anna-is-cute commented 5 years ago

Hah! I'll feel quite silly if that works.

anna-is-cute commented 5 years ago

Unfortunately, neither change fixed the issue. LoadError: cannot load such file -- enc/ascii is the issue with requiring enc/ascii, so that's concerning.
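
One way to see why a `require` fails with LoadError is to check which files on the load path could satisfy it. The sketch below is an assumption-laden illustration: `feature_candidates` is a hypothetical helper, and it only looks for `.rb` files and the platform's extension suffix from `RbConfig::CONFIG["DLEXT"]`. What it finds depends on how Ruby was built.

```ruby
require "rbconfig"

# Hypothetical helper: lists files on the load path that could satisfy
# `require feature`, either Ruby source or a compiled extension.
def feature_candidates(feature, load_path = $LOAD_PATH)
  extensions = ["rb", RbConfig::CONFIG["DLEXT"]]
  paths = load_path.flat_map do |dir|
    extensions.map { |ext| File.join(dir.to_s, "#{feature}.#{ext}") }
  end
  paths.select { |path| File.file?(path) }
end

%w[enc/encdb enc/trans/transdb enc/ascii].each do |feature|
  found = feature_candidates(feature)
  puts "#{feature}: #{found.empty? ? 'no file on load path' : found.join(', ')}"
end
```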

danielpclark commented 5 years ago

Yeah. The libruby path issue is more of a Rust application side of things and not a Ruby gems thing. Also the earlier requires enc/encdb and enc/trans/transdb both have Init_… methods which can be required. enc/ascii doesn't have that.

I think this will take more time than I have today to look into. Looking at the ascii.c file it defines one type and the rest is just uses of C macros alias and replicate. I may need to look into that.

danielpclark commented 5 years ago

I did find the relevant area for both the enc/encdb require and the encoding ascii-8bit you're interested in here: https://github.com/ruby/ruby/blob/master/template/encdb.h.tmpl It's a header template which is used to generate what's available in the language when Ruby is compiled. Maybe it will be useful to you?

anna-is-cute commented 5 years ago

Yeah, I'm doing the following:

      VM::init();
      VM::init_loadpath();
      VM::require("enc/encdb");
      VM::require("enc/trans/transdb");
      VM::require("enc/ascii");
      VM::eval(HIGHLIGHT_RB).unwrap();

I'll have to look into that link. I've tried all that I can think of to get the encoding to exist, but I remain so baffled as to why this issue only pops up in a larger project where the same code works by itself. I thought maybe there was some ruby cross-contamination in other crates, but I can't find any reference to other ruby crates in Cargo.lock except for my own usage.

Thanks for looking into this, though. Hopefully we can nab the root cause.

danielpclark commented 5 years ago

You can remove the VM::require("enc/ascii");. If it were allowed to be required the ascii.c file would have the following code in it.

void
Init_ascii(void)
{
#include "for_example_some_C_header_file_here_for_ascii"
}

danielpclark commented 5 years ago

Just curious. In the first link I don't see where you include anything from rouge like use rouge;

Also why do you need ascii-8bit at all? That's typically used for handling bytes in Ruby but I don't see that particular use in the highlights.rb file. The paste project doesn't use it and rouge only checks for it as it's a subset of UTF8 but doesn't use it specifically.

Also since your crate is digging a file out from another project to eval you could wrap that in any Ruby code you wanted in the crate for changes, refinements, monkey-patches, or encoding changes beforehand.

Basically this is me taking a long time to ask why this encoding? And where, if anywhere, is it used?

anna-is-cute commented 5 years ago

“In the first link I don't see where you include anything from rouge like use rouge;”

The first link is literally a link to rouge::Rouge::new().

The second link evals this file.

“Basically this is me taking a long time to ask why this encoding? And where, if anywhere, is it used?”

Again, I'm not using the encoding. None of my code does. I need to require 'rouge', and to do that, I need to require 'rubygems'. rubygems uses the encoding.

danielpclark commented 5 years ago

“The first link is literally a link to rouge::Rouge::new().”

Are you saying that Rust lets you use a crate without calling use?

anna-is-cute commented 5 years ago

Yes.

danielpclark commented 5 years ago

“I need to require 'rouge', and to do that, I need to require 'rubygems'.”

I just git cloned rouge and ran bundle install. The Gemfile.lock doesn't have a rubygems dependency. Why do you think you need rubygems?

anna-is-cute commented 5 years ago

Because the rubygems gem overrides require. If I gem install rouge and try to require 'rouge' using rutie, it will fail. If I require 'rubygems', I can require 'rouge'. This really isn't the issue at hand, though.
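
Whether `require` has been overridden can be checked from Ruby itself. This is a minimal sketch with a hypothetical helper, `require_origin`: `source_location` returns nil for a method implemented in C, and a `[path, line]` pair when a Ruby-level definition (such as the one RubyGems installs) is active.

```ruby
# Hypothetical helper: reports where the active Kernel#require was defined.
# Returns [path, line], or nil when the method is implemented in C.
def require_origin
  Kernel.instance_method(:require).source_location
end

origin = require_origin
puts defined?(Gem) ? "RubyGems is loaded" : "RubyGems is not loaded"
puts origin ? "require defined at #{origin.join(':')}" : "require is implemented in C"
```

Running this once in irb and once through the embedded VM would show whether the two environments start with the same `require`.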

danielpclark commented 5 years ago

It's helpful info to me nonetheless. I hadn't considered possible dependency management situations for this project.

danielpclark commented 5 years ago

Just an idea. If you need the custom require from rubygems you can load it yourself (copy & paste into your crate): https://github.com/rubygems/rubygems/blob/master/lib/rubygems/core_ext/kernel_require.rb

And if you can't change the source code in the main repository for requiring highlight.rb then when you include it in your crate simply remove the first line (from the String in Rust) which has the require 'rubygems' before you perform the eval.
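
The line-stripping idea can be sketched as follows. In the thread it would be applied to the Rust string before the eval; the version below expresses the same transformation in Ruby, and `strip_rubygems_require` is a hypothetical helper name. Whether the remaining code still loads without RubyGems is a separate question.

```ruby
# Hypothetical helper: drops a `require 'rubygems'` line, but only when it
# is the very first line of the source. Anything else is left untouched.
def strip_rubygems_require(source)
  source.sub(/\Arequire\s+['"]rubygems['"][ \t]*\r?\n?/, "")
end

script = "require 'rubygems'\nrequire 'rouge'\n"
puts strip_rubygems_require(script)
```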

anna-is-cute commented 5 years ago

Would that not require me to copy and paste a whole bunch of things like the Gem class, the activation monitor, etc.? I don't really think that will work as a solution. I'm fine without a workaround right now; I want to get at the cause of why the encoding is going missing.