Closed joshleblanc closed 5 years ago
@HorizonShadow we have had this reported before in #4546. It may very well be from a totally different cause: if you read down that issue, someone had a random comment about Java at the top of a Ruby file with coding=UTF8 on that line, so Ruby interpreted it as a pragma.
UTF8 is not actually a valid encoding string for Ruby, but it seems that Java will accept it. Could you look around and figure out how this is coming into the picture (-Dfile.encoding=UTF8 or a Ruby comment)? In the issue above we debated bending Ruby compat to allow this string, but before reconsidering that it would be nice to see how it is entering the system.
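For anyone wanting to check the encoding-name claim on their own runtime, a minimal sketch follows. `encoding_known?` is a made-up helper name, not part of Ruby or JRuby. On MRI I would expect "UTF8" to come back as unknown; a JRuby whose jcodings carries an alias for it may report otherwise.

```ruby
# Returns true when Ruby recognizes +name+ as an encoding name or alias.
# `encoding_known?` is a hypothetical helper used only for illustration.
def encoding_known?(name)
  Encoding.find(name)
  true
rescue ArgumentError
  # Encoding.find raises ArgumentError for names it does not know.
  false
end

%w[UTF-8 UTF8].each do |name|
  puts "#{name}: #{encoding_known?(name) ? 'recognized' : 'unknown'}"
end
```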
There's nothing containing "UTF" anywhere in my project, however that comment you're talking about is littered throughout various libraries that are included. Mainly the tzinfo-data library.
@HorizonShadow oh yeah I guess I did not mean exactly your code but any code you used. You mean "UTF8" versus "UTF-8" right? I just perused tzinfo-data gem current source and only see UTF-8 although that is current HEAD.
Hm, guess it was "UTF-8" I was seeing.
I checked again for UTF8, and didn't find anything of note in my project, or dependencies.
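Searching by hand is easy to get wrong, so here is a rough sketch of how the magic-comment search could be automated. It assumes the problem would show up as a `coding=`/`coding:` comment on the first or second line of a file; `suspect_magic_comments` and `MAGIC_COMMENT` are names invented for this example.

```ruby
# Captures the encoding name from a magic comment such as "# coding=UTF8"
# or "# -*- encoding: utf-8 -*-".
MAGIC_COMMENT = /\A#.*coding[:=]\s*([\w.-]+)/

# Returns [path, name] pairs for files whose magic comment names an encoding
# that Encoding.find rejects on the current runtime.
# `suspect_magic_comments` is a hypothetical helper.
def suspect_magic_comments(paths)
  paths.each_with_object([]) do |path, hits|
    # Ruby only honors a magic comment on the first line, or on the second
    # line when the first is a shebang. Read as binary so that files with
    # unusual bytes do not break the regexp match.
    File.foreach(path, encoding: 'BINARY').first(2).each do |line|
      name = line[MAGIC_COMMENT, 1]
      next unless name
      begin
        Encoding.find(name)
      rescue ArgumentError
        hits << [path, name]
      end
    end
  end
end
```

To scan installed gems, something like `suspect_magic_comments(Dir.glob(File.join(Gem.dir, 'gems', '**', '*.rb')).select { |p| File.file?(p) })` should work, though it only covers the gem directory that `Gem.dir` reports.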
@HorizonShadow another possible source is the LANG environment variable, or Java somehow being invoked with -Dfile.encoding=UTF8.
To detect the latter we can look for file.encoding if you add these two lines right before where it crashes:
require 'pp'
pp java.lang.System.properties.to_hash
If you are ok posting that then I can see if anything besides 'file.encoding' looks off.
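If dumping every system property is too noisy, a narrower check of the usual suspects might look like the sketch below. The helper name `encoding_sources` is invented here, and the choice of settings to report is my guess at what matters. The Java call is guarded so the script also runs on MRI.

```ruby
# Collects the settings most likely to carry a default encoding name.
# `encoding_sources` is a hypothetical helper name.
def encoding_sources
  sources = {
    'LANG' => ENV['LANG'],
    'LC_ALL' => ENV['LC_ALL'],
    'JAVA_OPTS' => ENV['JAVA_OPTS'],
    'JRUBY_OPTS' => ENV['JRUBY_OPTS'],
    'Encoding.default_external' => Encoding.default_external.name
  }
  # JVM system properties only exist on JRuby, so guard the Java call.
  if RUBY_PLATFORM == 'java'
    require 'java'
    sources['file.encoding'] = java.lang.System.get_property('file.encoding')
  end
  sources
end

encoding_sources.each { |key, value| puts "#{key}: #{value.inspect}" }
```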
My motivation for trying to understand the source of this is that if I see it is more widespread than what happened in #4546, I may end up trying to get MRI to allow "UTF8" as a valid encoding name. We like to stay as compatible with Ruby as we can, so my desire is to not add non-standard behavior. It is frustrating that Java accepts UTF8 as a string, as I do not think it is a valid encoding name by Unicode standards.
Having the same problem so I'll pick up the baton.
The output from pp java.lang.System.properties.to_hash
is at
https://github.com/rcrews/chapter2/blob/master/issue23.md
You can download and try the code yourself at
https://github.com/rcrews/chapter2
I'm running the RVM environment with JRuby 9.2.5.0 and gemset "chapter2".
To see the UTF8 error, run the following:
bundle exec rackup
The specific location of the failure is when creating the first SQLite database table at https://github.com/rcrews/chapter2/blob/master/lib/chapter2.rb#L16
@rcrews thanks for the repro...after going through the bowels of do/dm I see: https://github.com/datamapper/do/blob/master/do_jdbc/src/main/java/data_objects/drivers/AbstractDriverDefinition.java#L414
You can see it is asking jcodings for an encoding we have never supported (e.g. this issue) so I don't think this ever actually worked. Looking at blame on that line it was added in 2011 circa 1.9.2 where people were starting to be confident in 1.9 and m17n.
So the question of how in the original report is finally answered, but what to do with it? It is code which could never have worked on JRuby. It is not out of the question that we add it as an alias, but that would introduce a mild Ruby incompatibility. @lopex what do you think about the pains of adding UTF8? I have probably spent about 45 minutes figuring this out, and Java can conceivably pass this over the wall to us. Adding an alias takes a couple of minutes, but I don't know what weirdness may occur from any reflective methods with Ruby Encoding...
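For code that receives a charset name from the Java side, one possible stopgap is to translate it before asking Ruby for the encoding. This is only a sketch of that idea, not what jcodings or do_jdbc actually does. `JAVA_TO_RUBY_ENCODING` and `ruby_encoding_for` are hypothetical names, and the table is deliberately incomplete.

```ruby
# Java's historical charset names mapped to names Ruby accepts.
# Illustrative only; extend as needed.
JAVA_TO_RUBY_ENCODING = {
  'UTF8' => 'UTF-8',
  'ISO8859_1' => 'ISO-8859-1'
}.freeze

# Looks up a Ruby Encoding for a name that may be in Java's style.
# Names not in the table are passed through unchanged, so an unknown
# name still raises ArgumentError from Encoding.find.
# `ruby_encoding_for` is a hypothetical helper.
def ruby_encoding_for(java_name)
  Encoding.find(JAVA_TO_RUBY_ENCODING.fetch(java_name, java_name))
end

puts ruby_encoding_for('UTF8').name
```

Keeping the translation on the caller's side leaves Ruby's own set of encoding names untouched, which avoids the compatibility question at the cost of every caller needing the table.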
@enebo yeah we can do that, now we don't use reflection for known encoding loads anyway.
@lopex we are encountering the same issue. Can you report any status regarding this fix? Thanks in advance for your effort.
Apparently, changes that were responsible for switch generation (to avoid reflection) broke that load method. We should deprecate this API actually.
jcodings-1.0.44 is released
I think this is the right place to report this.
I'm getting an error in jruby that appears to be coming from here. I'm at least moderately sure the UTF8 encoding class exists.
I'm using jruby 9.2.
The stack trace for the jruby bit is:
Calling code: https://github.com/datamapper/dm-do-adapter/blob/master/lib/dm-do-adapter/adapter.rb#L304
I'm not sure what more information I can provide, I don't really know how jruby works.