mbj closed this issue 8 years ago
One thing which has come out of this issue: ruby/ruby@dfca38e
An aside: getting up close and personal with Ruby like this makes it less and less interesting to me as a language for general development. For little scripts which process text files and such (AWK/Perl replacement), great. For little Rack apps which serve up 1 or 2 dynamic pages, great. For bigger projects, not great. Too messy.
> An aside: getting up close and personal with Ruby like this makes it less and less interesting to me as a language for general development.
Same here.
> For little Rack apps which serve up 1 or 2 dynamic pages, great. For bigger projects, not great. Too messy.
Same feelings, but:
There is a market of clients that already have "big" Ruby projects and cannot switch away from the language quickly. As we know, the "big rewrite" always fails. Most of my commercial time is spent on such clients: first fixing their Ruby to be less painful to manage at scale (which IMO breaks with a lot of typical Ruby mantras), and then, when needed (i.e. when the ROI is close enough in time for the business targets), slowly migrating away.
Sorry for hijacking this thread with that post.
@whitequark says that "Tooling does not want to deal with... ASCII-8BIT", but if the above paragraph is true, there is no reason why tooling wouldn't want ASCII-8BIT. Tooling doesn't want BINARY, sure.
Hence I think tooling should not have to choose at all, and parser should simply raise its own exception, because it does not support this case. This is easiest for tooling: explicitly saying "no" to one in 10k files is better than failing with a random exception or, worse, having a switch that only affects this specific file.
Subsets are fine, supersets (bugs) are not.
I decided to reject such literals by default. If downstream tooling actually wants to handle them, it can opt-in by using a custom AST builder.
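For readers who want to see what "reject by default" could look like, here is a minimal sketch in plain Ruby. It does not use parser's actual API: `InvalidLiteralEncoding` and `checked_literal` are hypothetical names made up for this example, and the UTF-8 default is an assumption about the file's source encoding.

```ruby
# Hypothetical error class for this sketch; not part of any real library.
class InvalidLiteralEncoding < StandardError; end

# Hypothetical helper: tag the bytes of a decoded string literal with the
# file's source encoding and refuse them if they are not valid in it.
def checked_literal(value, source_encoding = Encoding::UTF_8)
  # force_encoding only relabels the bytes; it never transcodes them.
  candidate = value.dup.force_encoding(source_encoding)
  unless candidate.valid_encoding?
    raise InvalidLiteralEncoding,
          "literal contains bytes that are invalid in #{source_encoding}"
  end
  candidate
end

# Valid UTF-8 bytes pass through and come back tagged as UTF-8.
checked_literal("caf\xC3\xA9".b)

# A lone lead byte is rejected with a dedicated error instead of
# surfacing later as an unrelated exception.
begin
  checked_literal("\xC3".b)
rescue InvalidLiteralEncoding => e
  warn e.message
end
```

A tool that really does want such literals would swap in its own handling at this point, which is the role the custom AST builder plays in the comment above.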
> I decided to reject such literals by default. If downstream tooling actually wants to handle them, it can opt-in by using a custom AST builder.
+1
> Tooling does not want to deal with non-ASCII-compatible (US-ASCII-compatible in Ruby parlance, not ASCII-8BIT which is an extension of ASCII) encodings, so we do not emit that.
I saw this misconception a lot in the thread and wanted to comment, even though it's about 2 years old by now.
In Ruby, ASCII-8BIT actually means "I have no idea, but it's probably text". You cannot do anything useful with it other than declare what encoding it actually is in and hope that's right.
I really hate that name for it. It confuses everyone.
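A small sketch of the "declare what it actually is and hope that's right" point, using only core `String` and `Encoding` methods. The sample bytes are an assumption chosen for illustration (the UTF-8 encoding of "café").

```ruby
# Raw bytes with no known encoding, as File.binread or a socket would give.
bytes = "caf\xC3\xA9".b

bytes.encoding == Encoding::ASCII_8BIT    # => true
Encoding::ASCII_8BIT == Encoding::BINARY  # => true, two names for one encoding
bytes.encoding.ascii_compatible?          # => true
Encoding::UTF_16LE.ascii_compatible?      # => false

# Declaring the encoding relabels the same five bytes; nothing is converted.
text = bytes.dup.force_encoding(Encoding::UTF_8)
text.valid_encoding?  # => true
text.length           # => 4 characters
bytes.length          # => 5 bytes

# A wrong declaration is not detected when every byte is valid in the
# guessed encoding: the result is silently mojibake.
latin1 = bytes.dup.force_encoding(Encoding::ISO_8859_1)
latin1.valid_encoding?          # => true
latin1.encode(Encoding::UTF_8)  # => "cafÃ©"
```

`valid_encoding?` only catches guesses that produce impossible byte sequences, which is why the declaration remains a matter of hoping it is right.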
Huh, TIL. Thanks.
The following file (reduced from: https://github.com/ruby/spec/blob/master/core/symbol/casecmp_spec.rb) crashes parser:
Backtrace:
Ruby 2.3.0-p0 accepts it and prints: