Shopify / ruby-lsp

An opinionated language server for Ruby
https://shopify.github.io/ruby-lsp/
MIT License

Error while indexing: invalid byte sequence in UTF-8 #1312

Closed jason-o-matic closed 9 months ago

jason-o-matic commented 10 months ago

Ruby version

3.0.6

Code snippet

No response

Description

I get a notification in VSCode with this error message and "Source: Ruby LSP (Extension)", but no further information. I assume there's some file in my project that has weird characters, but the error message doesn't indicate the file with issues, so I'm not sure where to look.

I think the bug here is that this error message should be more helpful and it should reference whichever files had the encoding issue.

Expected output

An error message with more information about what exactly the problem was, which file it was in, perhaps a line number and column, and even the problematic string.
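As a rough illustration of the kind of message requested here, the sketch below wraps per-file work in a rescue and records which file failed. `index_with_context` is a hypothetical helper written for this example, not part of ruby-lsp, and the block passed to it stands in for whatever the indexer does with each file.

```ruby
# Hypothetical helper (not ruby-lsp API): yields each file's path and source,
# and collects a message naming the file whenever the block raises an
# encoding-related error.
def index_with_context(paths)
  errors = []
  paths.each do |path|
    source = File.read(path, encoding: "UTF-8")
    yield(path, source)
  rescue ArgumentError, EncodingError => e
    # Include the path so the user knows where to look.
    errors << "Error while indexing #{path}: #{e.message}"
  end
  errors
end
```

A caller could surface the returned messages in the notification instead of the bare exception text.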

vinistock commented 10 months ago

Thank you for the bug report! It's possible that we're just surfacing the error messages coming from Prism (the parser).

If you run BUNDLE_GEMFILE=.ruby-lsp/Gemfile bundle exec ruby-lsp-doctor on the command line, it will run indexing and print each file as it goes. The last file printed before the error should be the problematic one.

If you can, please share what the piece of problematic code is.
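Another way to narrow it down, independent of ruby-lsp-doctor, is to scan a directory for Ruby files whose bytes are not valid UTF-8. This is only a sketch: `files_with_invalid_utf8` is a made-up name, and it ignores magic encoding comments, so a file that correctly declares another encoding would still be listed.

```ruby
# Hypothetical helper: returns the .rb files under `root` whose raw bytes
# are not a valid UTF-8 sequence.
def files_with_invalid_utf8(root)
  Dir.glob(File.join(root, "**", "*.rb")).sort.reject do |path|
    # Read raw bytes, then ask whether they form valid UTF-8.
    File.binread(path).force_encoding(Encoding::UTF_8).valid_encoding?
  end
end
```

For example, `puts files_with_invalid_utf8(".")` prints one suspect path per line.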

jason-o-matic commented 9 months ago

Here's the last file and the error message:

indexing: /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/bundler/gems/color-tools-53138ca93a20/lib/color/hsl.rb
/Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:307:in `match?': invalid byte sequence in UTF-8 (ArgumentError)
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:307:in `block in collect_comments'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:302:in `downto'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:302:in `collect_comments'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:968:in `bind_call'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:968:in `block in create_validator_method_medium1'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:161:in `handle_def_node'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:687:in `bind_call'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:687:in `block in create_validator_procedure_fast1'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/lib/ruby_indexer/lib/ruby_indexer/collector.rb:67:in `collect'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:687:in `bind_call'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/sorbet-runtime-0.5.11193/lib/types/private/methods/call_validation_2_7.rb:687:in `block in create_validator_procedure_fast1'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/exe/ruby-lsp-doctor:14:in `block in <top (required)>'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/exe/ruby-lsp-doctor:9:in `each'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/ruby-lsp-0.13.4/exe/ruby-lsp-doctor:9:in `<top (required)>'
    from /Users/jason/.rbenv/versions/3.0.6/bin/ruby-lsp-doctor:25:in `load'
    from /Users/jason/.rbenv/versions/3.0.6/bin/ruby-lsp-doctor:25:in `<top (required)>'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli/exec.rb:58:in `load'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli/exec.rb:58:in `kernel_load'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli/exec.rb:23:in `run'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:492:in `exec'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:34:in `dispatch'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:28:in `start'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/exe/bundle:45:in `block in <top (required)>'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/lib/bundler/friendly_errors.rb:117:in `with_friendly_errors'
    from /Users/jason/.rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/bundler-2.4.13/exe/bundle:33:in `<top (required)>'
    from bin/bundle:3:in `load'
    from bin/bundle:3:in `<main>'

And here's the contents of that file:

#--
# Colour management with Ruby.
#
# Copyright 2005 Austin Ziegler
#   http://rubyforge.org/ruby-pdf/
#
#   Licensed under a MIT-style licence.
#
# $Id$
#++

# An HSL colour object. Internally, the hue (#h), saturation (#s), and
# luminosity (#l) values are dealt with as fractional values in the range
# 0..1.
class Color::HSL
  class << self
    # Creates an HSL colour object from fractional values 0..1.
    def from_fraction(h = 0.0, s = 0.0, l = 0.0)
      colour = Color::HSL.new
      colour.h = h
      colour.s = s
      colour.l = l
      colour
    end
  end

  # Compares the other colour to this one. The other colour will be
  # converted to HSL before comparison, so the comparison between a HSL
  # colour and a non-HSL colour will be approximate and based on the other
  # colour's #to_hsl conversion. If there is no #to_hsl conversion, this
  # will raise an exception. This will report that two HSL values are
  # equivalent if all component values are within 1e-4 (0.0001) of each
  # other.
  def ==(other)
    other = other.to_hsl
    other.kind_of?(Color::HSL) and
    ((@h - other.h).abs <= 1e-4) and
    ((@s - other.s).abs <= 1e-4) and
    ((@l - other.l).abs <= 1e-4)
  end

  # Creates an HSL colour object from the standard values of degrees and
  # percentages (e.g., 145∫, 30%, 50%).
  def initialize(h = 0, s = 0, l = 0)
    @h = h / 360.0
    @s = s / 100.0
    @l = l / 100.0
  end

  # Present the colour as an HTML/CSS colour string.
  def html
    to_rgb.html
  end

  # Converting to HSL as adapted from Foley and Van-Dam from
  # http://www.bobpowell.net/RGBHSB.htm.
  def to_rgb(ignored = nil)
    # If luminosity is zero, the colour is always black.
    return Color::RGB.new if @l == 0
    # If luminosity is one, the colour is always white.
    return Color::RGB.new(0xff, 0xff, 0xff) if @l == 1
    # If saturation is zero, the colour is always a greyscale colour.
    return Color::RGB.new(@l, @l, @l) if @s <= 1e-5

    if (@l - 0.5) < 1e-5
      tmp2 = @l * (1.0 + @s.to_f)
    else
      tmp2 = @l + @s - (@l * @s.to_f)
    end
    tmp1 = 2.0 * @l - tmp2

    t3  = [ @h + 1.0 / 3.0, @h, @h - 1.0 / 3.0 ]
    t3 = t3.map { |tmp3|
      tmp3 += 1.0 if tmp3 < 1e-5
      tmp3 -= 1.0 if (tmp3 - 1.0) > 1e-5
      tmp3
    }

    rgb = t3.map do |tmp3|
      if ((6.0 * tmp3) - 1.0) < 1e-5
        tmp1 + ((tmp2 - tmp1) * tmp3 * 6.0)
      elsif ((2.0 * tmp3) - 1.0) < 1e-5
        tmp2
      elsif ((3.0 * tmp3) - 2.0) < 1e-5
        tmp1 + (tmp2 - tmp1) * ((2 / 3.0) - tmp3) * 6.0
      else
        tmp1
      end
    end

    Color::RGB.from_fraction(*rgb)
  end

  # Converts to RGB then YIQ.
  def to_yiq
    to_rgb.to_yiq
  end

  # Converts to RGB then CMYK.
  def to_cmyk
    to_rgb.to_cmyk
  end

  # Returns the luminosity (#l) of the colour.
  def brightness
    @l
  end

  def to_greyscale
    Color::GrayScale.from_fraction(@l)
  end

  alias to_grayscale to_greyscale

  attr_reader :h, :s, :l

  def h=(hh) #:nodoc:
    hh = 1.0 if hh > 1
    hh = 0.0 if hh < 0
    @h = hh
  end

  def s=(ss) #:nodoc:
    ss = 1.0 if ss > 1
    ss = 0.0 if ss < 0
    @s = ss
  end

  def l=(ll) #:nodoc:
    ll = 1.0 if ll > 1
    ll = 0.0 if ll < 0
    @l = ll
  end
end

It looks to me like the issue is the character right after "percentages (e.g., 145" in the comment above initialize.

andyw8 commented 9 months ago

Thanks for the report. It may be related to the file encoding:

% file -I lib/color/hsl.rb  
lib/color/hsl.rb: text/x-ruby; charset=iso-8859-1

I'll continue to investigate, but I'm curious, where is this dependency coming from in your Gemfile.lock? It's a very old library (last release was 2005).

andyw8 commented 9 months ago

~And that is an invalid character for that encoding: https://en.wikipedia.org/wiki/ISO/IEC_8859-1~

Edit: The above may be wrong:

irb(main):002> File.read("lib/color/hsl.rb").force_encoding("ISO-8859-1").valid_encoding?
=> true
irb(main):003> File.read("lib/color/hsl.rb").force_encoding("UTF-8").valid_encoding?
=> false

andyw8 commented 9 months ago

(I think the author actually intended that to appear as a degree symbol, as used in HSB color encoding, but in 8859-1 encoding it actually maps to the very similar ordinal indicator)
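To make the mapping concrete, here is a small sketch. It assumes the offending byte is 0xBA; the thread never states the byte value, so treat that as a guess based on the "ordinal indicator" remark.

```ruby
# Assumption: the byte in hsl.rb is 0xBA (not confirmed in this thread).
byte = "\xBA".b

# Interpreted as ISO-8859-1 and transcoded to UTF-8, it becomes U+00BA,
# the masculine ordinal indicator.
as_latin1 = byte.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)

# Interpreted as UTF-8, a lone 0xBA is not a valid sequence.
valid_as_utf8 = byte.dup.force_encoding(Encoding::UTF_8).valid_encoding?

# The degree sign (U+00B0) is a different byte in ISO-8859-1.
degree_in_latin1 = "\u00B0".encode(Encoding::ISO_8859_1).bytes

p as_latin1.codepoints   # [186], i.e. U+00BA
p valid_as_utf8          # false
p degree_in_latin1       # [176], i.e. 0xB0
```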

andyw8 commented 9 months ago

Note that if we specify the encoding then it doesn't raise:

irb(main):007> File.read("lib/color/hsl.rb").match?(/foo/)
(irb):7:in `match?': invalid byte sequence in UTF-8 (ArgumentError)
irb(main):008> File.read("lib/color/hsl.rb",  encoding: "iso-8859-1").match?(/foo/)
=> false
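Building on that, one possible approach (a sketch only, not necessarily what ruby-lsp ended up doing) is to read the file as UTF-8 and fall back to another encoding when the bytes are invalid. `read_source` and its `fallback:` parameter are hypothetical names, and the ISO-8859-1 default is an assumption that happens to fit this file.

```ruby
# Hypothetical helper: returns the file's contents as a valid UTF-8 string.
# If the raw bytes are not valid UTF-8, they are reinterpreted in the
# fallback encoding and transcoded to UTF-8.
def read_source(path, fallback: Encoding::ISO_8859_1)
  source = File.binread(path).force_encoding(Encoding::UTF_8)
  return source if source.valid_encoding?

  source.force_encoding(fallback).encode(Encoding::UTF_8)
end
```

With this, `read_source("lib/color/hsl.rb").match?(/foo/)` would return false rather than raise, since the string handed to the regex is always valid UTF-8. The trade-off is that a wrong fallback guess silently produces wrong characters instead of an error.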