postrank-labs / postrank-uri

URI normalization, c14n, escaping, and extraction
MIT License
301 stars 52 forks

Tests fail under Ruby 3.0.0+ #45

Closed by ryanfb 1 year ago

ryanfb commented 3 years ago

Example output for bundle exec rake on Ruby 3.0.2:

/Users/ryan/.rbenv/versions/3.0.2/bin/ruby -I/Users/ryan/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/rspec-core-3.10.1/lib:/Users/ryan/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/rspec-support-3.10.2/lib /Users/ryan/.rbenv/versions/3.0.2/lib/ruby/gems/3.0.0/gems/rspec-core-3.10.1/exe/rspec --pattern spec/\*\*\{,/\*/\*\*\}/\*_spec.rb
.................................FFFFFFFFF....FFFFFFFFFFFF.F.F.....FFFF

Failures:

  1) PostRank::URI extract extracts twitter links with hashbangs
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:255:in `block (3 levels) in <top (required)>'

  2) PostRank::URI extract extracts mobile twitter links with hashbangs
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:259:in `block (3 levels) in <top (required)>'

  3) PostRank::URI extract handles a URL that comes after text without a space
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:263:in `block (3 levels) in <top (required)>'

  4) PostRank::URI extract does not pick up anything on or after the first . in the path of a URL with a shortener domain
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:270:in `block (3 levels) in <top (required)>'

  5) PostRank::URI extract picks up urls without protocol
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:274:in `block (3 levels) in <top (required)>'

  6) PostRank::URI extract picks up urls inside tags
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:280:in `block (3 levels) in <top (required)>'

  7) PostRank::URI extract TLDs does not pick up bad grammar as a domain name and think it has a link
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:246:in `block (4 levels) in <top (required)>'

  8) PostRank::URI extract TLDs does not pickup bad TLDS
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:250:in `block (4 levels) in <top (required)>'

  9) PostRank::URI extract multibyte characters stops extracting URLs at the full-width CJK space character
     Failure/Error: if PublicSuffix.valid?(domain, default_rule: nil)

     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # ./lib/postrank-uri.rb:102:in `block in extract'
     # ./lib/postrank-uri.rb:100:in `scan'
     # ./lib/postrank-uri.rb:100:in `extract'
     # ./spec/postrank-uri_spec.rb:241:in `e'
     # ./spec/postrank-uri_spec.rb:286:in `block (4 levels) in <top (required)>'

  10) PostRank::URI href extract domain extraction extracts "example.com" from http://alex.pages.example.com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  11) PostRank::URI href extract domain extraction extracts "example.com" from alex.pages.example.com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  12) PostRank::URI href extract domain extraction extracts "example.com" from http://example.com/2011/04/01/blah
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  13) PostRank::URI href extract domain extraction extracts "example.com" from http://example.com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  14) PostRank::URI href extract domain extraction extracts "example.com" from example.com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  15) PostRank::URI href extract domain extraction extracts "example.com" from ExampLe.com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  16) PostRank::URI href extract domain extraction extracts "example.com" from ExampLe.com:3000
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  17) PostRank::URI href extract domain extraction extracts "example.com" from http://alex.pages.example.COM
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  18) PostRank::URI href extract domain extraction extracts "example.ag.it" from http://www.example.ag.it/2011/04/01/blah
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  19) PostRank::URI href extract domain extraction extracts "example.com" from ftp://www.example.com/2011/04/01/blah
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  20) PostRank::URI href extract domain extraction extracts nil from http://com
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  21) PostRank::URI href extract domain extraction extracts nil from http://alex.pages.examplecom
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  22) PostRank::URI href extract domain extraction extracts nil from http://127.0.0.1
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  23) PostRank::URI href extract domain extraction extracts "hello-there.com" from hello-there.com/you
      Failure/Error: (host && PublicSuffix.valid?(host, default_rule: nil)) ? PublicSuffix.parse(host).domain : nil

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:12:in `domain'
      # ./spec/postrank-uri_spec.rb:346:in `block (5 levels) in <top (required)>'

  24) PostRank::URI valid? marks www.test.c as invalid
      Failure/Error: is_valid = PublicSuffix.valid?(Addressable::IDNA.to_unicode(host), default_rule: nil)

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:234:in `valid?'
      # ./spec/postrank-uri_spec.rb:376:in `block (3 levels) in <top (required)>'

  25) PostRank::URI valid? marks www.test.com as valid
      Failure/Error: is_valid = PublicSuffix.valid?(Addressable::IDNA.to_unicode(host), default_rule: nil)

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:234:in `valid?'
      # ./spec/postrank-uri_spec.rb:380:in `block (3 levels) in <top (required)>'

  26) PostRank::URI valid? marks Unicode domain as valid (NOTE: works only with a scheme)
      Failure/Error: is_valid = PublicSuffix.valid?(Addressable::IDNA.to_unicode(host), default_rule: nil)

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:234:in `valid?'
      # ./spec/postrank-uri_spec.rb:384:in `block (3 levels) in <top (required)>'

  27) PostRank::URI valid? marks punycode domain domain as valid
      Failure/Error: is_valid = PublicSuffix.valid?(Addressable::IDNA.to_unicode(host), default_rule: nil)

      ArgumentError:
        wrong number of arguments (given 2, expected 1)
      # ./lib/postrank-uri.rb:234:in `valid?'
      # ./spec/postrank-uri_spec.rb:388:in `block (3 levels) in <top (required)>'

Finished in 0.08885 seconds (files took 0.30796 seconds to load)
71 examples, 27 failures

Failed examples:

rspec ./spec/postrank-uri_spec.rb:254 # PostRank::URI extract extracts twitter links with hashbangs
rspec ./spec/postrank-uri_spec.rb:258 # PostRank::URI extract extracts mobile twitter links with hashbangs
rspec ./spec/postrank-uri_spec.rb:262 # PostRank::URI extract handles a URL that comes after text without a space
rspec ./spec/postrank-uri_spec.rb:269 # PostRank::URI extract does not pick up anything on or after the first . in the path of a URL with a shortener domain
rspec ./spec/postrank-uri_spec.rb:273 # PostRank::URI extract picks up urls without protocol
rspec ./spec/postrank-uri_spec.rb:279 # PostRank::URI extract picks up urls inside tags
rspec ./spec/postrank-uri_spec.rb:245 # PostRank::URI extract TLDs does not pick up bad grammar as a domain name and think it has a link
rspec ./spec/postrank-uri_spec.rb:249 # PostRank::URI extract TLDs does not pickup bad TLDS
rspec ./spec/postrank-uri_spec.rb:285 # PostRank::URI extract multibyte characters stops extracting URLs at the full-width CJK space character
rspec './spec/postrank-uri_spec.rb[1:8:4:1]' # PostRank::URI href extract domain extraction extracts "example.com" from http://alex.pages.example.com
rspec './spec/postrank-uri_spec.rb[1:8:4:2]' # PostRank::URI href extract domain extraction extracts "example.com" from alex.pages.example.com
rspec './spec/postrank-uri_spec.rb[1:8:4:3]' # PostRank::URI href extract domain extraction extracts "example.com" from http://example.com/2011/04/01/blah
rspec './spec/postrank-uri_spec.rb[1:8:4:4]' # PostRank::URI href extract domain extraction extracts "example.com" from http://example.com
rspec './spec/postrank-uri_spec.rb[1:8:4:5]' # PostRank::URI href extract domain extraction extracts "example.com" from example.com
rspec './spec/postrank-uri_spec.rb[1:8:4:6]' # PostRank::URI href extract domain extraction extracts "example.com" from ExampLe.com
rspec './spec/postrank-uri_spec.rb[1:8:4:7]' # PostRank::URI href extract domain extraction extracts "example.com" from ExampLe.com:3000
rspec './spec/postrank-uri_spec.rb[1:8:4:8]' # PostRank::URI href extract domain extraction extracts "example.com" from http://alex.pages.example.COM
rspec './spec/postrank-uri_spec.rb[1:8:4:9]' # PostRank::URI href extract domain extraction extracts "example.ag.it" from http://www.example.ag.it/2011/04/01/blah
rspec './spec/postrank-uri_spec.rb[1:8:4:10]' # PostRank::URI href extract domain extraction extracts "example.com" from ftp://www.example.com/2011/04/01/blah
rspec './spec/postrank-uri_spec.rb[1:8:4:11]' # PostRank::URI href extract domain extraction extracts nil from http://com
rspec './spec/postrank-uri_spec.rb[1:8:4:12]' # PostRank::URI href extract domain extraction extracts nil from http://alex.pages.examplecom
rspec './spec/postrank-uri_spec.rb[1:8:4:14]' # PostRank::URI href extract domain extraction extracts nil from http://127.0.0.1
rspec './spec/postrank-uri_spec.rb[1:8:4:16]' # PostRank::URI href extract domain extraction extracts "hello-there.com" from hello-there.com/you
rspec ./spec/postrank-uri_spec.rb:375 # PostRank::URI valid? marks www.test.c as invalid
rspec ./spec/postrank-uri_spec.rb:379 # PostRank::URI valid? marks www.test.com as valid
rspec ./spec/postrank-uri_spec.rb:383 # PostRank::URI valid? marks Unicode domain as valid (NOTE: works only with a scheme)
rspec ./spec/postrank-uri_spec.rb:387 # PostRank::URI valid? marks punycode domain domain as valid

Everything still passes under Ruby 2.7.4, so this only affects 3.0.0+.
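
For anyone else hitting this: the `wrong number of arguments (given 2, expected 1)` message on a call that passes `default_rule: nil` looks like what Ruby 3.0's separation of positional and keyword arguments produces. That is an assumption on my part, not something confirmed from the public_suffix source. Here is a minimal sketch of the pattern. `SuffixDemo`, `lookup`, `valid_old?`, `valid_new?` and `try_call` are hypothetical names made up for illustration and are not the real PublicSuffix code.

```ruby
# Hypothetical stand-in module; NOT the real PublicSuffix implementation.
module SuffixDemo
  # Accepts its option as a keyword only, like the `default_rule:` in the failing calls.
  def self.lookup(name, default_rule: :any)
    [name, default_rule]
  end

  # Ruby 2 idiom: collect options in a hash and forward it positionally.
  # Ruby 2.7 converts the hash to keywords (with a warning); Ruby 3.0+ does not.
  def self.valid_old?(name, options = {})
    lookup(name, options)
    true
  end

  # Ruby 3-safe variant: forward the options explicitly as keywords.
  def self.valid_new?(name, **options)
    lookup(name, **options)
    true
  end
end

# Runs the block and returns its value, or the ArgumentError message if one is raised.
def try_call
  yield
rescue ArgumentError => e
  e.message
end

old_result = try_call { SuffixDemo.valid_old?("example.com", default_rule: nil) }
new_result = try_call { SuffixDemo.valid_new?("example.com", default_rule: nil) }

puts "old style: #{old_result.inspect}"
puts "new style: #{new_result.inspect}"
```

On Ruby 3.0+ the old-style call should report the same kind of ArgumentError as the log above, while the `**options` variant returns `true` on both 2.7 and 3.x.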

nehagupta93 commented 2 years ago

I fixed this in my Rails project by editing the gem versions in my Gemfile.lock and then running bundle install:

    postrank-uri (1.0.24)
      addressable (>= 2.4.0)
      nokogiri (>= 1.8.0)
-     public_suffix (>= 2.0.0, < 2.1)
+     public_suffix (>= 4.0.0, < 5)
-   public_suffix (2.0.5)
+   public_suffix (4.0.6)
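
To double-check which public_suffix versions each constraint in the diff above admits, something like the following can be run with plain RubyGems. It is only a sketch: the constant names and the `satisfies?` helper are made up for illustration, and it checks version constraints only, not whether a given version actually works on Ruby 3.

```ruby
require "rubygems"

# Constraints copied from the Gemfile.lock diff above.
OLD_CONSTRAINT = Gem::Requirement.new(">= 2.0.0", "< 2.1")
NEW_CONSTRAINT = Gem::Requirement.new(">= 4.0.0", "< 5")

# Hypothetical helper: true if the version string satisfies the requirement.
def satisfies?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

%w[2.0.5 4.0.6].each do |version|
  puts "public_suffix #{version}: " \
       "old=#{satisfies?(OLD_CONSTRAINT, version)} " \
       "new=#{satisfies?(NEW_CONSTRAINT, version)}"
end
```

With the two versions from the diff, 2.0.5 matches only the old constraint and 4.0.6 matches only the new one, so the resolved version cannot move to 4.x until the constraint itself changes.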