oneclick / rubyinstaller2

MSYS2 based RubyInstaller for Windows
https://rubyinstaller.org
BSD 3-Clause "New" or "Revised" License
646 stars 248 forks source link

Windows Ruby 3.1 net/http IPv6 HTTP(S) requests are much slower than Ruby 3.0 #264

Open johnnyshields opened 2 years ago

johnnyshields commented 2 years ago

What problems are you experiencing?

Steps to reproduce

require 'net/http'

def do_request(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    request = Net::HTTP::Get.new uri
    response = http.request request # <-- very slow for IPv6
  end
end

do_request('https://www.google.com') # <-- very slow for IPv6
do_request('https://global.jd.com') # <-- fast for IPv4

What's the output from ridk version?

---
ruby:
  path: C:/ruby31
  version: 3.1.1
  platform: x64-mingw-ucrt
  cc: gcc.exe (Rev9, Built by MSYS2 project) 11.2.0
ruby_installer:
  package_version: 3.1.1-1
  git_commit: d9d39f1
msys2:
  path: C:\ruby31\msys64
cc: gcc (Rev10, Built by MSYS2 project) 11.2.0
sh: GNU bash, version 5.1.16(1)-release (x86_64-pc-msys)
os: Microsoft Windows [Version 10.0.22000.556]

Original Ticket: Rubgems is slow on Ruby 3.1

Installed RubyInstaller 3.1.1-1. Running gem install and bundle install seems noticeably slower than on Ruby 3.0. Even gem update --system is very slow; takes several minutes before any log lines print at all.

Doing gem update --system (update to rubygems-3.3.9) doesn't seem to make any difference.

I ran some basic code benchmarks on irb and there wasn't a major performance difference between the two versions.

johnnyshields commented 2 years ago

@larskanis any thoughts on this?

johnnyshields commented 2 years ago

OK, I've determined that its due to HTTPS being slow. This script reproduces the issue

require 'net/http'

uri = URI('https://www.google.com')

Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
  request = Net::HTTP::Get.new uri
  response = http.request request # very slow
end

Edited ticket description to reflect this.

johnnyshields commented 2 years ago

Tried seeing if it's DNS related, doesn't look to be. Here's a script to use google DNS

class Resolv
  def self.use_google_dns
    # dns
    nameservers = ['8.8.8.8', '8.8.4.4']
    dns = Resolv::DNS.new(nameserver: nameservers, search: ['mydns.com'], ndots: 1)

    # resolver
    hosts = Resolv::Hosts.new()
    resolver = new([hosts, dns])

    # replace constant
    remove_const :DefaultResolver
    const_set :DefaultResolver, resolver
  end
end

# replace
require 'resolv-replace'

# use google's DNS servers
Resolv.use_google_dns
johnnyshields commented 2 years ago

@larskanis this seems IPv6 related. The following URL which only supports IPv4 is very fast:

require 'net/http'

uri = URI('https://global.jd.com')

Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
  request = Net::HTTP::Get.new uri
  response = http.request request # <-- fast!
end

Adding to description.

MSP-Greg commented 2 years ago

@johnnyshields I tried, but I'm not seeing the same issue:

ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x64-mingw-ucrt]

https://www.google.com
body bytesize 14049
response time  0.145 sec
#<Addrinfo: [2607:f8b0:4009:807::2004]:443 TCP>

https://msp-greg.github.io
body bytesize 22161
response time  0.045 sec
#<Addrinfo: [2606:50c0:8000::153]:443 TCP>

https://global.jd.com
body bytesize 20592
response time  0.190 sec
#<Addrinfo: 157.185.170.144:443 TCP>

Code:

require 'uri'
require 'net/http'
require 'openssl'

module HTTPS_IPv4_IPv6
  class << self

    def run(uri_str)
      uri = URI(uri_str)

      puts uri_str

      t_st = Process.clock_gettime(Process::CLOCK_MONOTONIC)

      Net::HTTP.start(uri.host, uri.port, use_ssl: true, verify_mode: OpenSSL::SSL::VERIFY_PEER) do |http|
        req = Net::HTTP::Get.new uri.request_uri
        resp = http.request req
        case resp
        when Net::HTTPSuccess
          body = resp.body
          t_end = Process.clock_gettime(Process::CLOCK_MONOTONIC)
          puts "body bytesize #{body.bytesize}"
          puts format("response time %6.3f sec", t_end - t_st)
          puts http.instance_variable_get(:@socket).io.to_io.remote_address.inspect, ''
        when Net::HTTPRedirection
          puts "#{resp.class}\n   Redirect: #{resp['location']}"
          run resp['location']
        else
          puts resp.class
        end
      end
    end

  end
end

puts '', RUBY_DESCRIPTION, ''

HTTPS_IPv4_IPv6.run 'https://www.google.com'

HTTPS_IPv4_IPv6.run 'https://msp-greg.github.io'

HTTPS_IPv4_IPv6.run 'https://global.jd.com'
johnnyshields commented 2 years ago

Thank you for this! Output I get from your script (* using OpenSSL::SSL::VERIFY_NONE)

ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x64-mingw-ucrt]

irb(main):079:0> HTTPS_IPv4_IPv6.run 'https://www.google.com'
https://www.google.com
body bytesize 15108
response time 60.162 sec
#<Addrinfo: 172.217.26.228:443 TCP>

irb(main):084:0> HTTPS_IPv4_IPv6.run 'https://msp-greg.github.io'
https://msp-greg.github.io
body bytesize 22161
response time 240.197 sec
#<Addrinfo: 185.199.110.153:443 TCP>

irb(main):081:0> HTTPS_IPv4_IPv6.run 'https://global.jd.com'
https://global.jd.com
body bytesize 20592
response time  0.384 sec
#<Addrinfo: 14.0.43.163:443 TCP>

The "60 sec + X milliseconds" in the google.com run (IPv6) is very curious. Verified it's always 60 sec. Must be something non-critical is timing out after 60 sec, need to investigate what is causing it.

Something is def strange because Ruby 3.0, 2.7, 2.6, etc. on my machine are all installed with Ruby Windows Installer and all of them work fine.

I'm in Tokyo, Japan, by the way, so maybe that has something to with it.

MSP-Greg commented 2 years ago

It is interesting that you're having issues. I first was using ruby master, and the times were similar.

Do you see any difference using ping -6 www.google.com and ping -4 www.google.com?

EDIT: It looks like your network isn't using IPv6, all the addresses are IPv4? Note that my output has IPv6 addresses for Google and GitHub, but IPv4 for global.jd.com.

johnnyshields commented 2 years ago

ping -6 www.google.com is timing out, -4 is working.

My hunch is that Ruby is failing IPv6 entirely and then falling back to IPv4 after 60 secs.

With Ruby 3.0, I get this--looks like it's also using IPv4, but not doing a long timeout for IPv6. So something is causing Ruby 3.1 to wait for up to 60 sec for the IPv6 timeout.

ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x64-mingw32]

irb(main):040:0> HTTPS_IPv4_IPv6.run 'https://www.google.com'
https://www.google.com
body bytesize 15110
response time  2.390 sec
#<Addrinfo: 142.251.42.196:443 TCP>

irb(main):042:0> HTTPS_IPv4_IPv6.run 'https://msp-greg.github.io'
https://msp-greg.github.io
body bytesize 22161
response time  0.207 sec
#<Addrinfo: 185.199.110.153:443 TCP>

irb(main):044:0> HTTPS_IPv4_IPv6.run 'https://global.jd.com'
https://global.jd.com
body bytesize 20592
response time  0.633 sec
#<Addrinfo: 14.0.43.163:443 TCP>
MSP-Greg commented 2 years ago

So something is causing Ruby 3.1 to wait for up to 60 sec for the IPv6 timeout.

Can you test on any other platforms? Not sure if it's a Windows issue. Since my network is different, not sure if I can repo what you're seeing...

I've got WSL2/Ubuntu, both 20.04 and 22.04, but it's only running IPv4. I guess I should figure out how to change that...

johnnyshields commented 2 years ago

Yes, I'll check investigate more in next few days. Based on the info we have, my hypothesis is:

johnnyshields commented 2 years ago

Confirmed that Ruby 3.1 on WSL Linux (same machine) does not have this issue. As per above it is using IPv4 only, as expected since it's the same machine.

ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]

3.1.2 :040 > HTTPS_IPv4_IPv6.run 'https://www.google.com'
https://www.google.com
body bytesize 15083
response time  0.141 sec
#<Addrinfo: 142.251.42.196:443 TCP>

3.1.2 :042 > HTTPS_IPv4_IPv6.run 'https://msp-greg.github.io'
https://msp-greg.github.io
body bytesize 22161
response time  0.188 sec
#<Addrinfo: 185.199.110.153:443 TCP>

3.1.2 :044 > HTTPS_IPv4_IPv6.run 'https://global.jd.com'
https://global.jd.com
body bytesize 20592
response time  1.649 sec
#<Addrinfo: 14.0.43.163:443 TCP>
MSP-Greg commented 2 years ago

Question - in your network adapter properties, is IPv6 checked/enabled? If so, and if you uncheck it, do the Windows connection times drop?

JFYI, haven't gotten IPv6 working with WSL2, not even sure it's possible..

johnnyshields commented 2 years ago

Disabling IPv6 on adapter does not seem to have any effect--I see the 60s latency either way.

MSP-Greg commented 2 years ago

Interesting. JFYI, I tried the most recent releases back to 2.5, and all connected to the IPv6 addresses.

deivid-rodriguez commented 2 years ago

I'm surprised that setting IPV4_FALLBACK_ENABLED does not fix the problem since the symptoms presented here are exactly what it was supposed to fix, if I recall correctly.

johnnyshields commented 2 years ago

@MSP-Greg here's a bit more pinpointing of the issue. It's happening in Socket.tcp, my hunch is that there's some difference in the native code that Socket uses between the two Ruby versions. AFAIK the socket timeout itself is set to 60 sec in both Ruby 3.0 and Ruby 3.1, the difference is that Ruby 3.0 somehow "short-circuits" that timeout immediately, while Ruby 3.1 waits until the end.

require 'net/http'

TIMEOUT = 5

# This code is where the issue happens
# s = Socket.tcp "google.com", 80, nil, nil, connect_timeout: TIMEOUT

# Demonstrates the issue inside Socket.tcp
puts '', RUBY_DESCRIPTION, ''
t = nil
Addrinfo.foreach("google.com", 80, nil, :STREAM) do |ai|
  puts ai.inspect
  t = Time.now
  ai.connect(timeout: TIMEOUT)
rescue SystemCallError
  puts "Failed after #{Time.now - t} sec", ''
end
puts "Success after #{Time.now - t} sec"
ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x64-mingw-ucrt]

#<Addrinfo: [2404:6800:4004:822::200e]:80 TCP (google.com)>
Failed after 5.0140926 sec

#<Addrinfo: 172.217.175.110:80 TCP (google.com)>
Success after 0.0454041 sec
johnnyshields commented 2 years ago

OK, think I have a clue.

Under the hood, Ruby 3.0 Net::HTTP uses:

      # net/http.rb line 985
      s = Timeout.timeout(@open_timeout, Net::OpenTimeout) {
        begin
          TCPSocket.open(conn_addr, conn_port, @local_host, @local_port)

While Ruby 3.1 Net::HTTP uses:

        # net/http.rb line 998
        s = Socket.tcp conn_addr, conn_port, @local_host, @local_port, connect_timeout: @open_timeout

For me, TCPSocket.open is fast on both versions, and Socket.tcp is slow on both versions.

Socket.tcp "google.com", 80, nil, nil, connect_timeout: 60

TCPSocket.open("google.com", 80, nil, nil)
johnnyshields commented 2 years ago

And here is the change that did it: https://github.com/ruby/net-http/pull/10

johnnyshields commented 2 years ago

Issue filed with Net::HTTP: https://github.com/ruby/net-http/issues/57

johnnyshields commented 2 years ago

Well, at least I can monkey-patch in the old Net::HTTP#connect method for now!

deivid-rodriguez commented 2 years ago

This is interesting for RubyGems, because we're monkey-patching TCPSocket#initialize to fix similar issues and I think net-http using Socket.tcp directly now means the monkey patch no longer has the desired effect... 🤔

MSP-Greg commented 2 years ago

IDK. With a network (including ISP) fully supporting IPv6, there aren't issues. But, maybe the timeouts in http are a bit long for 'modern day' networks?

@open_timeout = 60
@read_timeout = 60
@write_timeout = 60
deivid-rodriguez commented 2 years ago

If I recall correctly, the problem was that Ruby networking APIs do not implement happy eyeballs. Our monkey patch in RubyGems is intended to workaround that. There was some work in progress in ruby-core to implement this but it was never completed: https://github.com/ruby/ruby/pull/4038.

deivid-rodriguez commented 2 years ago

@sonalkr132 I think our monkey-patch to TCPSocket#initialize is no longer fixing this issue due to underlying changes in net/http, so this became a problem again that can't be easily workaround 😞. I guess we should try to adapt the monkey patch to Socket.tcp but I think it might be too low level for us to be able to patch it.

mirelon commented 1 year ago

Same issue on Linux machine, ruby 3.1. curl works well, wget times out, wget --inet4-only works well. Force ipv4 should be an option, but seems it is not. The most simple solution I could find was changing:

URI.open(address)

to:

URI.open(address, open_timeout: 2)
deivid-rodriguez commented 1 year ago

For what it's worth, the net/http change was reverted at https://github.com/ruby/net-http/pull/74, so I believe gem install net-http should also fix this?