sferik / twitter-ruby

A Ruby interface to the Twitter API.
http://www.rubydoc.info/gems/twitter
MIT License

stream client proxy not working #981

Closed (kares closed 1 year ago)

kares commented 4 years ago

I've tried setting up a proxy with the streaming client, but it does not work. Sample script:

require 'twitter'
require 'twitter/version'

puts "twitter gem: #{Twitter::Version.to_s}"

def configure(client)
  client.consumer_key = '...'
  client.consumer_secret = '...'
  client.access_token = '...'
  client.access_token_secret = '...'
  client
end

stream_client = configure Twitter::Streaming::Client.new
stream_client.proxy = { proxy_address: '192.168.56.1', proxy_port: 3128 }

rest_client  = configure Twitter::REST::Client.new
rest_client.proxy = { host: '192.168.56.1', port: 3128 }

user = rest_client.user('tenderlove')
p user; puts user.uri # WORKS

puts "filtering (using stream client): #{stream_client}"
stream_client.filter({ track: "xxx" }) do |tweet|
  p tweet.full_text; puts tweet.uri # FAILS WITH A PROXY
  exit 1
end

I have tested this setup with twitter gem 6.2.0 as well as the latest (7.0.0), on MRI 2.5.7 and JRuby 9.2. The streaming call fails with:

Twitter::Error::BadRequest: 
  on_headers_complete at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/response.rb:24
                   << at org/ruby_http_parser/RubyHttpParser.java:370
                   << at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/response.rb:19
               stream at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/connection.rb:25
                 loop at org/jruby/RubyKernel.java:1442
               stream at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/connection.rb:21
              request at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/client.rb:123
               filter at /home/borg/jruby-9.2.13.0/lib/ruby/gems/shared/gems/twitter-7.0.0/lib/twitter/streaming/client.rb:38

I believe the problem might be with the http gem, which potentially does not handle the proxy setup for streaming connections. I was able to verify that the proxy itself isn't the issue: I've tried Python's tweepy client with a similar setup through the same proxy. The proxy is a standard Squid installation on Ubuntu, listening on its default port 3128.

The Ruby client seems to send a plain POST request to the proxy for streaming, which gets rejected:

2020/09/29 14:58:54.410 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local=192.168.56.1:3128 remote=192.168.56.102:33692 FD 11 flags=1
2020/09/29 14:58:54.410 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
---------
POST /1.1/statuses/filter.json?track=xxx HTTP/1.1
User-Agent: TwitterRubyGem/7.0.0
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="...", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601384334", oauth_token="16308531-...", oauth_version="1.0"
Host: stream.twitter.com
Content-Length: 0

----------
2020/09/29 14:58:54.411 kid1| 11,2| Stream.cc(266) sendStartOfMessage: HTTP Client local=192.168.56.1:3128 remote=192.168.56.102:33692 FD 11 flags=1
2020/09/29 14:58:54.411 kid1| 11,2| Stream.cc(267) sendStartOfMessage: HTTP Client REPLY:
---------
HTTP/1.1 400 Bad Request
Server: squid/4.10
Mime-Version: 1.0
Date: Tue, 29 Sep 2020 12:58:54 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 3575
X-Squid-Error: ERR_INVALID_URL 0
Vary: Accept-Language
Content-Language: en
X-Cache: MISS from precision
X-Cache-Lookup: NONE from precision:3128
Via: 1.1 precision (squid/4.10)
Connection: close

----------

The Python client, by contrast, simply issues a CONNECT:

2020/09/29 15:25:15.642 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local=192.168.56.1:3128 remote=192.168.56.102:33704 FD 11 flags=1
2020/09/29 15:25:15.642 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
---------
CONNECT stream.twitter.com:443 HTTP/1.0

----------
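
For illustration, a minimal sketch of how such a CONNECT request line can be derived from the target URL, using only Ruby's standard library. The `connect_line` helper and its `http_version:` parameter are hypothetical names, not part of the twitter gem, http.rb, or tweepy.

```ruby
require 'uri'

# Hypothetical helper: builds the request line a client sends to an HTTP
# proxy to open a tunnel to the target host. The port comes from the URL,
# falling back to the scheme default (443 for https).
def connect_line(url, http_version: '1.1')
  uri = URI.parse(url)
  "CONNECT #{uri.host}:#{uri.port} HTTP/#{http_version}"
end

puts connect_line('https://stream.twitter.com/1.1/statuses/filter.json',
                  http_version: '1.0')
```

Once the proxy accepts the tunnel, the TLS handshake and the actual POST travel inside it, so the proxy never has to parse the streaming request itself.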

Also, the proxy settings are a bit confusing, as they differ depending on the client being used. That is a shame, considering most of the configuration is the same for both the REST and the streaming client:

if client.is_a?(Twitter::REST::Client)
  client.proxy = { host: @proxy_address, port: @proxy_port }
else
  client.proxy = { proxy_address: @proxy_address, proxy_port: @proxy_port }
end
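
A self-contained variant of the same idea, as a rough sketch: a pure function that returns the proxy hash in the shape each client took in the sample script. The `proxy_options` name and the `:rest` / `:streaming` symbols are hypothetical and only serve the illustration.

```ruby
# Hypothetical helper: maps one address/port pair to the key names that
# the REST client and the streaming client were configured with above.
def proxy_options(kind, address, port)
  case kind
  when :rest      then { host: address, port: port }
  when :streaming then { proxy_address: address, proxy_port: port }
  else raise ArgumentError, "unknown client kind: #{kind.inspect}"
  end
end

p proxy_options(:rest, '192.168.56.1', 3128)
p proxy_options(:streaming, '192.168.56.1', 3128)
```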

kares commented 4 years ago

An older version of the gem (5.15.0 with http.rb 0.9.9) seems to work. Proxy log:

2020/10/05 11:54:47.807 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local=127.0.0.1:3128 remote=127.0.0.1:58600 FD 11 flags=1
2020/10/05 11:54:47.807 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
---------
POST https://stream.twitter.com/1.1/statuses/filter.json?track=xxx HTTP/1.1
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="...", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601891687", oauth_token="...", oauth_version="1.0"
Host: stream.twitter.com
User-Agent: http.rb/0.9.9

----------
2020/10/05 11:54:48.274 kid1| 11,3| http.cc(2310) httpStart: POST https://stream.twitter.com/1.1/statuses/filter.json?track=London,Barcelona
2020/10/05 11:54:48.275 kid1| 11,2| http.cc(2266) sendRequest: HTTP Server local=192.168.0.25:46560 remote=199.16.156.200:443 FD 14 flags=1
2020/10/05 11:54:48.275 kid1| 11,2| http.cc(2267) sendRequest: HTTP Server REQUEST:
---------
POST /1.1/statuses/filter.json?track=xxx HTTP/1.1
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="59e85cfa31e58cb41085595bb3688f74", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601891687", oauth_token="...", oauth_version="1.0"
User-Agent: http.rb/0.9.9
Host: stream.twitter.com
Via: 1.1 precision (squid/4.10)
X-Forwarded-For: 127.0.0.1
Cache-Control: max-age=0
Connection: keep-alive

----------

Compared with the failing 7.0.0 request, the only differences seem to be that the older version sends no Content-Length: 0 header and puts the full URL in the request line: POST https://stream.twitter.com/1.1/statuses/filter.json?track=xxx HTTP/1.1
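
To make the two request-line forms from the logs concrete, here is a minimal sketch in plain Ruby (standard library only, not http.rb's actual code). The `request_line` helper and its `absolute:` flag are hypothetical names used purely for illustration.

```ruby
require 'uri'

# Hypothetical helper: builds an HTTP/1.1 request line either in
# origin-form (path and query only) or in absolute-form (full URL).
def request_line(verb, url, absolute:)
  uri = URI.parse(url)
  target = absolute ? uri.to_s : uri.request_uri
  "#{verb.to_s.upcase} #{target} HTTP/1.1"
end

url = 'https://stream.twitter.com/1.1/statuses/filter.json?track=xxx'

puts request_line(:post, url, absolute: false) # form seen in the rejected 7.0.0 request
puts request_line(:post, url, absolute: true)  # form seen in the working 5.15.0 request
```

With only the origin-form target and no scheme, a forwarding proxy receiving the request in plain HTTP cannot tell that the upstream should be reached over https, which fits the ERR_INVALID_URL reply from Squid.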

kares commented 4 years ago

:green_salad: Confirmed: patching HTTP to send full URIs when going through a proxy seems to resolve the issue (note that we connect to Twitter's streaming API over https://):

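# Monkey patch: send the absolute URI in the request line whenever a proxy is used.
::HTTP::Request.class_eval do
  def headline
    request_uri =
      if using_proxy? # && !uri.https? (upstream condition, disabled here)
        uri.omit(:fragment) # full URL, minus the fragment
      else
        uri.request_uri # path and query only
      end

    "#{verb.to_s.upcase} #{request_uri} HTTP/#{version}"
  end
end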

The patch essentially undoes https://github.com/httprb/http/pull/333 (which has shipped since http.rb 2.0.0).

kares commented 4 years ago

To recap, this is a (Squid) proxy issue (https://github.com/httprb/http/pull/333 seems legit). Still, it is annoying that things broke with various proxies out there in the 6.x/7.0 line ...

cmirnow commented 3 years ago

Working through a proxy, I get an error: cannot interpret as DNS name: nil (gem 'twitter' 7.0.0). To reproduce:

def twi_sclient(t)
  Twitter::Streaming::Client.new config(t)
end

def config(t)
  {
    consumer_key: t.key,
    consumer_secret: t.secret,
    access_token: t.token,
    access_token_secret: t.token_secret,
    proxy: proxy(t)
  }
end

def proxy(t)
  {
    host: t.host,
    port: t.port,
    username: t.username,
    password: t.password
  }
end

[ActiveJob] [RetweetsJob] [d564b4c5-6f04-4191-bd69-bba02a4965e0] Error performing RetweetsJob (Job ID: d564b4c5-6f04-4191-bd69-bba02a4965e0) from Async(default) in 15.55ms: ArgumentError (cannot interpret as DNS name: nil)

Any ideas? Thank you.