sferik / twitter-ruby

A Ruby interface to the Twitter API.
MIT License

stream client proxy not working #981

Closed kares closed 1 year ago

kares commented 4 years ago

I've tried setting up a proxy with the streaming client and it does not work. Sample script:

require 'twitter'
require 'twitter/version'

puts "twitter gem: #{Twitter::Version.to_s}"

def configure(client)
  client.consumer_key = '...'
  client.consumer_secret = '...'
  client.access_token = '...'
  client.access_token_secret = '...'
  client # return the client so the callers below can keep using it
end

stream_client = configure Twitter::Streaming::Client.new
stream_client.proxy = { proxy_address: '', proxy_port: 3128 }

rest_client  = configure Twitter::REST::Client.new
rest_client.proxy = { host: '', port: 3128 }

user = rest_client.user('tenderlove')
p user; puts user.uri # WORKS

puts "filtering (using stream client): #{stream_client}"
stream_client.filter({ track: "xxx" }) do |tweet|
  p tweet.full_text; puts tweet.uri # FAILS WITH A PROXY
  exit 1
end

I have tested this setup using twitter gem 6.2.0 as well as the latest (7.0.0), on MRI 2.5.7 and JRuby 9.2. The JRuby stack trace:

  on_headers_complete at /home/borg/jruby-
                   << at org/ruby_http_parser/RubyHttpParser.java:370
                   << at /home/borg/jruby-
               stream at /home/borg/jruby-
                 loop at org/jruby/RubyKernel.java:1442
               stream at /home/borg/jruby-
              request at /home/borg/jruby-
               filter at /home/borg/jruby-

I believe the problem might be in the http gem, which may not be handling the proxy setup for the streaming connection. I was able to verify that the proxy itself isn't the issue: Python's tweepy client works with a similar setup through the same proxy. The proxy is a standard Squid installation on Ubuntu, listening on its default port 3128.

The Ruby client seems to be trying a POST request for streaming which gets rejected:

2020/09/29 14:58:54.410 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local= remote= FD 11 flags=1
2020/09/29 14:58:54.410 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
POST /1.1/statuses/filter.json?track=xxx HTTP/1.1
User-Agent: TwitterRubyGem/7.0.0
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="...", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601384334", oauth_token="16308531-...", oauth_version="1.0"
Host: stream.twitter.com
Content-Length: 0

2020/09/29 14:58:54.411 kid1| 11,2| Stream.cc(266) sendStartOfMessage: HTTP Client local= remote= FD 11 flags=1
2020/09/29 14:58:54.411 kid1| 11,2| Stream.cc(267) sendStartOfMessage: HTTP Client REPLY:
HTTP/1.1 400 Bad Request
Server: squid/4.10
Mime-Version: 1.0
Date: Tue, 29 Sep 2020 12:58:54 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 3575
X-Squid-Error: ERR_INVALID_URL 0
Vary: Accept-Language
Content-Language: en
X-Cache: MISS from precision
X-Cache-Lookup: NONE from precision:3128
Via: 1.1 precision (squid/4.10)
Connection: close


While the Python client simply issues a CONNECT:

2020/09/29 15:25:15.642 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local= remote= FD 11 flags=1
2020/09/29 15:25:15.642 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
CONNECT stream.twitter.com:443 HTTP/1.0


Also, the proxy settings are a bit confusing, as the key names differ depending on the client being used. That is a shame, considering most of the configuration is the same for both the REST and the streaming client:

if client.is_a?(Twitter::REST::Client)
  client.proxy = { host: @proxy_address, port: @proxy_port }
else # Twitter::Streaming::Client uses different key names
  client.proxy = { proxy_address: @proxy_address, proxy_port: @proxy_port }
end

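Until the two clients agree on key names, the branching above could be hidden behind a small helper. This is only a sketch: `proxy_options` and its `streaming:` flag are hypothetical names, not part of the twitter gem, and the key names are simply the ones used in the snippets in this thread.

```ruby
# Hypothetical helper (not part of the twitter gem): maps one address/port
# pair onto the key names each client was given earlier in this thread.
def proxy_options(address, port, streaming:)
  if streaming
    # key names used with Twitter::Streaming::Client above
    { proxy_address: address, proxy_port: port }
  else
    # key names used with Twitter::REST::Client above
    { host: address, port: port }
  end
end

# Intended use (needs the twitter gem loaded, so left as a comment):
#   client.proxy = proxy_options(addr, 3128,
#                                streaming: client.is_a?(Twitter::Streaming::Client))
```

The flag is passed in explicitly so the helper itself does not depend on the gem's classes being loaded.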
kares commented 4 years ago

An older version of the gem (5.15.0 + http.rb 0.9.9) seems to work. Proxy log:

2020/10/05 11:54:47.807 kid1| 11,2| client_side.cc(1302) parseHttpRequest: HTTP Client local= remote= FD 11 flags=1
2020/10/05 11:54:47.807 kid1| 11,2| client_side.cc(1303) parseHttpRequest: HTTP Client REQUEST:
POST https://stream.twitter.com/1.1/statuses/filter.json?track=xxx HTTP/1.1
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="...", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601891687", oauth_token="...", oauth_version="1.0"
Host: stream.twitter.com
User-Agent: http.rb/0.9.9

2020/10/05 11:54:48.274 kid1| 11,3| http.cc(2310) httpStart: POST https://stream.twitter.com/1.1/statuses/filter.json?track=London,Barcelona
2020/10/05 11:54:48.275 kid1| 11,2| http.cc(2266) sendRequest: HTTP Server local= remote= FD 14 flags=1
2020/10/05 11:54:48.275 kid1| 11,2| http.cc(2267) sendRequest: HTTP Server REQUEST:
POST /1.1/statuses/filter.json?track=xxx HTTP/1.1
Authorization: OAuth oauth_consumer_key="...", oauth_nonce="59e85cfa31e58cb41085595bb3688f74", oauth_signature="...", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1601891687", oauth_token="...", oauth_version="1.0"
User-Agent: http.rb/0.9.9
Host: stream.twitter.com
Via: 1.1 precision (squid/4.10)
Cache-Control: max-age=0
Connection: keep-alive


Compared with the 7.0.0 request, the only differences seem to be that the older version omits Content-Length: 0 and sends a full URL in the request line: POST https://stream.twitter.com/1.1/statuses/filter.json?track=xxx HTTP/1.1
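To make the three request-line shapes seen in the Squid logs concrete, here is a minimal sketch using only Ruby's standard `uri` library. `request_line` and its `form:` argument are hypothetical names chosen for illustration; this is not how http.rb or the twitter gem builds requests.

```ruby
require 'uri'

# Hypothetical illustration: build an HTTP request line in one of the
# three shapes that appear in the proxy logs above.
def request_line(verb, url, form:, version: '1.1')
  uri = URI.parse(url)
  target =
    case form
    when :origin    then uri.request_uri            # path + query only
    when :absolute  then uri.to_s                   # full URL
    when :authority then "#{uri.host}:#{uri.port}"  # host:port, as used by CONNECT
    else raise ArgumentError, "unknown form: #{form.inspect}"
    end
  "#{verb} #{target} HTTP/#{version}"
end

url = 'https://stream.twitter.com/1.1/statuses/filter.json?track=xxx'
puts request_line('POST', url, form: :origin)     # shape sent by gem 7.0.0
puts request_line('POST', url, form: :absolute)   # shape sent by gem 5.15.0
puts request_line('CONNECT', url, form: :authority, version: '1.0') # shape sent by tweepy
```

Squid answered the first shape with ERR_INVALID_URL, while the second was forwarded to stream.twitter.com.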

kares commented 4 years ago

:green_salad: Confirmed: patching HTTP to send full URIs via the proxy seems to resolve the issue (note that we connect to Twitter's streaming API over https://):

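::HTTP::Request.class_eval do
  def headline
    request_uri =
        if using_proxy? #&& !uri.https?
          uri.omit(:fragment) # full (absolute) URI when going through a proxy
        else
          uri.request_uri
        end

    "#{verb.to_s.upcase} #{request_uri} HTTP/#{version}"
  end
end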

... the patch essentially undoes: https://github.com/httprb/http/pull/333 (which has been shipped since http.rb 2.0.0)

kares commented 4 years ago

to recap, this is a (squid) proxy issue (https://github.com/httprb/http/pull/333 seems legit), still, annoying that things got broken with various proxies out there in the 6.x/7.0 line ...

cmirnow commented 3 years ago

Working through a proxy, I get an error: cannot interpret as DNS name: nil (gem 'twitter' 7.0.0). Reproduce:

  def twi_sclient(t)
    Twitter::Streaming::Client.new config(t)
  end

  def config(t)
    {
      consumer_key: t.key,
      consumer_secret: t.secret,
      access_token: t.token,
      access_token_secret: t.token_secret,
      proxy: proxy(t)
    }
  end

  def proxy(t)
    {
      host: t.host,
      port: t.port,
      username: t.username,
      password: t.password
    }
  end

[ActiveJob] [RetweetsJob] [d564b4c5-6f04-4191-bd69-bba02a4965e0] Error performing RetweetsJob (Job ID: d564b4c5-6f04-4191-bd69-bba02a4965e0) from Async(default) in 15.55ms: ArgumentError (cannot interpret as DNS name: nil)

Any ideas? Thank you.