Netflix-Skunkworks / Scumblr

Web framework that allows performing periodic syncs of data sources and performing analysis on the identified results
Apache License 2.0

Sketchy won't upload screenshots to Scumblr #29

Closed siggy86 closed 9 years ago

siggy86 commented 9 years ago

I have Scumblr running on a Unix socket via unicorn and nginx. When I attempt to generate a screenshot with Sketchy, the request goes through sidekiq fine. The issue is that the screenshot is not being posted back to Scumblr. Logs look fine, but at https://sketchy/api/v1.0/capture I get:

[Screenshot: windows 7 x64-2015-02-10-15-49-42]

I tried changing the port in config/environments/development.rb to the one Scumblr is running on in nginx, and I get:

[Screenshot: windows 7 x64-2015-02-10-15-49-43]

ahoernecke commented 9 years ago

It looks like the callback url that scumblr is generating is "localhost". Are sketchy and scumblr running on the same host, and if so, is scumblr listening on the loopback address?

siggy86 commented 9 years ago

They are both on the same host. As I said, though, Scumblr is running on a Unix socket through unicorn, with an upstream in nginx. I don't know how I could get it to also listen on localhost with the way it's set up.

ahoernecke commented 9 years ago

Let's make sure that what this log is indicating is correct configuration for your setup. Sketchy is available via HTTPS on port 8000? Scumblr is running on the same host and is available (via nginx) over HTTP on port 80? A couple things to double check:

  1. You can visit scumblr via HTTP on port 80 on the sketchy host and it loads ok
  2. You can retrieve the image/scrape via the URLs Sketchy is providing. (Replace 127.0.0.1 with the host's IP if you're trying to download from a different system.) https://127.0.0.1:8000/files/www.pcworld.com_7.txt https://127.0.0.1:8000/files/www.pcworld.com_7.png
  3. Are you seeing the connection attempt from Sketchy in the Scumblr log (/logs/)? Are there any error messages?

siggy86 commented 9 years ago

Sketchy is working perfectly. It's also running through nginx via SSL on port 443, and config/initializers/scumblr.rb points at port 443.

config/initializers/scumblr.rb:

Scumblr::Application.configure do
  # Should Scumblr automatically generate screenshots for new results
  config.sketchy_url = "https://localhost:443/api/v1.0/capture"
  config.sketchy_use_ssl = true  # Does sketchy use ssl?
  config.sketchy_verify_ssl = false # Should scumblr verify sketchy's cert
  config.sketchy_tag_status_code = true # Add a tag indicating last status code sketchy received
  # config.sketchy_access_token = "" # Access token for sketchy
end

  1. Yes, I can log in, run searches, etc.
  2. I can download the images and scrapes if I use port 443, or HTTP and port 8000, but not with HTTPS and port 8000.
  3. I see the POST with multiple retries in log/development.log:
Started POST "/results/20/update_screenshot" for ::1 at 2015-02-10 16:42:28 -0700
Processing by ResultsController#update_screenshot as TEXT
  Parameters: {"job_status"=>"COMPLETED", "retry"=>0, "sketch_url"=>"https://127.0.0.1:8000/files/www.pcworld.com_13.png", "capture_status"=>"LOCAL_CAPTURES_CREATED", "url"=>"http://www.pcworld.com/article/2599460/netflix-open-sources-internal-threat-monitoring-tools.html", "created_at"=>"2015-02-10 16:42:11.303504", "modified_at"=>"2015-02-10 16:42:27.919589", "html_url"=>"https://127.0.0.1:8000/files/www.pcworld.com_13.html", "scrape_url"=>"https://127.0.0.1:8000/files/www.pcworld.com_13.txt", "callback"=>"http://localhost:3000/results/20/update_screenshot", "url_response_code"=>200, "id"=>"20", "result"=>{"id"=>"20", "url"=>"http://www.pcworld.com/article/2599460/netflix-open-sources-internal-threat-monitoring-tools.html", "created_at"=>"2015-02-10 16:42:11.303504"}}
  Result Load (0.1ms)  SELECT "results".* FROM "results" WHERE "results"."id" = ? LIMIT 1  [["id", "20"]]
  ResultFlag Load (0.1ms)  SELECT "result_flags".* FROM "result_flags" WHERE "result_flags"."result_id" IN (20)
Error adding screenshot
SSL_connect SYSCALL returned=5 errno=0 state=unknown state
["/usr/lib/ruby/1.9.1/net/http.rb:800:in `connect'", "/usr/lib/ruby/1.9.1/net/http.rb:800:in `block in connect'", "/usr/lib/ruby/1.9.1/timeout.rb:55:in `timeout'", "/usr/lib/ruby/1.9.1/timeout.rb:100:in `timeout'", "/usr/lib/ruby/1.9.1/net/http.rb:800:in `connect'", "/usr/lib/ruby/1.9.1/net/http.rb:756:in `do_start'", "/usr/lib/ruby/1.9.1/net/http.rb:745:in `start'", "/usr/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'", "/usr/lib/ruby/1.9.1/open-uri.rb:775:in `buffer_open'", "/usr/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'", "/usr/lib/ruby/1.9.1/open-uri.rb:201:in `catch'", "/usr/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'", "/usr/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'", "/usr/lib/ruby/1.9.1/open-uri.rb:677:in `open'", "/usr/lib/ruby/1.9.1/open-uri.rb:29:in `open'", "/home/rails/Scumblr/app/controllers/results_controller.rb:454:in `update_screenshot'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/implicit_render.rb:4:in `send_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/abstract_controller/base.rb:189:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/rendering.rb:10:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/abstract_controller/callbacks.rb:18:in `block in process_action'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/callbacks.rb:453:in `_run__4399630651260003070__process_action__callbacks'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/callbacks.rb:80:in `run_callbacks'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/abstract_controller/callbacks.rb:17:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/rescue.rb:29:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/instrumentation.rb:31:in `block in process_action'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/notifications.rb:159:in `block in 
instrument'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/notifications/instrumenter.rb:20:in `instrument'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/notifications.rb:159:in `instrument'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/instrumentation.rb:30:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/params_wrapper.rb:250:in `process_action'", "/var/lib/gems/1.9.1/gems/activerecord-4.0.9/lib/active_record/railties/controller_runtime.rb:18:in `process_action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/abstract_controller/base.rb:136:in `process'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/abstract_controller/rendering.rb:44:in `process'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal.rb:195:in `dispatch'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal/rack_delegation.rb:13:in `dispatch'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_controller/metal.rb:231:in `block in action'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/routing/route_set.rb:82:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/routing/route_set.rb:82:in `dispatch'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/routing/route_set.rb:50:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/journey/router.rb:71:in `block in call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/journey/router.rb:59:in `each'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/journey/router.rb:59:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/routing/route_set.rb:676:in `call'", "/var/lib/gems/1.9.1/gems/bullet-4.13.1/lib/bullet/rack.rb:12:in `call'", "/var/lib/gems/1.9.1/gems/meta_request-0.3.4/lib/meta_request/middlewares/app_request_handler.rb:13:in `call'", 
"/var/lib/gems/1.9.1/gems/meta_request-0.3.4/lib/meta_request/middlewares/meta_request_handler.rb:13:in `call'", "/var/lib/gems/1.9.1/gems/warden-1.2.3/lib/warden/manager.rb:35:in `block in call'", "/var/lib/gems/1.9.1/gems/warden-1.2.3/lib/warden/manager.rb:34:in `catch'", "/var/lib/gems/1.9.1/gems/warden-1.2.3/lib/warden/manager.rb:34:in `call'", "/usr/lib/ruby/vendor_ruby/rack/etag.rb:23:in `call'", "/usr/lib/ruby/vendor_ruby/rack/conditionalget.rb:35:in `call'", "/usr/lib/ruby/vendor_ruby/rack/head.rb:11:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/params_parser.rb:27:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/flash.rb:241:in `call'", "/usr/lib/ruby/vendor_ruby/rack/session/abstract/id.rb:225:in `context'", "/usr/lib/ruby/vendor_ruby/rack/session/abstract/id.rb:220:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/cookies.rb:486:in `call'", "/var/lib/gems/1.9.1/gems/activerecord-4.0.9/lib/active_record/query_cache.rb:36:in `call'", "/var/lib/gems/1.9.1/gems/activerecord-4.0.9/lib/active_record/connection_adapters/abstract/connection_pool.rb:626:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/callbacks.rb:29:in `block in call'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/callbacks.rb:373:in `_run__3549210540477604604__call__callbacks'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/callbacks.rb:80:in `run_callbacks'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/callbacks.rb:27:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/reloader.rb:64:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/remote_ip.rb:76:in `call'", "/var/lib/gems/1.9.1/gems/better_errors-1.1.0/lib/better_errors/middleware.rb:84:in `protected_app_call'", 
"/var/lib/gems/1.9.1/gems/better_errors-1.1.0/lib/better_errors/middleware.rb:79:in `better_errors_call'", "/var/lib/gems/1.9.1/gems/better_errors-1.1.0/lib/better_errors/middleware.rb:56:in `call'", "/var/lib/gems/1.9.1/gems/rack-contrib-1.1.0/lib/rack/contrib/response_headers.rb:17:in `call'", "/var/lib/gems/1.9.1/gems/meta_request-0.3.4/lib/meta_request/middlewares/headers.rb:16:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/debug_exceptions.rb:17:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/show_exceptions.rb:30:in `call'", "/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/rack/logger.rb:38:in `call_app'", "/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/rack/logger.rb:20:in `block in call'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/tagged_logging.rb:68:in `block in tagged'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/tagged_logging.rb:26:in `tagged'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/tagged_logging.rb:68:in `tagged'", "/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/rack/logger.rb:20:in `call'", "/var/lib/gems/1.9.1/gems/quiet_assets-1.0.3/lib/quiet_assets.rb:23:in `call_with_quiet_assets'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/request_id.rb:21:in `call'", "/usr/lib/ruby/vendor_ruby/rack/methodoverride.rb:21:in `call'", "/usr/lib/ruby/vendor_ruby/rack/runtime.rb:17:in `call'", "/var/lib/gems/1.9.1/gems/activesupport-4.0.9/lib/active_support/cache/strategy/local_cache.rb:83:in `call'", "/usr/lib/ruby/vendor_ruby/rack/lock.rb:17:in `call'", "/var/lib/gems/1.9.1/gems/actionpack-4.0.9/lib/action_dispatch/middleware/static.rb:64:in `call'", "/usr/lib/ruby/vendor_ruby/rack/sendfile.rb:112:in `call'", "/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/engine.rb:511:in `call'", "/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/application.rb:97:in `call'", 
"/var/lib/gems/1.9.1/gems/railties-4.0.9/lib/rails/railtie/configurable.rb:30:in `method_missing'", "/usr/lib/ruby/vendor_ruby/unicorn/http_server.rb:580:in `process_client'", "/usr/lib/ruby/vendor_ruby/unicorn/http_server.rb:660:in `worker_loop'", "/usr/lib/ruby/vendor_ruby/unicorn/http_server.rb:527:in `spawn_missing_workers'", "/usr/lib/ruby/vendor_ruby/unicorn/http_server.rb:538:in `maintain_worker_count'", "/usr/lib/ruby/vendor_ruby/unicorn/http_server.rb:303:in `join'", "/usr/bin/unicorn_rails:209:in `<main>'"]
ahoernecke commented 9 years ago

I think Sketchy may be misconfigured. It looks to me like Sketchy is telling Scumblr to download the images from https over port 8000. If Sketchy is expecting HTTP on port 8000, this would be the issue. I think you may need to configure Sketchy to return the image/scrape URLs as https port 443, or http (no "s") port 8000.
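
The mismatch described here can be checked mechanically. Below is a minimal Python sketch, assuming the listener table matches what was reported earlier in the thread (HTTPS on port 443, plain HTTP on port 8000). The names LISTENERS and find_scheme_mismatches are hypothetical helpers for illustration; they are not part of Sketchy or Scumblr.

```python
from urllib.parse import urlsplit

# Assumed listeners, based on what was reported earlier in the thread:
# nginx serves HTTPS on 443, and port 8000 answers plain HTTP.
LISTENERS = {443: "https", 8000: "http"}

# Used when a URL carries no explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def find_scheme_mismatches(payload, listeners=LISTENERS):
    """Return (key, url, expected_scheme) for every file URL whose scheme
    does not match the listener assumed for its port."""
    mismatches = []
    for key in ("sketch_url", "scrape_url", "html_url"):
        url = payload.get(key)
        if not url:
            continue
        parts = urlsplit(url)
        port = parts.port or DEFAULT_PORTS.get(parts.scheme)
        expected = listeners.get(port)
        # Ports we know nothing about are skipped rather than flagged.
        if expected is not None and expected != parts.scheme:
            mismatches.append((key, url, expected))
    return mismatches


# URLs copied from the callback parameters in the log above.
callback_params = {
    "sketch_url": "https://127.0.0.1:8000/files/www.pcworld.com_13.png",
    "scrape_url": "https://127.0.0.1:8000/files/www.pcworld.com_13.txt",
    "html_url": "https://127.0.0.1:8000/files/www.pcworld.com_13.html",
}

for key, url, expected in find_scheme_mismatches(callback_params):
    print("%s: %s should use %s" % (key, url, expected))
```

With the assumed listener table, all three URLs from the log are flagged, which is consistent with the SSL_connect error Scumblr logs when it tries to fetch the screenshot.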

siggy86 commented 9 years ago

It's working after I reverted everything in Sketchy's config-default.py that the instructions said to change to SSL or HTTPS when using nginx and SSL, back to the HTTP defaults. Thank you.

jwilczek commented 9 years ago

I appear to be having the same issue. Are you still running SSL for sketchy even though you left the config-default.py as http?

siggy86 commented 9 years ago

No, as I said, I switched back to regular HTTP since it wouldn't work no matter what I tried.

jwilczek commented 9 years ago

I had it working at one point with https. If I can figure it out again, I'll let you know.

jwilczek commented 9 years ago

I got it working - let me document my solution and I'll post it.

selovitz commented 9 years ago

I am also having a similar issue. Sketchy receives the instructions from Scumblr perfectly and creates the screenshots. Scumblr just never receives the image back from Sketchy...

Scumblr is available over HTTP on port 3000, Sketchy over HTTP on port 8000.

{
  "callback": "http://localhost:3000/results/115/update_screenshot",
  "capture_status": "CALLBACK_SUCCEEDED",
  "created_at": "2015-05-04 14:55:05.485637",
  "html_url": "http://127.0.0.1:8000/files/www.ebay.com_4.html",
  "id": 4,
  "job_status": "COMPLETED",
  "modified_at": "2015-05-04 14:55:18.730099",
  "retry": 0,
  "scrape_url": "http://127.0.0.1:8000/files/www.ebay.com_4.txt",
  "sketch_url": "http://127.0.0.1:8000/files/www.ebay.com_4.png",
  "url": "http://www.ebay.com/itm/Honda-Other-/221757996972",
  "url_response_code": 200
},

Looking forward to your post @jwilczek

ahoernecke commented 9 years ago

@selovitz,

Is Sketchy running on HTTP or HTTPS on port 8000? Can you post a snippet from your scumblr log for when the callback is being made?

sbehrens commented 9 years ago

Hi,

Make this change to your config-default.py in Sketchy from:

Line 27

BASE_URL = 'http://%s' % os.getenv('host', '127.0.0.1:8000')

TO

Line 27

BASE_URL = 'https://%s' % os.getenv('host', '127.0.0.1:8000')

The BASE_URL needs the correct scheme, which is HTTPS here. You can also set the env variable host to the domain you need.
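
The line above hard-codes the scheme, so switching between HTTP and HTTPS means editing the file. As a sketch only, the variant below reads the scheme from the environment as well. The 'scheme' variable and the build_base_url helper are assumptions made for illustration and are not part of Sketchy's config-default.py; only the 'host' variable and its default come from the snippet above.

```python
import os


def build_base_url(environ=os.environ):
    """Compose BASE_URL from a scheme and a host[:port] pair."""
    # 'host' and its default are taken from the original config line.
    host = environ.get("host", "127.0.0.1:8000")
    # 'scheme' is a hypothetical variable introduced for this sketch.
    scheme = environ.get("scheme", "https")
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https', got %r" % scheme)
    return "%s://%s" % (scheme, host)


BASE_URL = build_base_url()
print(BASE_URL)
```

Whichever way the value is produced, it has to match how the files are actually served, since Scumblr downloads the screenshot from the URLs Sketchy builds on top of BASE_URL.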

ahoernecke commented 9 years ago

Closing as resolved. If you're still having issues, please feel free to reopen.

mvert commented 9 years ago

I have set up Scumblr and Sketchy on two different machines. I am trying to change the callback address sent to Sketchy, but I can't find where it is configured. The callback looks like this:

"callback": "http://localhost:3000/results/24/update_screenshot", 

I need it to be my IP address instead of localhost. Any help would be appreciated.

ahoernecke commented 9 years ago

This should be set in the appropriate environment file (config/environments/.rb) with the settings below:

Rails.application.routes.default_url_options[:host] = "scumblr.com" # Scumblr host and port if non-standard
Rails.application.routes.default_url_options[:protocol] = "https" # Scumblr protocol

mvert commented 9 years ago

ahoernecke, thanks. I found that if I don't restart sidekiq after changing default_url_options, sidekiq continues to send the original URL. Figured I'd post this in case someone has the same problem: I kept changing that setting for an hour before restarting the whole box and realizing I should have restarted sidekiq.

ahoernecke commented 9 years ago

Thanks for the additional information. That's a good point, the sidekiq processes/threads won't pick up the new config without a restart.

Thanks!