Netflix-Skunkworks / Scumblr

Web framework that allows performing periodic syncs of data sources and performing analysis on the identified results
Apache License 2.0

Having trouble integrating Sketchy #249

Open matt-krehbiel opened 5 years ago

matt-krehbiel commented 5 years ago

First of all, sorry for making yet another one of these. I've noticed a few closed issues already along these lines, but the suggestions in those haven't worked in my case. I'm not quite sure what I'm missing.

I have both installed on the same box, and both are listening on their default ports of 3000 and 8000. Both Scumblr and Sketchy work individually; I can task searches and generate results with Scumblr and generate screenshots with Sketchy. I have it configured correctly enough that Sketchy gets the request from Scumblr, as I can follow the Sketchy ID to get the capture information:

{
  "callback": "http://127.0.0.1:3000/results/63/update_screenshot",
  "capture_status": "LOCAL_CAPTURES_CREATED",
  "created_at": "2018-10-05 15:20:37.728428",
  "html_url": "http://127.0.0.1:8000/files/pastebin.com_33.html",
  "id": 33,
  "job_status": "COMPLETED",
  "modified_at": "2018-10-05 15:20:59.791822",
  "retry": 0,
  "scrape_url": "http://127.0.0.1:8000/files/pastebin.com_33.txt",
  "sketch_url": "http://127.0.0.1:8000/files/pastebin.com_33.png",
  "status_only": false,
  "url": "https://pastebin.com/*ACTUAL_URL_HERE*",
  "url_response_code": 200
}

If I follow that callback link, I get a page that displays "OK". However, there are no attachments on the page for the Scumblr result. Here are the relevant parts from my config files (if I'm missing any, let me know):

Sketchy's config-default.py:
    BASE_URL = 'http://%s' % os.getenv('host', '127.0.0.1:8000')

Scumblr's config/environments/development.rb:
    Rails.application.routes.default_url_options[:host] = "http://127.0.0.1:3000"

Scumblr's config/environments/production.rb:
    Rails.application.routes.default_url_options[:host] = "*Box's public IP*:3000"
    Rails.application.routes.default_url_options[:protocol] = "https"

Scumblr's config/initializers/scumblr.rb:
    config.sketchy_url = "http://127.0.0.1:8000/api/v1.0/capture"
    config.sketchy_use_ssl = false

I know I'm missing or misconfiguring something, but I can't figure out what. Thanks in advance!

sbehrens commented 5 years ago

Try switching from localhost to a routable IP address.
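
As a rough way to spot which settings still point at the local machine, here is a minimal Python sketch. The helper name `is_loopback_url` is made up for illustration and is not part of Scumblr or Sketchy; the URLs are the ones quoted in the configuration earlier in this thread. It only inspects the host written in the URL and does not resolve DNS names.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url):
    """Return True if the URL's host is 'localhost' or a loopback IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        # No parsable host (for example, a value without a scheme).
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A DNS name rather than an IP literal; this sketch does not resolve it.
        return False


# Values copied from the configuration quoted earlier in this thread.
configured = {
    "Sketchy BASE_URL": "http://127.0.0.1:8000",
    "Scumblr default_url_options[:host]": "http://127.0.0.1:3000",
    "Scumblr sketchy_url": "http://127.0.0.1:8000/api/v1.0/capture",
}

for name, url in configured.items():
    if is_loopback_url(url):
        print(f"{name} points at loopback: {url}")
```

Any setting this flags is a candidate for replacing with an address the other service can actually reach.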

matt-krehbiel commented 5 years ago

Hey, I ended up fixing it over the weekend, but wanted to nail down exactly what did it before posting back here. In scumblr.rb, I removed the port from config.sketchy_url and changed its protocol to https; in development.rb, I changed default_url_options to a routable IP address. Together, those changes seem to have fixed it.

Edit: Ah, and I forgot, BASE_URL in Sketchy's config-default.py needed to be a routable IP address as well.
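
For anyone else landing here: the BASE_URL line quoted earlier reads the `host` environment variable and only falls back to 127.0.0.1:8000 when it is unset, so one possible way to change it is to set that variable instead of editing the file. A minimal sketch, assuming the process that loads the config actually sees the variable; `base_url_from` is a made-up helper for illustration and 192.0.2.10 is a documentation placeholder, not the address I used.

```python
import os


def base_url_from(env):
    """Mirror the BASE_URL expression from Sketchy's config-default.py."""
    # Same format string and default as the quoted config line;
    # note that the scheme is fixed to http by the expression itself.
    return 'http://%s' % env.get('host', '127.0.0.1:8000')


# With no 'host' variable, the loopback default is used.
default_url = base_url_from({})

# With 'host' set (placeholder address), BASE_URL follows it.
overridden_url = base_url_from({'host': '192.0.2.10:8000'})

print(default_url)
print(overridden_url)

# What the running process would currently compute from its real environment.
print(base_url_from(os.environ))
```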