VIDA-NYU / domain_discovery_tool

This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better understand a domain (or topic) as it is represented on the Web.
http://domain-discovery-tool.readthedocs.io/en/latest/index.html
GNU General Public License v3.0

seedfinder: cannot switch search engine and DDT crash #40

Closed julianafreire closed 7 years ago

julianafreire commented 7 years ago

I tried to use the seed finder with the keywords "political news". DDT then reported: "Query failed. Try Bing."

But under the SeedFinder, there is no option to switch the search engine.

I went back to the Search Tab and selected Bing. Then I re-started the SeedFinder and DDT crashed. Log attached below.

172.17.0.1 - - [13/Jun/2017:18:37:58] "POST /updateOnlineClassifier HTTP/1.1" 200 25 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:38:06] "POST /getPages HTTP/1.1" 200 1192 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
Using default negative tags
172.17.0.1 - - [13/Jun/2017:18:38:57] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:38:58] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:38:59] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
Seeds path /ddt/domain_discovery_tool_react/server/data/political_news/seeds.txt
EXCEPTION IN (/ddt/domain_discovery_API/models/domain_discovery_model.py, LINE 1729 "with open(file_positive, 'w') as f:"): [Errno 36] File name too long: u'/ddt/domain_discovery_tool_react/server/data/political_news/training_data/positive/http%3A%2F%2Fnytimes.com%2F2015%2F03%2F17%2Fnyregion%2Fnight-of-drug-overdoses-jolts-wesleyans-liberal-tradition.html%3Faction%3Dclick%26contentCollection%3DOpinion%26module%3DMostEmailed%26version%3DFull%26region%3DMarginalia%26src%3Dme%26pgtype%3Darticle'
EXCEPTION IN (/ddt/domain_discovery_API/models/domain_discovery_model.py, LINE 1729 "with open(file_positive, 'w') as f:"): [Errno 36] File name too long: u'/ddt/domain_discovery_tool_react/server/data/political_news/training_data/positive/http%3A%2F%2Fnytimes.com%2F2014%2F08%2F16%2Fupshot%2Fmapping-migration-in-the-united-states-since-1900.html%3FWT.mc_id%3D2015-Q1-KWP-AUD_DEV-0101-0331%26WT.mc_ev%3Dclick%26bicmp%3DAD%26bicmlukp%3DWT.mc_id%26bicmst%3D1420088400%26bicmet%3D1451624400%26ad-keywords%3DAUDDEVMAR%26kwp_0%3D10713%26kwp_4%3D78883%26kwp_1%3D125942'
EXCEPTION IN (/ddt/domain_discovery_API/models/domain_discovery_model.py, LINE 1729 "with open(file_positive, 'w') as f:"): [Errno 36] File name too long: u'/ddt/domain_discovery_tool_react/server/data/political_news/training_data/positive/http%3A%2F%2Fnytimes.com%2F2014%2F08%2F16%2Fupshot%2Fmapping-migration-in-the-united-states-since-1900.html%3FWT.mc_id%3D2015-Q1-KWP-AUD_DEV-0101-0331%26WT.mc_ev%3Dclick%26bicmp%3DAD%26bicmlukp%3DWT.mc_id%26bicmst%3D1420088400%26bicmet%3D1451624400%26ad-keywords%3DAUDDEVMAR%26kwp_0%3D10713%26kwp_4%3D78883%26kwp_1%3D125942%26_r%3D0'
EXCEPTION IN (/ddt/domain_discovery_API/models/domain_discovery_model.py, LINE 1729 "with open(file_positive, 'w') as f:"): [Errno 36] File name too long: u'/ddt/domain_discovery_tool_react/server/data/political_news/training_data/positive/http%3A%2F%2Fnytimes.com%2F2014%2F08%2F16%2Fupshot%2Fmapping-migration-in-the-united-states-since-1900.html%3FWT.mc_id%3D2015-Q1-KWP-AUD_DEV-0101-0331%26WT.mc_ev%3Dclick%26bicmp%3DAD%26bicmlukp%3DWT.mc_id%26bicmst%3D1420088400%26bicmet%3D1451624400%26ad-keywords%3DAUDDEVMAR%26kwp_0%3D10713%26kwp_4%3D78883%26kwp_1%3D125942%26_r%3D0%26abt%3D0002%26abg%3D1'
172.17.0.1 - - [13/Jun/2017:18:38:59] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:38:59] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:00] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:00] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"

ACHE Crawler 0.9.0-SNAPSHOT

Preparing training data... POSITIVE:408 NEGATIVE:2 172.17.0.1 - - [13/Jun/2017:18:39:01] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:01] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:02] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:02] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:03] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:03] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:04] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:04] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:05] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 
Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:05] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:06] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:06] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:07] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:07] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:08] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:08] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:09] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:09] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - 
[13/Jun/2017:18:39:10] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:10] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:11] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:11] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:12] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:12] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:13] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:13] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:14] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:14] "POST 
/getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" Training model... Training SMO model... 172.17.0.1 - - [13/Jun/2017:18:39:15] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:15] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:16] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:16] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:17] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 172.17.0.1 - - [13/Jun/2017:18:39:17] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"

Options: -M -C 0.01

SMO

Kernel used: Linear Kernel: K(x,y) = <x,y>

Classifier for classes: CLASS_0, CLASS_1

BinarySMO

Machine linear: showing attribute weights, not support vectors.

    -0.0074 * (normalized) new

Number of kernel evaluations: 22213 (72.492% cached)

Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
              Class
Variable      CLASS_0
pred          -131.7197
Intercept     -110.8882

Odds Ratios...
              Class
Variable      CLASS_0
pred          0

Time taken to build model: 0.47 seconds
Time taken to test model on training data: 0.14 seconds

=== Error on training data ===

Correctly Classified Instances         410      100      %
Incorrectly Classified Instances         0        0      %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances              410

=== Confusion Matrix ===

   a   b   <-- classified as
 408   0 |   a = CLASS_0
   0   2 |   b = CLASS_1

=== Stratified cross-validation ===

Correctly Classified Instances         406       99.0244 %
Incorrectly Classified Instances         4        0.9756 %
Kappa statistic                         -0.0049
Mean absolute error                      0.0104
Root mean squared error                  0.0982
Relative absolute error                 84.1457 %
Root relative squared error            140.5804 %
Total Number of Instances              410

=== Confusion Matrix ===

   a   b   <-- classified as
 406   2 |   a = CLASS_0
   2   0 |   b = CLASS_1

Creating feature file... done. None

RUN SEED FINDER politics news

EXEC SEED FINDER
172.17.0.1 - - [13/Jun/2017:18:39:17] "POST /runSeedFinder HTTP/1.1" 200 9 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
politics news /ddt/domain_discovery_tool_react/server

COLLECT SEED URLS politics news /ddt/domain_discovery_tool_react/server/data/political_news/seedFinder/politics_news_results.csv
172.17.0.1 - - [13/Jun/2017:18:39:18] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:18] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:19] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:19] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:20] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:20] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:21] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:21] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:22] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:22] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:23] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:23] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:24] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:24] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:25] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:25] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:26] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
172.17.0.1 - - [13/Jun/2017:18:39:26] "POST /getPages HTTP/1.1" 200 37 "http://0.0.0.0:8084/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
Jun 13, 2017 6:39:59 PM org.elasticsearch.plugins.PluginsService
INFO: [Black Goliath] loaded [], sites []
Exception in thread "pool-1-thread-9" java.lang.IllegalArgumentException: Illegal character in query at index 49: https://twitter.com/foxnewspolitics?ref_src=twsrc^google|twcamp^serp|twgr^author
    at java.net.URI.create(URI.java:852)
    at org.apache.http.client.methods.HttpGet.<init>(HttpGet.java:69)
    at Download_URL.run(Download_URL.java:177)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.URISyntaxException: Illegal character in query at index 49: https://twitter.com/foxnewspolitics?ref_src=twsrc^google|twcamp^serp|twgr^author
    at java.net.URI$Parser.fail(URI.java:2848)
    at java.net.URI$Parser.checkChars(URI.java:3021)
    at java.net.URI$Parser.parseHierarchical(URI.java:3111)
    at java.net.URI$Parser.parse(URI.java:3053)
    at java.net.URI.<init>(URI.java:588)
    at java.net.URI.create(URI.java:850)
    ... 5 more
org.apache.http.client.ClientProtocolException: Unexpected response status: 403
    at Download_URL.run(Download_URL.java:285)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Time Elapsed time for http://www.huffingtonpost.com/section/politics thread = 0.739 secs

Jun 13, 2017 6:40:01 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies
WARNING: Invalid cookie header: "Set-Cookie: ABtestingV2=B; expires=Sun, 10 Dec 2017 18:40:01 GMT; path=/;". Invalid 'expires' attribute: Sun, 10 Dec 2017 18:40:01 GMT
Jun 13, 2017 6:40:01 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies
WARNING: Invalid cookie header: "Set-Cookie: visid_incap_121505=ZPUklx9OR7uiiwijKZemilgxQFkAAAAAQUIPAAAAAADUXHfViB0ZiO5vj2/e69Lk; expires=Wed, 13 Jun 2018 08:02:50 GMT; path=/; Domain=.economist.com". Invalid 'expires' attribute: Wed, 13 Jun 2018 08:02:50 GMT
Jun 13, 2017 6:40:01 PM org.apache.http.client.protocol.ResponseProcessCookies processCookies
WARNING: Invalid cookie header: "Set-Cookie: ABtestingV2=A; expires=Sun, 10 Dec 2017 18:40:01 GMT; path=/;". Invalid 'expires' attribute: Sun, 10 Dec 2017 18:40:01 GMT
/ddt/run_ddt: line 33: 101 Killed python $DDT_HOME/server/server.py
Stopping elastisearch container elastic
Removing elastisearch container elastic
Stopping DD Tool container dd_tool
Removing DD Tool container dd_tool
Julianas-MacBook-Pro-2:Downloads juliana$
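
A note on the `[Errno 36] File name too long` exceptions in the log above: the training-data files are named after the percent-encoded page URL, and many Linux filesystems limit a single path component to 255 bytes, which the encoded names in the log appear to exceed. One possible workaround is to truncate the encoded name and append a hash of the full URL so that names stay unique. The sketch below is only an illustration, written for Python 3; `safe_filename` and `MAX_NAME_BYTES` are hypothetical names, not part of the DDT code base, and the 255-byte limit is an assumption about the host filesystem.

```python
import hashlib
from urllib.parse import quote

# Assumed per-component limit; typical for ext4 and similar filesystems.
MAX_NAME_BYTES = 255


def safe_filename(url, max_bytes=MAX_NAME_BYTES):
    """Return a file name derived from `url` that fits within `max_bytes`.

    Hypothetical helper: short URLs keep their percent-encoded form,
    long ones are truncated and suffixed with a SHA-1 digest of the URL.
    """
    encoded = quote(url, safe="")  # pure ASCII, so len() equals byte length
    if len(encoded) <= max_bytes:
        return encoded
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()  # 40 hex characters
    keep = max_bytes - len(digest) - 1
    # The truncated prefix is for readability only and may cut an escape
    # sequence in half; the digest is what keeps names distinct.
    return encoded[:keep] + "-" + digest
```

With a helper like this, the path passed to `open(file_positive, 'w')` would be built from `safe_filename(url)` rather than from the raw encoded URL. The original URL can no longer be recovered from a truncated name, so a separate URL-to-file mapping would be needed if that matters.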

yamsgithub commented 7 years ago

The message shown was incorrect and has been fixed. The search engine cannot be selected for the seed finder. The seed finder process can now be monitored in the process monitor.
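
Separately, the `IllegalArgumentException` in the log comes from `java.net.URI.create` rejecting the unescaped `^` and `|` characters in a query string (index 49 is the first `^` in the Twitter URL). The seed finder's downloader is Java; the sketch below only illustrates the idea of percent-encoding the query before building the request, in Python 3 for consistency with the DDT server code. `escape_query` is a hypothetical helper, and the set of characters left unescaped is an assumption.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def escape_query(url):
    """Percent-encode characters in the query string that strict URI parsers reject.

    Hypothetical helper: '=', '&', '+' and existing '%' escapes are left alone;
    other reserved or illegal characters (such as '^' and '|') are encoded.
    """
    parts = urlsplit(url)
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


if __name__ == "__main__":
    raw = "https://twitter.com/foxnewspolitics?ref_src=twsrc^google|twcamp^serp|twgr^author"
    print(escape_query(raw))
    # https://twitter.com/foxnewspolitics?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor
```

Catching the exception per URL and skipping the offending seed would be an alternative if rewriting URLs is undesirable.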