ciroppina closed this issue 5 years ago
Can you curl this site's pages from that host at all?
I just tested and could crawl it without problems, so I think it may be a connectivity issue, as suggested by @abolotnov.
Sorry, maybe the pasted image is not visible
ciroBorrelli
On Fri, 1 Feb 2019 at 19:23, abolotnov notifications@github.com wrote:
Can you curl this site's pages from that host at all?
Solved by adding some proxy settings:
<httpClientFactory> <!-- for proxy settings -->
<proxyHost>proxy.regione.abruzzo.it</proxyHost>
<proxyPort>8080</proxyPort>
<proxyScheme>http</proxyScheme>
</httpClientFactory>
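For anyone hitting the same Connection Refused, a quick way to confirm that the proxy is what makes the difference is to request the start URL once directly and once through the proxy, from the same host the collector runs on. The following is only a minimal sketch using the Java standard library, not part of the collector: the class and method names (`ProxyCheck`, `httpProxy`, `headStatus`) are made up for illustration, and `https://example.com/` is a placeholder to be replaced with the real start URL. The proxy host and port are the ones from the settings above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

// Hypothetical helper class, not part of Norconex Collector-Http.
class ProxyCheck {

    /** Builds an HTTP proxy descriptor; the host name is not resolved here. */
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(host, port));
    }

    /** Sends a HEAD request over the given route and returns the HTTP status code. */
    static int headStatus(String url, Proxy route) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(route);
        conn.setRequestMethod("HEAD");
        conn.setConnectTimeout(10_000); // milliseconds
        conn.setReadTimeout(10_000);    // milliseconds
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Placeholder URL (assumption); pass the real start URL as the first argument.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        Proxy[] routes = {
            Proxy.NO_PROXY,                                // direct connection
            httpProxy("proxy.regione.abruzzo.it", 8080)    // proxy from the config above
        };
        for (Proxy route : routes) {
            try {
                System.out.println(route + " -> HTTP " + headStatus(url, route));
            } catch (IOException e) {
                // A refused direct connection shows up here as a ConnectException.
                System.out.println(route + " -> failed: " + e);
            }
        }
    }
}
```

If the direct route fails while the proxied route returns a status code, the crawler most likely needs the same proxy settings.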
Closing
Dear Sirs,
I am successfully crawling a dozen websites with Collector-Http 2.8.1 and committing their content to my Solr 7.5.0 schema.
But one (Italian) website always returns Connection Refused at the start URL, and the collector terminates early. My config is the following:
while the default configuration section is:
and the log says:
unified_95_PRL_32_HTTP_32_Collector.log