I ran into this issue when I worked for DoD, and I am running into it again at LANL. Both required a proxy server for all HTTP and HTTPS traffic. Right now that makes it difficult for me to try out WAIL at LANL.
I know Heritrix allows a proxy server to be set in the crawler-beans.cxml file because I had to do that when testing Heritrix at LANL.
The settings are httpProxyHost and httpProxyPort for Heritrix 3.2.0. It seems like the WAIL interface could let the user specify these values, which would then be injected into the crawler-beans.cxml file when each Heritrix crawl job is run.
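A rough sketch of what that injection step might look like, in case it helps. It assumes the job's crawler-beans.cxml follows the usual Spring beans layout and that the two properties belong on the bean with id `fetchHttp`, which is how I remember the default Heritrix 3 profile; please check that against the actual template. The function name `inject_proxy` is made up for illustration and is not part of WAIL or Heritrix.

```python
import xml.etree.ElementTree as ET

# Spring beans namespace used by crawler-beans.cxml (assumed from the default profile).
BEANS_NS = "http://www.springframework.org/schema/beans"


def inject_proxy(cxml_text, host, port, bean_id="fetchHttp"):
    """Return cxml_text with httpProxyHost/httpProxyPort set on the given bean.

    Hypothetical helper: the bean id "fetchHttp" is an assumption about the template.
    """
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", BEANS_NS)
    # insert_comments=True (Python 3.8+) preserves the comments inside the file.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(cxml_text, parser=parser)

    bean = root.find(f"{{{BEANS_NS}}}bean[@id='{bean_id}']")
    if bean is None:
        raise ValueError(f"no bean with id {bean_id!r} found")

    for name, value in (("httpProxyHost", host), ("httpProxyPort", str(port))):
        prop = bean.find(f"{{{BEANS_NS}}}property[@name='{name}']")
        if prop is None:
            # Property is absent (or only present as a comment), so add it.
            prop = ET.SubElement(bean, f"{{{BEANS_NS}}}property", {"name": name})
        prop.set("value", value)

    return ET.tostring(root, encoding="unicode")
```

WAIL could call something like `inject_proxy(template_text, user_host, user_port)` just before writing the job's crawler-beans.cxml. One caveat: ElementTree may rewrite namespace prefixes other than the default one, so a plain text substitution on the template might be the safer choice if the file has to stay byte-for-byte recognizable.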
@shawnmjones thank you for submitting this issue.
I remember (thanks to this submission) talking with you about this at au 2.0.
This is going on the high-priority to-do list for WAIL.