Closed mikegerber closed 3 years ago
Example http_proxy variable:
$ env | grep http_proxy
http_proxy=http://http-proxy.sbb.spk-berlin.de:3128/
I should mention that I know from PAGE Viewer that setting these command line parameters solves the issue there, so I strongly suspect it will also fix this problem with PAGE Converter.
After some research I found "the Java enterprise solution ™️", i.e. setting another env variable:
export JAVA_TOOL_OPTIONS="-Dhttp.proxyHost=http-proxy.sbb.spk-berlin.de -Dhttp.proxyPort=3128"
So I'm closing this issue! 😀
Thanks for examining this issue. I think it would be good to document the solution here in the README and also on some suitable place for OCR-D. CC'ing @kba for his opinion.
Ideally the software would give a user friendly error message for connection errors and suggest typical solutions (or link to a page with such hints).
I agree. It's not the first time I looked for a solution for the "Java vs. HTTP proxy problem", and it's relatively hard to find documentation of that JAVA_TOOL_OPTIONS solution.
I've opened https://github.com/OCR-D/ocrd_fileformat/issues/32 to address our immediate need of having an offline conversion of PAGE → ALTO using ocrd_fileformat, but it is also conceivable to implement the solution here.
When an HTTP proxy is needed, conversion from PAGE to ALTO fails:
Unfortunately, with the network setup here, this also means a long wait for a connection error, because packets are simply dropped...
The preferred solution for me would be for ocr-fileformat to parse the somewhat standard http_proxy environment variable and pass the correct parameters to java:
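A rough sketch of what such parsing could look like in a POSIX shell wrapper. This is not how ocr-fileformat works today: the function name proxy_to_java_opts is made up for illustration, and the sketch assumes the plain http://host:port/ form shown above (no credentials, no IPv6 literals) and falls back to port 80 when none is given. The -Dhttp.proxyHost and -Dhttp.proxyPort properties are the same ones used in the JAVA_TOOL_OPTIONS workaround.

```sh
#!/bin/sh
# Hypothetical helper: turn an http_proxy-style URL into Java proxy flags.
# Only handles the simple form http://host:port/ (assumption, see above).
proxy_to_java_opts() {
    url=$1
    # Drop the scheme ("http://") if present, then anything after the first slash.
    hostport=${url#*://}
    hostport=${hostport%%/*}
    case $hostport in
        *:*) host=${hostport%%:*}; port=${hostport##*:} ;;
        *)   host=$hostport; port=80 ;;   # assumption: default to port 80
    esac
    printf '%s' "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port"
}

# Only touch JAVA_TOOL_OPTIONS when http_proxy is actually set,
# and keep whatever options the user already exported.
if [ -n "${http_proxy:-}" ]; then
    JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }$(proxy_to_java_opts "$http_proxy")"
    export JAVA_TOOL_OPTIONS
fi
```

With the http_proxy value from the example above, the function prints -Dhttp.proxyHost=http-proxy.sbb.spk-berlin.de -Dhttp.proxyPort=3128, i.e. the same flags as the manual workaround. A wrapper could also pass these flags directly on the java command line instead of going through JAVA_TOOL_OPTIONS.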