bottomless-archive-project / library-of-alexandria

Library of Alexandria (LoA in short) is a project that aims to collect and archive documents from the internet.
MIT License

Disable HTTPS/SSL host verification #441

Closed laxika closed 2 years ago

laxika commented 2 years ago

Some URLs return this exception even when the content there is downloadable (albeit behind an SSL certificate that does not match the hostname):

2022-08-03 11:14:29.488 ERROR 9652 --- [pool-1-thread-1] c.g.b.l.u.s.d.FileDownloadManager        : Error while downloading document form http://1.flcgil.stgy.it/files/pdf/20201001/nota-17377-del-28-settembre-2020-snv-indicazioni-operative-documenti-strategici-scuole.pdf.

javax.net.ssl.SSLPeerUnverifiedException: Hostname 1.flcgil.stgy.it not verified:
    certificate: sha256/tuFuC8smRNwJ9p9NR4+QlDotSV6IoQlmLFTrh3ihErA=
    DN: CN=flcgil.it
    subjectAltNames: [admin.congresso2014.flcgil.it, admin.congresso2018.flcgil.it, admin.flcgil.it, cnr.flcgil.it, congresso.flcgil.it, congresso2014.flcgil.it, congresso2018.flcgil.it, enea.flcgil.it, flcgil.it, flcgil.stgy.it, iscriviti.flcgil.it, istat.flcgil.it, m.congresso2014.flcgil.it, m.congresso2018.flcgil.it, m.flcgil.it, oraesempreconoscenza.flcgil.it, plist.flcgil.it, servizi.flcgil.it, www.flcgil.it]
    at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.kt:389) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.kt:337) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:209) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201) ~[okhttp-4.10.0.jar:na]
    at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154) ~[okhttp-4.10.0.jar:na]
    at com.github.bottomlessarchive.loa.url.service.downloader.FileDownloadManager.downloadFile(FileDownloadManager.java:44) ~[main/:na]
    at com.github.bottomlessarchive.loa.downloader.service.file.FileCollector.acquireFile(FileCollector.java:33) ~[main/:na]
    at com.github.bottomlessarchive.loa.downloader.service.document.DocumentLocationProcessor.doProcessDocumentLocation(DocumentLocationProcessor.java:83) ~[main/:na]
    at com.github.bottomlessarchive.loa.downloader.service.document.DocumentLocationProcessor.lambda$processDocumentLocation$0(DocumentLocationProcessor.java:56) ~[main/:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[na:na]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[na:na]
    at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]
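
The exception message itself shows why verification fails: the certificate lists `flcgil.stgy.it` among its subject alternative names, but not `1.flcgil.stgy.it`, and it has no wildcard entry that would cover it. The following is a minimal sketch of that comparison, assuming a simplified matching rule (exact match, or a `*.` wildcard covering exactly one left-most label). `SanCheck`, `matches`, and `covered` are hypothetical names for illustration; this is not OkHttp's actual verifier.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: a simplified approximation of hostname-vs-SAN matching.
final class SanCheck {
    private SanCheck() {}

    // Subject alternative names copied from the exception message above.
    static final List<String> SANS = List.of(
            "admin.congresso2014.flcgil.it", "admin.congresso2018.flcgil.it",
            "admin.flcgil.it", "cnr.flcgil.it", "congresso.flcgil.it",
            "congresso2014.flcgil.it", "congresso2018.flcgil.it", "enea.flcgil.it",
            "flcgil.it", "flcgil.stgy.it", "iscriviti.flcgil.it", "istat.flcgil.it",
            "m.congresso2014.flcgil.it", "m.congresso2018.flcgil.it", "m.flcgil.it",
            "oraesempreconoscenza.flcgil.it", "plist.flcgil.it", "servizi.flcgil.it",
            "www.flcgil.it");

    // Exact, case-insensitive match, or a "*." wildcard that covers one label only.
    static boolean matches(String host, String pattern) {
        String h = host.toLowerCase(Locale.ROOT);
        String p = pattern.toLowerCase(Locale.ROOT);
        if (!p.startsWith("*.")) {
            return h.equals(p);
        }
        String suffix = p.substring(1); // e.g. ".example.com"
        if (!h.endsWith(suffix)) {
            return false;
        }
        String label = h.substring(0, h.length() - suffix.length());
        return !label.isEmpty() && !label.contains(".");
    }

    // True if any SAN from the certificate covers the requested host.
    static boolean covered(String host) {
        return SANS.stream().anyMatch(san -> matches(host, san));
    }
}
```

Under these assumptions, `SanCheck.covered("1.flcgil.stgy.it")` is false while `SanCheck.covered("flcgil.stgy.it")` is true, which is consistent with the "Hostname 1.flcgil.stgy.it not verified" message.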

As we are only making harmless GET requests that carry no sensitive information, we shouldn't care much about the SSL certificates (we just want the data).
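
One possible way to do this is sketched below, using only JDK classes so it compiles without extra dependencies. It is a sketch under assumptions, not the change that was actually made in LoA: `LenientTls` and its method names are hypothetical. The hostname verifier addresses the `SSLPeerUnverifiedException` shown above; the trust-all manager additionally accepts expired or self-signed certificates. Both remove protection against man-in-the-middle attacks, so they only make sense for non-sensitive downloads like the ones described here.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper: builds TLS components that skip certificate and host checks.
final class LenientTls {
    private LenientTls() {}

    // Trust manager that accepts every certificate chain without inspecting it.
    static X509TrustManager trustAllManager() {
        return new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) {
                // Intentionally empty: no validation.
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                // Intentionally empty: no validation.
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        };
    }

    // Hostname verifier that accepts any hostname for any session.
    static HostnameVerifier acceptAnyHost() {
        return (hostname, session) -> true;
    }

    // SSLContext initialised with the given trust manager only.
    static SSLContext lenientContext(X509TrustManager trustManager) {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, new TrustManager[] {trustManager}, new SecureRandom());
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create lenient SSL context", e);
        }
    }
}

// With OkHttp 4.x on the classpath, the pieces would be wired in roughly like this:
//
//   X509TrustManager tm = LenientTls.trustAllManager();
//   OkHttpClient client = new OkHttpClient.Builder()
//           .sslSocketFactory(LenientTls.lenientContext(tm).getSocketFactory(), tm)
//           .hostnameVerifier(LenientTls.acceptAnyHost())
//           .build();
```

If only the hostname mismatch needs to be tolerated, setting just the `hostnameVerifier` and leaving the default trust manager in place would be the narrower change.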

laxika commented 2 years ago

https://stackoverflow.com/questions/34671926/sslpeerunverifiedexception-okhttp