metafacture / metafacture-core

Core package of the Metafacture tool suite for metadata processing.
https://metafacture.org
Apache License 2.0
69 stars 34 forks source link

Opens gzip compressed content #513

Closed dr0i closed 6 months ago

dr0i commented 6 months ago

Resolves #511.

dr0i commented 6 months ago

Unfortunately HttpURLConnection only follows redirects if they use the same protocol. As this was the initial issue (unable to open http://schema.org/) a custom implementation (or other library) was needed.

Thx for the many improvements/rewritings. Also agreed on multiple commits vs one big commit. re "major release" : opened https://github.com/metafacture/metafacture-core/issues/515.

blackwinter commented 5 months ago

As this was the initial issue (unable to open http://schema.org/) a custom implementation (or other library) was needed.

Unfortunately, this motivation wasn't made explicit. Both this pull request and the referenced issue are primarily concerned with compression. Wouldn't it have been possible to adjust the URL being opened client-side?

Unfortunately HttpURLConnection only follows redirects if they use the same protocol.

Is this custom implementation affected by the same concerns then? ("serious security consequences")

And what about the objections against URLDecoder.decode in the provided workaround? (which seems to have been the basis for your implementation)

dr0i commented 5 months ago

Sorry to not have been explicit in the issue. I will remember to do better.

"serious security consequences": java network developers decided to treat protocol switches as a possible threat. We don't.

Re objections: you mean https://stackoverflow.com/questions/1884230/httpurlconnection-doesnt-follow-redirect-from-http-to-https/26046079#comment105077439_26046079 ? Is the location header really needed? There is no test for it, so maybe not. If it brings trouble we also could wait for a report here and then fix it.

blackwinter commented 5 months ago

java network developers decided to treat protocol switches as a possible threat. We don't.

Was this discussed anywhere?

Is the location header really needed?

Of course it's needed, that's where the redirect URL comes from. The objection was against decoding that URL.

There is no test for it, so maybe not.

That's because you only added a test for the compression handling, not for the redirect logic. I missed it in my review.