Closed by dr0i 6 months ago
While working on it I noticed https://github.com/metafacture/metafacture-core/blob/52e41414fff130c183151dbb64810f259548ddf8/metafacture-io/src/test/java/org/metafacture/io/HttpOpenerTest.java#L259. This seems invalid because:
"The content-encoding specifies the data transfer encoding used by the issuer of the content. UTF-8 is not a content encoding, it is a character set. Specifying the character set is done in the content-type header"
(https://stackoverflow.com/questions/17154967/is-content-encoding-being-set-to-utf-8-invalid)
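To illustrate the distinction quoted above, here is a minimal sketch of how the two headers could be read separately. The class `HeaderRoles` and its methods are hypothetical and not part of metafacture; only JDK classes are used.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical helper, not metafacture code: keeps the two header roles apart.
final class HeaderRoles {

    private HeaderRoles() {
    }

    // Reads the charset parameter from a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Returns the fallback if none is given.
    // Charset.forName throws an unchecked exception for unknown names.
    static Charset charsetOf(final String contentType, final Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (final String part : contentType.split(";")) {
            final String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                final String name = param.substring("charset=".length())
                        .replace("\"", "").trim();
                return Charset.forName(name);
            }
        }
        return fallback;
    }

    // Content-Encoding names a transfer coding such as "gzip",
    // never a character set such as "UTF-8".
    static boolean isGzip(final String contentEncoding) {
        return contentEncoding != null
                && contentEncoding.trim().equalsIgnoreCase("gzip");
    }
}
```

With this split, a `Content-Encoding: UTF-8` header as in the linked test would simply not be recognised as a coding, while `Content-Type: text/html; charset=UTF-8` would yield the charset.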
[EDIT] There seems to be a fundamental misunderstanding of `encoding` in `HttpOpener` as a synonym for `charset`. I propose to rename the variables and methods to `charset` and probably mark `setEncoding(String)` as deprecated.
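A rough sketch of what such a rename with a deprecated alias could look like. The class `CharsetRenameSketch` and the method names `setCharset`/`getCharset` are assumptions for illustration; this is not the actual `HttpOpener` code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the relevant part of HttpOpener.
final class CharsetRenameSketch {

    private Charset charset = StandardCharsets.UTF_8;

    // Proposed new name: makes clear that a character set is meant.
    public void setCharset(final String charsetName) {
        this.charset = Charset.forName(charsetName);
    }

    public Charset getCharset() {
        return charset;
    }

    /**
     * Old name, kept so that existing callers continue to work.
     *
     * @param encoding the name of the character set
     * @deprecated Use {@link #setCharset(String)} instead.
     */
    @Deprecated
    public void setEncoding(final String encoding) {
        setCharset(encoding);
    }
}
```

Delegating from the old setter to the new one keeps the behaviour identical for existing callers while the compiler warns about the deprecated name.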
`HttpOpener`, aka `decode-html` when speaking `flux`, is not able to decode `gzip`ed data atm. This was discovered by @TobiasNx failing to look up schema.org.
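One possible way to handle this, sketched with plain JDK classes: wrap the response body in a `GZIPInputStream` when the `Content-Encoding` header says `gzip`, then decode the bytes with the charset. `GzipAwareReader` and its methods are hypothetical names, and how this would be wired into `HttpOpener` is left open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not metafacture code.
final class GzipAwareReader {

    private GzipAwareReader() {
    }

    // Undoes the transfer coding first (if any), then applies the charset.
    static Reader open(final InputStream body, final String contentEncoding,
            final Charset charset) {
        try {
            final InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                    ? new GZIPInputStream(body) : body;
            return new InputStreamReader(in, charset);
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole reader into a string.
    static String readAll(final Reader reader) {
        final StringBuilder text = new StringBuilder();
        final char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // Only used to simulate a gzip-compressed response body.
    static byte[] gzip(final String text, final Charset charset) {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(charset));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(final String[] args) {
        final byte[] body = gzip("Hello, schema.org", StandardCharsets.UTF_8);
        final Reader reader = open(new ByteArrayInputStream(body), "gzip",
                StandardCharsets.UTF_8);
        System.out.println(readAll(reader));
    }
}
```

In a real request the `contentEncoding` argument would come from something like `HttpURLConnection.getContentEncoding()`, and the client would normally also send `Accept-Encoding: gzip`.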