Expected Behavior
When you POST a request with only the url parameter, the response is UTF-8 friendly.
When I use the html parameter, the response should be UTF-8 friendly too.
The API should return a title like this: "Le démantèlement des réacteurs nucléaires, véritable filière industrielle"
And content like this:
... <p><strong>Dans les prochaines années, avec la transition énergétique et le démantèlement ...
Current Behavior
Title returned: "Le d�mant�lement des r�acteurs nucl�aires, v�ritable fili�re industrielle"
Content returned:
...<p><strong>Dans les prochaines ann**�**es, avec la transition �nerg�tique et le d�mant�lement ...
Steps to Reproduce
I send a POST request to the parse-html endpoint with this body:
{ "url": "https://www.europeanscientist.com/fr/energie/demantelement-reacteurs-nucleaires-dechets-pngmdr/", "html" : [copy_paste_of_html_code] }
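For anyone trying to reproduce this, here is a minimal sketch of how the request can be built in Node.js. The helper name buildParseHtmlRequest and the shortened sample HTML are my own placeholders, not part of the API, and the endpoint URL is deliberately left out since it depends on your deployment.

```javascript
// Hypothetical helper (not part of the API): builds fetch() options
// for a POST to the parse-html endpoint.
function buildParseHtmlRequest(pageUrl, html) {
  // JSON.stringify keeps accented characters as-is; it does not escape them.
  const body = JSON.stringify({ url: pageUrl, html });
  return {
    method: "POST",
    headers: { "Content-Type": "application/json; charset=utf-8" },
    body,
  };
}

// Shortened stand-in for the pasted page source.
const sampleHtml =
  "<html><head><title>Le démantèlement des réacteurs nucléaires</title></head><body></body></html>";

const request = buildParseHtmlRequest(
  "https://www.europeanscientist.com/fr/energie/demantelement-reacteurs-nucleaires-dechets-pngmdr/",
  sampleHtml
);

// The accents are still intact in the request body at this point.
console.log(request.body.includes("démantèlement"));

// To send it (endpoint is whatever URL your parse-html route lives at):
//   const response = await fetch(endpoint, request);
```

Since fetch() encodes a string body as UTF-8, the accents should leave the client intact, which would point to the server side as the place where they get lost.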
Possible Solution
I tried forcing the request's Content-Type header to UTF-8 with application/json; charset=utf-8, but it doesn't change the result.
While running this request locally, I got an iconv-lite deprecation warning related to encoding:
Iconv-lite warning: decode()-ing strings is deprecated. Refer to https://github.com/ashtuchkin/iconv-lite/wiki/Use-Buffers-when-decoding
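One possible explanation, which I have not verified against the parser's code: if the html value (already a JavaScript string) is narrowed to one byte per character and those bytes are then decoded as UTF-8, every accented character becomes �. The sketch below uses only Node's built-in Buffer to mimic that assumed path; it does not call iconv-lite itself.

```javascript
const title =
  "Le démantèlement des réacteurs nucléaires, véritable filière industrielle";

// Assumed failure mode: the string is narrowed to one byte per character
// (latin1, a.k.a. "binary"), so "é" becomes the single byte 0xE9 ...
const narrowedBytes = Buffer.from(title, "latin1");
// ... and 0xE9 on its own is not valid UTF-8, so it decodes to U+FFFD.
const garbled = narrowedBytes.toString("utf8");

// Buffer path: real UTF-8 bytes decode back to the original text.
const utf8Bytes = Buffer.from(title, "utf8");
const intact = utf8Bytes.toString("utf8");

console.log(garbled);
console.log(intact);
```

The garbled output has the same pattern as the title in Current Behavior, one � per accented letter, which is why I suspect the string is being decoded a second time somewhere.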