osscameroon / js-generator

Generates JavaScript from HTML
https://osscameroon.github.io/js-generator/
MIT License
19 stars 14 forks

Character Encoding issue #238

Open FanJups opened 1 year ago

FanJups commented 1 year ago

While working on PR #237 to solve #217, I used the symbol "©" inside my HTML input file, and one test failed because of this character. That got me thinking about a way to solve the issue.

We should find a way to apply UTF-8 encoding to the InputStream at the core level, not only in the tests. Right now the encoding is set only when running tests; we should correct that.
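To make the idea concrete, here is a minimal sketch of decoding an InputStream with an explicit charset instead of relying on the platform default. The class name `Utf8InputStreamSketch` and the helper `readUtf8` are hypothetical and not part of js-generator; the sketch also assumes Java 9+ for `InputStream.readAllBytes()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8InputStreamSketch {

    // Hypothetical helper: decodes the whole stream as UTF-8,
    // independent of the JVM's default charset.
    static String readUtf8(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u00A9" is the copyright sign; the escape avoids depending
        // on the encoding of this source file.
        byte[] htmlBytes = "<p>\u00A9 2023</p>".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(htmlBytes)) {
            String html = readUtf8(in);
            System.out.println(html.contains("\u00A9")); // character survived the round trip
        }
    }
}
```

If the core code did the decoding this way, the tests would no longer need to set the encoding themselves.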

https://stackoverflow.com/questions/5928046/spring-mvc-utf-8-encoding

https://stackoverflow.com/questions/5649329/utf-8-encoding-problem-in-spring-mvc

https://stackoverflow.com/questions/29434896/how-to-deal-with-java-encoding-problems-especially-xml

https://www.baeldung.com/java-char-encoding

Does multipart/form-data handle UTF-8 encoding?

https://www.google.com/search?q=DOES+%22multipart%2Fform-data+handle+utf8+encoding+%3F&rlz=1C1FCXM_pt-PTPT1032PT1032&oq=DOES+%22multipart%2Fform-data+handle+utf8+encoding+%3F&aqs=chrome..69i57j33i22i29i30.19637j1j7&sourceid=chrome&ie=UTF-8

https://issues.redhat.com/browse/RESTEASY-390

https://stackoverflow.com/questions/546365/utf-8-text-is-garbled-when-form-is-posted-as-multipart-form-data

java define encoding of InputStream

https://www.google.com/search?q=java+define+encoding+of+InputStream&rlz=1C1FCXM_pt-PTPT1032PT1032&oq=java+define+encoding+of+InputStream&aqs=chrome..69i57j33i160l2j33i22i29i30l4.19441j1j7&sourceid=chrome&ie=UTF-8

https://stackoverflow.com/questions/3043710/java-inputstream-encoding-charset

FanJups commented 1 year ago

set the charset of bytes java

https://www.google.com/search?q=set+the+charset+of+bytes+java&rlz=1C1FCXM_pt-PTPT1032PT1032&oq=set+the+charset+of+bytes+java&aqs=chrome..69i57j33i160j33i22i29i30.22275j1j7&sourceid=chrome&ie=UTF-8

https://stackoverflow.com/questions/88838/how-to-convert-strings-to-and-from-utf8-byte-arrays-in-java

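import java.nio.charset.StandardCharsets;

// Convert from String to byte[], encoding explicitly as UTF-8:
String text = "some text here";
byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

// Convert from byte[] to String; the charset must match the one used to encode the bytes:
byte[] asciiBytes = {(byte) 99, (byte) 97, (byte) 116}; // "cat" in ASCII
String decoded = new String(asciiBytes, StandardCharsets.US_ASCII);
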
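
Applied to the "©" case from this issue, the same conversion shows why a mismatched charset breaks the test. This is only an illustrative sketch; the class name `EncodingMismatchSketch` and the helper `decode` are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchSketch {

    // Hypothetical helper: decodes bytes with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The copyright sign is a single char but two bytes in UTF-8 (0xC2 0xA9).
        byte[] utf8 = "\u00A9".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);

        // Decoding with the charset used for encoding restores the character.
        String ok = decode(utf8, StandardCharsets.UTF_8);
        System.out.println(ok.equals("\u00A9"));

        // Decoding the same bytes as ISO-8859-1 yields two characters
        // (U+00C2 followed by U+00A9), which is the typical garbled result.
        String garbled = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());
    }
}
```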
FanJups commented 1 year ago

This issue is also related to non-ASCII characters:

https://www.baeldung.com/java-char-encoding

https://stackoverflow.com/questions/41690641/non-ascii-value-symbols-not-getting-printed

https://docs.oracle.com/javase/8/docs/api/java/text/Normalizer.html

https://www.tabnine.com/code/java/classes/java.text.Normalizer

https://www.educative.io/answers/what-is-stringutilsisasciiprintable-in-java

https://stackoverflow.com/questions/30111273/how-do-i-remove-copyright-and-other-non-ascii-characters-from-my-java-string

https://github.com/google/guava/wiki/StringsExplained

https://stackoverflow.com/questions/54752377/handling-strings-with-special-characters-in-java
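
Several of these links deal with detecting or normalizing non-ASCII characters. As a rough sketch of what that looks like with only the JDK (the class `NonAsciiSketch` and its methods are hypothetical names, and this is a diagnostic aid rather than a proposed fix, since the generator should preserve characters like "©"):

```java
import java.text.Normalizer;

class NonAsciiSketch {

    // True if any UTF-16 code unit falls outside the 7-bit ASCII range.
    static boolean containsNonAscii(String s) {
        return s.chars().anyMatch(c -> c > 127);
    }

    // Decomposes accented letters (NFD) and drops the combining marks,
    // so "caf\u00E9" becomes "cafe". Symbols such as the copyright sign
    // have no decomposition and are left unchanged.
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(containsNonAscii("\u00A9 2023"));
        System.out.println(containsNonAscii("plain text"));
        System.out.println(stripAccents("caf\u00E9"));
    }
}
```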