spakin / SimpInkScr

Simple Inkscape Scripting
https://inkscape.org/~pakin/%E2%98%85simple-inkscape-scripting
GNU General Public License v3.0

UTF-8 characters in the script will not work. #75

Closed. capvor closed this issue 1 year ago

capvor commented 1 year ago

A script that contains UTF-8 characters reports an error, even if those characters appear only in a comment.


capvor commented 1 year ago

This happens because Simple Inkscape Scripting does not specify an encoding when it opens the script file. When no encoding is given, Python's open function falls back to the system's default encoding. UTF-8 works across languages and is widely used among programmers, so I recommend opening files with UTF-8 encoding.
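To illustrate the fallback described above, here is a minimal sketch. It is not the extension's actual code; the script text and file name are made up for the example. It shows which encoding open would use on the current system when none is given, and that passing encoding="utf-8" explicitly round-trips the text regardless of that default.

```python
import locale
import os
import tempfile

# Hypothetical script text; the non-ASCII characters sit only in a comment.
source = "# café ★\nprint('hello')\n"

# The encoding that open() falls back to when none is given.
# It depends on the system and is often not UTF-8 on Windows.
default_encoding = locale.getpreferredencoding(False)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "script.py")  # hypothetical file name

    # Save the script as UTF-8, as most editors do.
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)

    # Read it back with an explicit encoding instead of the system default.
    with open(path, encoding="utf-8") as f:
        utf8_text = f.read()

print("system default encoding:", default_encoding)
print("explicit UTF-8 read matches the original:", utf8_text == source)
```

If the reported default is something other than UTF-8, reading the same file without the encoding argument can fail or produce garbled text.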


spakin commented 1 year ago

I worry that assuming all input files are UTF-8 will fix your use case but break someone else's. One thing I can do, however, is add a field to the dialog box that lets the user specify explicitly the input-file encoding. That should make everyone happy.

As I don't have convenient access to a Windows system, I could use your help with one thing: Does the Python Code field in Simple Inkscape Scripting's dialog box work as expected? I'd like to know whether I need to re-encode directly entered Python code somehow or whether it suffices to specify the encoding only when reading from a file.

capvor commented 1 year ago

Thanks for your reply. I think it's good to use UTF-8 encoding for the code entered directly in the dialog box.

For a Python source file, I think it would be better to decode the source according to the special encoding comment line at the top of the file. After all, this convention is officially specified by Python, and everyone should follow it.
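As a sketch of how the encoding comment could be honored, Python's standard tokenize module can detect the declared encoding and open the file with it. This only illustrates the idea and is not what Simple Inkscape Scripting does; the file name and script text are hypothetical.

```python
import os
import tempfile
import tokenize

# Hypothetical script that declares its own encoding on the first line.
source = "# -*- coding: latin-1 -*-\n# café\nprint('ok')\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "declared.py")  # hypothetical file name

    # Store the script in the encoding it declares.
    with open(path, "wb") as f:
        f.write(source.encode("latin-1"))

    # detect_encoding() reads the encoding comment (or a UTF-8 BOM) and
    # falls back to UTF-8 when neither is present.
    with open(path, "rb") as f:
        declared, _ = tokenize.detect_encoding(f.readline)

    # tokenize.open() opens the file as text using the detected encoding.
    with tokenize.open(path) as f:
        text = f.read()

print("declared encoding:", declared)
print("decoded text matches the original:", text == source)
```

One consequence of this approach is that a file without an encoding comment is read as UTF-8, so a script saved in a legacy system encoding would need to declare it.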

spakin commented 1 year ago

The special comment line sounds like a good idea, but I can't easily test whether or not it's required. I think specifying a non-default encoding is critical for scripts read from a file but useless for scripts entered directly into the dialog box. From what I can tell, the latter appear to be converted automatically to UTF-8.

I made a temporary branch of Simple Inkscape Scripting called "encoding". Could you please switch to that and test your UTF-8-encoded script on Windows

  1. as an external Python file using the default encoding (hypothesis: failure),
  2. as an external Python file selecting UTF-8 encoding (hypothesis: success),
  3. as directly entered code using the default encoding (hypothesis: success), and
  4. as directly entered code selecting UTF-8 encoding (hypothesis: success)?

I'm eager to learn what you observe.

capvor commented 1 year ago

Yes, I have tested the following cases:

  1. an external Python file encoded in UTF-8, using the default encoding (test result: failure),
  2. an external Python file encoded in UTF-8, selecting UTF-8 encoding (test result: success),
  3. directly entered code using the default encoding (test result: success),
  4. directly entered code selecting UTF-8 encoding (test result: success),
  5. an external Python file saved in my OS's default encoding, using the default encoding (test result: success), and
  6. an external Python file saved in my OS's default encoding, selecting UTF-8 encoding (test result: failure).

All of the test results are as expected.
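The four external-file cases (1, 2, 5, and 6) can be approximated without Inkscape. The sketch below makes two assumptions: cp1252 stands in for the OS default encoding, which the thread does not name, and a read counts as a failure if it either raises an error or yields text that differs from the original.

```python
# ASSUMPTION: cp1252 stands in for a non-UTF-8 system default encoding.
SYSTEM_DEFAULT = "cp1252"

# Hypothetical script text with a non-ASCII character in a comment.
source = "# café\nprint('hello')\n"


def decodes_correctly(data: bytes, encoding: str) -> bool:
    """Return True only if the bytes decode to exactly the original text."""
    try:
        return data.decode(encoding) == source
    except UnicodeDecodeError:
        return False


# Try every combination of "encoding the file was saved in" and
# "encoding selected when reading it".
results = {
    (file_enc, read_enc): decodes_correctly(source.encode(file_enc), read_enc)
    for file_enc in ("utf-8", SYSTEM_DEFAULT)
    for read_enc in ("utf-8", SYSTEM_DEFAULT)
}

for (file_enc, read_enc), ok in results.items():
    outcome = "success" if ok else "failure"
    print(f"saved as {file_enc}, read as {read_enc}: {outcome}")
```

Only the combinations where the selected encoding matches the encoding the file was saved in succeed, which agrees with the results reported for cases 1, 2, 5, and 6.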

spakin commented 1 year ago

Excellent! Thanks for all the testing. I'll soon merge the encoding branch into master, push that out to GitHub, and close this issue.