stlehmann / pyads

Python wrapper for TwinCAT ADS
MIT License
252 stars 93 forks source link

Use cp1252 as default encoding for Strings #300

Open hkalteBr opened 2 years ago

hkalteBr commented 2 years ago

For Strings with special characters the length needs to be of the utf-8 encoded value or else there's an 1797 ADS Error.

stlehmann commented 2 years ago

For Strings with special characters the length needs to be of the utf-8 encoded value or else there's an 1797 ADS Error.

@hkalteBr would you mind bringing an example for this? As strings are 1-Byte characters there should be the same length for type string and type byte.

stlehmann commented 2 years ago

grafik

Alright, I see it. Still this is something that actually shouldn't work because TwinCAT strings are only one byte long and you couldn't write a non-ASCII character to a string variable. You need to use WSTRING datatype for this.

hkalteBr commented 2 years ago

I'm using the attribute utf-8 encoding for some strings: {attribute 'TcEncoding':='UTF-8'} So it's a String datatype do you have another workaround?

stlehmann commented 2 years ago

This is a hard one. Honestly, I didn't know about this attribute. It will be hard to read values encoded this way because it is not determined which size a character has. Also we don't know the encoding on the client side.

hkalteBr commented 2 years ago

I've been experimenting now a little bit with the string encoding in Twincat. According to the infosys it's an cp1252 encoding for the normal string. Shouldn't we integrate that one instead of the utf-8? I tried it with the write_by_name function (adsSyncWriteReqEx) and now you can write with ASCII 256 and the plc shows it the right way. Encoding_cp1252

It doesn't solve my problem with the utf-8 encoded Strings but at least you can write and read special characters with the normal strings.

stlehmann commented 2 years ago

Thanks for looking into this. I remember having the encoding issue with symbol comments. I think we already changed the encoding to cp1252 for them. But we use utf-8 for the normal read/write operations at the moment. Would you mind sharing the link to the documentation?

hkalteBr commented 2 years ago

https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_plc_intro/2529327243.html&id= I only found it for Twincat 3 but i guess it's the same in Twincat 2. Did you have a reason for using utf-8 for the normal read/write operations?

stlehmann commented 2 years ago

My initial guess is that no-one really thought about the encoding too much at that time. To address this issue I suppose we could add an attribute to the read/write functions for string_encoding. Alternatively this attribute could be set in the Connection object.

stlehmann commented 2 years ago

It is good practice to open an issue before opening a PR so I created #301 on this topic where we can further discuss the issue.