Closed orkhands closed 2 years ago
Have you tried replacing .decode("utf8")
with .decode("latin1")
instead of your fix?
During the refactoring for version 1.0 I found a similar problem. After some research I determined that the controls probably use Latin-1 (ISO 8859-1) encoding. If latin1 works in your case it would make the fix easier as it wouldn't need the switch you introduced.
@drunsinn For sure changing the encoding would solve my specific problem, I was just thinking of general usability of the library for other users as well. Other users experiencing the same issue might benefit from the possibility to set their own encoding and/or error handling. As well, it would be charming if I could still use the pip package instead of maintaining my own fork ;) Of course, one could also solve this issue differently by adding conf file or something of this sort (again increasing complexity, I understand...)
@orkhands the problem with the encoding is with the incoming data from the control. Your application should not need to select the encoding at all. Strings in Python3 are always utf8 if not specified otherwise.
As far as I understand the protocol now there are no two-byte characters, that's why I assumed lsv2 would be ascii. Instead it uses latin1 which makes sense since Heidenhain is a german company. Some tests also support this assumption since I could not get any other encodings to work reliably.
For 1.0 the encoding/decoding of strings has already been switched to latin1, for the current release I didn't want to back port these changes. Since you experience this problem now and waiting for 1.0 would take to long changing the encoding to latin1 is the easiest way.
That's great news, thank you for your work and the 1.0 prospect :) I will proceed as proposed.
If you are interested you can take a look at the branch for 1.0: https://github.com/drunsinn/pyLSV2/tree/release_v1.0
Some adjustments to your code will be necessary but I feel they make the library a lot more consistent.
I experienced an issue when polling the program status using
get_program_stack()
, which would throw aUnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position xy: invalid continuation byte
Reason in my case is that the machine program running has a german "Umlaut" in its file name (e.g., TNC:\X\Y\Z\xänderung.h', which does not seem to be encoded correctly. Since the error is thrown within the pyLSV2-lib, I can't just ignore the error and continue, since the rest of the queried information from
get_program_stack()
is then gone, too.To solve this issue, one can either specify the desired encoding in the
decode()
method, or one sets the error handling of the same method to either ignore undecodable bytes or replacing them. It seems both approaches can be useful for the user, so I suggest to provide some parameters for the user to set both options.While the library decodes many values from the machine, it seems, that only in
get_program_stack()
values are transmitted which originate from the user instead of the machine itself. Thus, I would expect that we do not see these issues in any other part of the code. Yet, I might be wrong here and having these options would be a good idea at other places as well.Changes made in
client.py
:__init__
allowing a user to set the strategy for handling encoding errors. Default value corresponds to default value of encoding indecode()
method (encoding = "utf-8")__init__
allowing a user to set the desired encoding to be used. Default value corresponds to default value oferrors
in decode() method (errors="strict")get_program_stack()