sassoftware / pyviyatools

Python command-line tools that call the SAS Viya REST APIs - for SAS administrators.
Apache License 2.0

creategroups.py produces error (UnicodeEncodeError: 'latin-1' codec can't encode character '\ufeff' in position 8) #143

Closed tomstarr closed 1 year ago

tomstarr commented 1 year ago

Hi guys, I'm back to working with pyviyatools in anger as I work to automate the implementation of Auth Models from a spreadsheet. Today I've come across a couple of issues with the creategroups.py script and I'm hoping you have a bit of bandwidth to make a couple of tweaks.

  1. When I try to import a simple CSV file (using python3 - possibly relevant?) then I encounter the issue:

Traceback (most recent call last):
  File "./creategroups.py", line 109, in <module>
    myresult=callrestapi(reqval,reqtype,data=data,stoponerror=0)
  File "/opt/sas/viya4/pyviyatools/sharedfunctions.py", line 135, in callrestapi
    ret = requests.post(baseurl+reqval,headers=head,data=json_data)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 116, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1301, in _send_request
    body = _encode(body, 'body')
  File "/usr/lib/python3.8/http/client.py", line 164, in _encode
    raise UnicodeEncodeError(
UnicodeEncodeError: 'latin-1' codec can't encode character '\ufeff' in position 8: Body ('\ufeff') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.

After a bit of research, I've found that this can be quickly remedied by changing line 79 of creategroups.py from:

    with open(file, 'rt') as f:

to:

    with open(file, 'rt', encoding='utf-8-sig') as f:

I'm not sure what other effects this has on the use of this script, but could you either implement this change or add a parameter that allows the setting of an encoding value when an import is run please?
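For anyone hitting the same thing, here is a minimal sketch of why the 'utf-8-sig' change helps. It is not the code in creategroups.py; `read_csv_rows` and the sample row are hypothetical and exist only to show the difference between the two codecs. A CSV saved with a UTF-8 byte order mark keeps '\ufeff' at the start of its first field when read as plain 'utf-8', whereas 'utf-8-sig' drops the mark and reads files without one unchanged.

```python
import csv
import os
import tempfile


def read_csv_rows(path, encoding="utf-8-sig"):
    """Return all rows of a CSV file as lists of strings.

    Hypothetical helper. 'utf-8-sig' decodes UTF-8 and drops a leading
    byte order mark (BOM) if present; files without a BOM read unchanged.
    """
    with open(path, "rt", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# Write a small demo CSV that starts with a UTF-8 BOM (EF BB BF).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfgrp1,Group One,A test group\r\n")
    demo_path = tmp.name

rows_plain = read_csv_rows(demo_path, encoding="utf-8")    # BOM survives as '\ufeff'
rows_sig = read_csv_rows(demo_path, encoding="utf-8-sig")  # BOM is stripped
os.remove(demo_path)

print(repr(rows_plain[0][0]))
print(repr(rows_sig[0][0]))
```

Making the encoding a parameter with 'utf-8-sig' as the default, as in the sketch, would cover both requests: files exported with a BOM work out of the box, and other encodings can still be chosen explicitly.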

Whilst you're reviewing that script, I've also noted two other things that it'd be great if you could look into catching.

  2. An error is produced when a CSV file being imported contains an empty 'member id' field (column 4). For example, this row imports without error:

     "persona_platformadm_prd","Persona: PROD Platform Admin",""

     but this row appears to work (the group is created) and then produces a very confusing "http 405" error:

     "persona_platformadm_prd","Persona: PROD Platform Admin","",""

ERROR:
Note: Trying to creating Group: Persona: PROD Platform Admin
Note: Group: Persona: PROD Platform Admin created
Note: Trying to add user to group Persona: PROD Platform Admin
http response code: 405
ret.text: {"errorCode":0,"message":"The HTTP method \"PUT\" is not supported.","details":["traceId: 970229be1a574105","path: /identities/groups/persona_platformadm_prd/userMembers/"],"remediation":"Specify one of the following HTTP methods: \"GET\".","links":[],"version":2,"httpStatusCode":405}

  3. Each row of a CSV file seems to need at least 3 columns, otherwise an 'out of range' error is raised. Could this error be caught and a friendlier message shown? Perhaps the file scan could check that every row has at least 3 columns.

  4. Finally, not an issue but an enhancement request. Could you add an option to skip the first row (header) of a CSV file when the script is run, please? For now, I've automated the export of CSV files without a header to work around this, but I'd prefer to keep the header row if creategroups.py supported that CSV format.
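To make the three requests above concrete, here is a rough sketch of the kind of row checking I have in mind. None of this is existing pyviyatools code: `parse_group_rows`, its `skip_header` and `min_columns` parameters, and the column names are all hypothetical (I'm assuming the third column is a description). It skips an optional header row, reports rows with fewer than 3 columns instead of raising an index error, and treats a blank column 4 as "no member" so that no membership call is attempted.

```python
import csv
import io

MIN_COLUMNS = 3  # assumed layout: id, name, description; column 4 (member id) is optional


def parse_group_rows(lines, skip_header=False, min_columns=MIN_COLUMNS):
    """Validate CSV rows and return (groups, problems).

    groups   -- list of dicts with keys id, name, description, member;
                member is None when column 4 is missing or blank
    problems -- readable messages for rows that were rejected
    """
    groups, problems = [], []
    reader = csv.reader(lines)
    for index, row in enumerate(reader):
        if skip_header and index == 0:
            continue  # first row is a header, not a group
        if not row:
            continue  # ignore completely blank lines
        if len(row) < min_columns:
            problems.append(
                f"line {reader.line_num}: expected at least "
                f"{min_columns} columns, found {len(row)}"
            )
            continue
        member = row[3].strip() if len(row) > 3 else ""
        groups.append(
            {
                "id": row[0],
                "name": row[1],
                "description": row[2],
                "member": member or None,
            }
        )
    return groups, problems


# Demo input: a header, a row with an empty member id, and a short row.
sample = io.StringIO(
    "id,name,description,member\n"
    '"persona_platformadm_prd","Persona: PROD Platform Admin","",""\n'
    '"too_short","Only two columns"\n'
)
groups, problems = parse_group_rows(sample, skip_header=True)

for group in groups:
    if group["member"] is None:
        print(f"Note: no member id for {group['id']}, skipping membership step")
for problem in problems:
    print(f"Skipped {problem}")
```

With a check like this, the group in the second row would still be created, but the call to the userMembers endpoint would only be made when a member id is actually present.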

Thanks heaps!

tomstarr commented 1 year ago

Since posting this comment, I've also found that another Unicode error is encountered when certain characters (e.g. an en dash "–") are included in a CSV's 'group name' column. This occurs even with the 'utf-8-sig' change that I've requested above. I guess there must be something else that's handling the CSV data with Latin-1 encoding too. Interestingly, these characters are accepted in the 'id' column.

Any ideas where this could be hitting its encoding issue?

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2013' in position 72: Body ('–') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
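Reading the traceback again, the failure is in http.client encoding the request body rather than in reading the file: a str body is encoded as Latin-1, and the error message itself suggests sending UTF-8 bytes instead. The following is only a sketch of that idea under my own assumptions, not the change that was eventually made in sharedfunctions.py; `encode_json_body` and the sample group name are hypothetical.

```python
import json


def encode_json_body(payload):
    """Serialise payload to JSON and return it as UTF-8 encoded bytes.

    Hypothetical helper: a bytes body is sent as-is, so the Latin-1
    encoding step that http.client applies to str bodies is avoided.
    """
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


# Sample payload with an en dash (U+2013) in the group name.
group = {"id": "persona_platformadm_prd", "name": "Persona \u2013 PROD Platform Admin"}

# A str body containing U+2013 cannot be encoded as Latin-1,
# which is the step that fails in the traceback.
text_body = json.dumps(group, ensure_ascii=False)
try:
    text_body.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

byte_body = encode_json_body(group)

print("Latin-1 encodable:", latin1_ok)
print("UTF-8 round trip ok:", json.loads(byte_body.decode("utf-8")) == group)
```

If this is the cause, passing the bytes to requests.post as `data=` (or passing the dict through the `json=` argument, which as far as I know encodes to UTF-8) should avoid the error, probably along with a charset on the Content-Type header. I haven't verified how the server reacts to either.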

gerrynelson63 commented 1 year ago

@tomstarr I will take a look. We have had encoding issues with other tools.

tomstarr commented 1 year ago

Resolution merged into master in #145, closing issue.