fastai / ghapi

A delightful and complete interface to GitHub's amazing API
https://ghapi.fast.ai/
Apache License 2.0

UnicodeDecodeError on actions.download_artifact #22

Closed: YannickJadoul closed this issue 3 years ago

YannickJadoul commented 3 years ago

I might be missing something obvious, but I'm getting a UnicodeDecodeError when trying to download an artifact:

>>> api.actions.download_artifact("YannickJadoul", "Parselmouth", 28315202, "zip")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/yannick/.local/lib/python3.6/site-packages/ghapi/core.py", line 60, in __call__
    return self.client(self.path, self.verb, headers=headers, route=route_p, query=query_p, data=data_p)
  File "/home/yannick/.local/lib/python3.6/site-packages/ghapi/core.py", line 104, in __call__
    route=route or None, query=query or None, data=data or None)
  File "/home/yannick/.local/lib/python3.6/site-packages/fastcore/net.py", line 175, in urlsend
    return urlread(req, return_json=return_json, return_headers=return_headers)
  File "/home/yannick/.local/lib/python3.6/site-packages/fastcore/net.py", line 115, in urlread
    if decode: res = res.decode()
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 68-69: invalid continuation byte

It seems like the downloaded bytes are being decoded to a Unicode string? I hope I didn't miss any advice in the docs on how to disable this.

The issue doesn't seem to be happening for e.g. git.get_blob, because the response there is base64-encoded, while GhApi.get_content does return some bytes (though it's not possible to download artifacts this way, as far as I know).
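
A minimal illustration of why the decode step fails on an artifact download, assuming the response body is a zip archive (as requested with "zip" above). The helper name `is_utf8_text` and the sample byte strings are made up for this sketch and are not part of ghapi or fastcore:

```python
def is_utf8_text(payload: bytes) -> bool:
    """Return True if the payload can be decoded as UTF-8 text."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample data: the zip magic number followed by bytes
# (0xff, 0xfe) that can never appear in valid UTF-8.
zip_like = b"PK\x03\x04\x14\x00\xff\xfe"
json_like = b'{"ok": true}'

print(is_utf8_text(zip_like))   # False
print(is_utf8_text(json_like))  # True
```

A JSON response survives `.decode()`, while arbitrary binary data such as a compressed archive generally does not.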

pradeepbbl commented 3 years ago

I had the same issue +1

dirkcgrunwald commented 3 years ago

I'm having the same problem. The issue is not in ghapi, but rather in fastcore's urlsend:

~/opt/anaconda3/lib/python3.8/site-packages/ghapi/core.py in __call__(self, path, verb, headers, route, query, data)
    102         headers = {**self.headers,**(headers or {})}
    103         if path[:7] not in ('http://','https:/'): path = GH_HOST+path
--> 104         res,self.recv_hdrs = urlsend(path, verb, headers=headers or None, debug=self.debug, return_headers=True,
    105                                      route=route or None, query=query or None, data=data or None)
    106         if 'X-RateLimit-Remaining' in self.recv_hdrs:

If you look at https://github.com/fastai/fastcore/blob/master/fastcore/net.py#L170 you can see that urlsend has a return_json flag. However, urlread has a separate decode flag that we need to be able to pass through (see https://github.com/fastai/fastcore/blob/26a818f505d1c343b526c38f78ead6423363f5a7/fastcore/net.py#L107 ).

This means that changes would be needed in the fastai/fastcore library. This is a fairly simple change. Is it possible to get a maintainer to do this? I will file an issue there.
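
The shape of such a change can be sketched with stand-in functions. The names `read_response` and `send` below are hypothetical and only mimic the roles of urlread and urlsend; they are not fastcore's actual code or signatures, and they take raw bytes instead of making a request. The point is simply that the outer helper accepts a `decode` flag and forwards it:

```python
import json


def read_response(raw: bytes, decode: bool = True, return_json: bool = False):
    """Stand-in for the inner helper: optionally decode, optionally parse JSON."""
    if not decode:
        return raw  # binary payloads (e.g. zip archives) pass through untouched
    text = raw.decode()
    return json.loads(text) if return_json else text


def send(raw: bytes, decode: bool = True, return_json: bool = False):
    """Stand-in for the outer helper: forwards `decode` instead of dropping it."""
    return read_response(raw, decode=decode, return_json=return_json)


# Binary data comes back as bytes when the caller opts out of decoding.
print(type(send(b"PK\x03\x04\xff", decode=False)))
# JSON responses keep working with the default decode=True.
print(send(b'{"a": 1}', return_json=True))
```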

There is a workaround: you can retrieve the archive_download_url from the artifacts list and then use that with urllib or requests. The following is an example of retrieving a file named in an artifact:

[image: screenshot of the example, not preserved]
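
A minimal sketch of that workaround using only the standard library. This is not the commenter's original code. The function names `fetch_artifact_zip` and `read_member` are made up for illustration, and the `token` argument is assumed to be a GitHub token with access to the repository:

```python
import io
import urllib.request
import zipfile


def fetch_artifact_zip(archive_download_url: str, token: str) -> bytes:
    """Download an artifact archive and return the raw, undecoded bytes."""
    req = urllib.request.Request(archive_download_url)
    # Unredirected headers are sent with the first request only, so the token
    # is not forwarded if the API redirects to a separate storage host.
    req.add_unredirected_header("Authorization", f"token {token}")
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # keep as bytes; never call .decode() on a zip


def read_member(zip_bytes: bytes, member: str) -> bytes:
    """Return the contents of one named file inside an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(member)


# Usage (not run here; assumes `api` is an authenticated GhApi instance and
# that the artifact listing exposes `archive_download_url` as described above):
#   artifacts = api.actions.list_artifacts_for_repo("YannickJadoul", "Parselmouth")
#   url = artifacts.artifacts[0].archive_download_url
#   data = fetch_artifact_zip(url, token)
#   print(zipfile.ZipFile(io.BytesIO(data)).namelist())
```

Keeping the download and the zip handling in separate functions means the extraction step can be checked against an in-memory archive without touching the network.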

pradeepbbl commented 3 years ago

The issue with downloads is fixed now.

YannickJadoul commented 3 years ago

@pradeepbbl Thanks, I can confirm this works for me now! For anyone else reading this: the main thing is to not just run pip install --upgrade ghapi, but to also do the same for fastcore.

Thank you very much for resolving this; I'll close the issue now!