Open pdewilde opened 9 months ago
It seems we hardcoded the GCS filename and therefore ignore the extension of the log file we download from GitHub.
This may be more complicated than I thought. We say we will accept "application/vnd.github+json", but unless we request the gzip encoding ourselves, my understanding is that the body should be transparently uncompressed by the HTTP client.
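To make the "transparently uncompressed" behavior concrete, here is a minimal Go sketch against a local test server. It assumes the code in question uses Go's standard net/http client, which this thread does not confirm. The helpers newGzipServer and fetch are hypothetical names for illustration, and the test server only stands in for the real API.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newGzipServer (hypothetical helper) returns a local test server that gzips
// its response whenever the request advertises gzip support.
func newGzipServer(payload string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, payload)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		zw := gzip.NewWriter(w)
		defer zw.Close()
		io.WriteString(zw, payload)
	}))
}

// fetch (hypothetical helper) performs a GET and returns the body bytes plus
// resp.Uncompressed, which reports whether the transport decoded gzip itself.
// When explicitGzip is true the caller sets Accept-Encoding, and the standard
// transport then leaves the body compressed.
func fetch(url string, explicitGzip bool) ([]byte, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	if explicitGzip {
		req.Header.Set("Accept-Encoding", "gzip")
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return body, resp.Uncompressed, err
}

func main() {
	srv := newGzipServer("hello logs")
	defer srv.Close()

	for _, explicit := range []bool{false, true} {
		body, uncompressed, err := fetch(srv.URL, explicit)
		if err != nil {
			panic(err)
		}
		fmt.Printf("explicit=%v uncompressed=%v bytes=%d\n", explicit, uncompressed, len(body))
	}
}
```

If the real download path sets Accept-Encoding by hand, or the server compresses the payload at the content level rather than via Content-Encoding, the client would hand back compressed bytes. Logging resp.Uncompressed and the response headers on a real request would show which case applies.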
There are a few options I need to look into:
I'll need to get some GitHub credentials to reproduce the actual HTTP requests locally and figure out exactly what is going on.
https://superuser.blog/golang-http-gzip-compression
TODO: read that
GCS will apply gzip compression for transit if the client accepts it. Here's an example of writing a tgz object to GCS.
OK, then I suspect that it's the body from the GitHub API that is getting zipped but is not getting unzipped by the HTTP client for some reason. I'll have to take a closer look.
I wouldn't expect that, as the content type we said we accepted was a JSON type, not application/zip.
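One way to settle what the downloaded body actually is, independent of the headers, is to look at its leading magic bytes. The sketch below is a hypothetical helper, not code from this repository: archiveExtension is an illustrative name, and it assumes only gzip streams and non-empty zip archives need to be told apart.

```go
package main

import (
	"bytes"
	"fmt"
)

// Leading magic bytes for the two formats discussed in this issue.
var (
	gzipMagic = []byte{0x1f, 0x8b}
	// Local file header signature. An empty zip archive starts with a
	// different signature, which this sketch deliberately ignores.
	zipMagic = []byte{'P', 'K', 0x03, 0x04}
)

// archiveExtension (hypothetical helper) guesses a file extension from the
// first bytes of a payload. It returns "" when it recognizes neither format.
// A gzip stream is reported as ".gz" because telling ".tar.gz" apart from a
// plain ".gz" would require decompressing and inspecting the contents.
func archiveExtension(data []byte) string {
	switch {
	case bytes.HasPrefix(data, zipMagic):
		return ".zip"
	case bytes.HasPrefix(data, gzipMagic):
		return ".gz"
	default:
		return ""
	}
}

func main() {
	samples := map[string][]byte{
		"zip-like":  []byte("PK\x03\x04rest of archive"),
		"gzip-like": {0x1f, 0x8b, 0x08, 0x00},
		"json-like": []byte(`{"message":"not an archive"}`),
	}
	for name, data := range samples {
		fmt.Printf("%s -> %q\n", name, archiveExtension(data))
	}
}
```

Running a check like this on the first few bytes of a downloaded log would show whether the payload is a zip archive, a gzip stream, or neither, before deciding on the object name.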
It's pretty nuanced, but see https://cloud.google.com/storage/docs/transcoding. Content-Encoding is probably more relevant here. Similarly, Accept and Accept-Encoding.
TL;DR
Logs are being saved in GCS with the file extension .tar.gz, but the archives are actually zip files. The file extension should either be updated to .zip, or archives should be compressed to .tar.gz format and existing files should be re-compressed.
Observed behavior