Open arvidj opened 1 year ago
Thanks @arvidj it looks like you're still working on this so I'll just leave some questions here.
Exposing the media_type
looks right, up till now there wasn't a reason to.
Do you plan on filling in the remaining features for Job Artifacts
?
Have you tested the downloading cli using large artifacts? I didn't add a streaming response suitable for downloading binary artifacts so it might not be very efficient.
Thanks @arvidj it looks like you're still working on this so I'll just leave some questions here.
Hi, it should be mostly finished but I haven't reviewed the code myself yet.
Exposing the
media_type
looks right, up till now there wasn't a reason to.Do you plan on filling in the remaining features for
Job Artifacts
?
For the moment, I only needed those three endpoints so I exposed only them. I could do the rest if you prefer.
Have you tested the downloading cli using large artifacts? I didn't add a streaming response suitable for downloading binary artifacts so it might not be very efficient.
No, I haven't tried it on large artifacts. TBH, I have a very weak understanding of how to handle binary data efficiently, so I was happy to see that this "just worked" for the smaller files I tried. I'd be glad to do a more robust implementation but I would need some pointers to get started. I'm trying now to download the artifacts of:
https://gitlab.com/tezos/tezos/-/jobs/3330701870/artifacts/browse
A weird thing I'm noticing is that GitLab intermittently returns 401
although I have a token configured. And I verified that ocaml-gitlab sends along the headers. I'm writing it off as a API flakiness for now. Although it'd be nice to have better output in this case. A complication is that the GitLab is in XML in this case!
🤡 OCAMLRUNPARAM=b GITLAB_DEBUG=1 dune exec cli/main.exe -- artifact get --project-id 3836952 --job-id 3330701870
>>> GitLab: Requesting https://gitlab.com/api/v4/projects/3836952/jobs/3330701870/artifacts (headers: accept: application/octet-stream
Authorization: Bearer glpat-[OMITTED]
user-agent: ocaml-gitlab ocaml-cohttp/5.0.0
)
>>> GitLab: Response code 302 Found
>>> GitLab: Requesting https://cdn.artifacts.gitlab-static.net/da/18/da1836cc3c2c6f6645e6d2a8a03c4f819e8256dfce5953aee56bfede82db502c/2022_11_16/3330701870/3635947479/build-x86_64-yy-tezt-typechecking.zip?Expires=1668593099&KeyName=gprd-artifacts-cdn&Signature=zEpStiJpX2i3PZMwsZNg_HVsNq4= (headers: accept: application/octet-stream
Authorization: Bearer glpat-[OMITTED]
user-agent: ocaml-gitlab ocaml-cohttp/5.0.0
)
>>> GitLab: Response code 401 Unauthorized
>>> GitLab: response body:
<?xml version='1.0' encoding='UTF-8'?><Error><Code>AuthenticationRequired</Code><Message>Authentication required.</Message></Error>
lab: internal error, uncaught exception:
Yojson.Json_error("Line 1, bytes 0-33:\nExpected '{' but found '<?xml version='1.0' encoding='UTF'")
Raised at Yojson.json_error in file "common.ml", line 5, characters 19-39
Called from Yojson.Safe.read_lcurl in file "lib/read.ml" (inlined), line 2319, characters 3-42
Called from Gitlab_j.read_message in file "lib/gitlab_j.ml", line 40752, characters 4-31
Called from Gitlab_core.Make.API.handle_response.(fun) in file "lib/gitlab_core.ml", line 714, characters 30-61
Called from Lwt.Sequential_composition.bind.create_result_promise_and_callback_if_deferred.callback in file "src/core/lwt.ml", line 1860, characters 23-26
Re-raised at Lwt.Miscellaneous.poll in file "src/core/lwt.ml", line 3068, characters 20-29
Called from Lwt_main.run.run_loop in file "src/unix/lwt_main.ml", line 31, characters 10-20
Called from Lwt_main.run in file "src/unix/lwt_main.ml", line 118, characters 8-13
Re-raised at Lwt_main.run in file "src/unix/lwt_main.ml", line 124, characters 4-13
Called from Cmdliner_term.app.(fun) in file "cmdliner_term.ml", line 24, characters 19-24
Called from Cmdliner_eval.run_parser in file "cmdliner_eval.ml", line 34, characters 37-44
However, once the file downloaded, it was identical to the file downloaded by curl
. With curl, the download took ~20 seconds, and with lab
it took ~1 minute.
@tmcgilchrist any hints on how to move forward on this? Do you think I need to:
add a streaming response suitable for downloading binary artifacts
? I see that Cohttp_lwt.Body
exposes to_stream
so the issue would be just threading that through the API
module (and dealing with the semantic confusion of having two kinds of streams ;)).
It doesn't seem too hard change the module API
so that handler functions receive CLB.body
instead of the body string
directly. From there, we can choose to return a string Lwt.t
to the user, so that Job_artifacts.get : string Lwt_stream.t option Response.t Monad.t
(that signature is a mouthful). I read online though that Lwt_streams
are frowned upon. Another option is returning the CLB.write_body
function so that:
Job_artifacts.get : (string -> unit Lwt.t) option Response.t Monad.t
this way, the user is not exposed to any Lwt_stream.t
and can write the each chunk as they desire.
I'm not familiar with Lwt_stream
and the objections raised against it in https://github.com/ocsigen/lwt/issues/250 are academic to me. I'm fine either way. I'll make a draft returning a stream, it should be easy to change that part anyhow.
Thanks for the link to the Lwt_streams flaws thread, interesting read. My expectation for how Lwt_streams should work matches the second model from https://github.com/ocsigen/lwt/issues/250#issuecomment-222146474 so it's interesting that it doesn't necessarily hold.
I would expect the Lwt_stream usage in this library to involve one writer and one reader when downloading/uploading artefacts, so the objections probably don't apply. I think following a model where you use the CLB.to/from_stream
functions should work, though I haven't tried writing it myself.
Adds the
Job_artifacts
module and implements three endpoints from https://docs.gitlab.com/ee/api/job_artifacts.htmlUpdated the
lab
tool with corresponding commands.To do this, I had to expose the
media_type
argument ofAPI.get
.