mirage / ocaml-tar

Pure OCaml library to read and write tar files
ISC License
54 stars 34 forks source link

Version 3.0.0 does not support tar+gz for lwt #148

Open jonahbeckford opened 2 months ago

jonahbeckford commented 2 months ago

The API in https://github.com/mirage/ocaml-tar/blob/503cbeab646c42044de88adafb8d6f1e357441e4/unix/tar_lwt_unix.ml#L108-L113 is missing the (Tar_gz.in_gzipped (Tar.fold f init)) that would be required to read .tar.gz files.

There is probably somewhere else in the same file to add in Tar_gz.out_gzipped.

Or is there another way to do it? I can't find any examples or tests for this use case.

reynir commented 2 months ago

Indeed! The fold is only for plain tar archive. Your observation is correct that it could be implemented as:

(* open file as [fd] *)
Lwt.finalize
  (fun () -> run (Tar_gz.in_gzipped (Tar.fold f init)) fd)
  (fun () -> (* safely close [fd] *))

And a user of the library could very well do this themselves. I think you raise some good questions: should this be documented by examples? Or should we duplicate the functions to also work on gzipped archives? I'm undecided as we may implement other compressions. Maybe @dinosaure has an opinion.

Thanks for bringing it up!

hannesm commented 2 months ago

I would like to have the functions duplicated, since copy and pasting the same 3 lines of code to all clients isn't nice, esp. if we get to modify internals / the API ;)

jonahbeckford commented 2 months ago

And a user of the library could very well do this themselves.

No. The run function is not exported in the .mli so a user can't do this themselves.

Instead, a user (me) has to copy and paste almost two hundred (200) lines of unix/tar_lwt_unix.ml and modify the fold function to add gzip.

Here is my version that adds a polymorphic tag ?compression:

  let fold ?compression f filename init =
    let open Lwt_result.Infix in
    let fold () =
      match compression with
      | None -> Tar.fold f init
      | Some `Gzip -> Tar_gz.in_gzipped (Tar.fold f init)
    in
    safe Lwt_unix.(openfile filename [ O_RDONLY ]) 0 >>= fun fd ->
    Lwt.finalize (fun () -> run (fold ()) fd) (fun () -> safe_close fd)

I think there are four options:

  1. Add a ?(compression: [`Gzip | `Bzip]) parameter supplied with tags
  2. Add a ?(compression: _ Tag.t -> _ Tag.t) parameter supplied with a function (ex. ~compression:Tar_gz.in_gzipped)
  3. Duplicate all of the functions.
  4. Expose the run function.
reynir commented 1 month ago

I am releasing version 3.1.0 which exposes Tar_lwt_unix.run. I saw that the run function was exposed in Tar_unix and Tar_eio so clearly this was an unintended omission. I'll keep this issue open for now until we have implemented gzipped functions, too.

Sorry for the delay and I hope this will unblock you.