mirage / irmin

Irmin is a distributed database that follows the same design principles as Git
https://irmin.org
ISC License

irmin-git creates corrupt git repo #2349

Open intermet opened 2 weeks ago

intermet commented 2 weeks ago

With irmin 3.9.0, checkseum 5.1.0 and optint 3.0, the following program produces a corrupted git repo:

open! Lwt_result.Syntax
module Store = Irmin_git_unix.FS.KV (Irmin.Contents.String)
module Sync = Irmin.Sync.Make (Store)

let info = Irmin_git_unix.info

let main () =
  let branch = "main" in
  let config = Irmin_git.config "/tmp/test" in
  let%lwt repo = Store.Repo.v config in
  let%lwt _remote = Store.Backend.Remote.v repo in
  let%lwt db = Store.of_branch repo branch in
  let file = "./blob" in
  let%lwt tree = Store.get_tree db [] in
  let%lwt data = Lwt_io.with_file ~mode:Lwt_io.input file Lwt_io.read in
  let%lwt tree = Store.Tree.add tree [ "file" ] data in
  let%lwt () = Store.set_tree_exn db ~info:(info "initial commit") [] tree in
  Lwt_result.return ()

let _ = Lwt_main.run (main ())

When I try to run git checkout main, I get:

/tmp/test $ git checkout main
error: inflate: data stream error (invalid distance too far back)
error: inflate: data stream error (invalid distance too far back)
error: corrupt loose object '2d3978bb858c718a42b61ffb1769a4329c2a1bb7'
fatal: loose object 2d3978bb858c718a42b61ffb1769a4329c2a1bb7 (stored in .git/objects/2d/3978bb858c718a42b61ffb1769a4329c2a1bb7) is corrupt

I can post the blob file if needed.

dinosaure commented 2 weeks ago

Could you share: 1) the version of decompress you are using, 2) the contents of .git/objects/2d/3978bb858c718a42b61ffb1769a4329c2a1bb7, and 3) the initial (uncompressed) content of the blob?

For the last item, you can use git hash-object: if one of your files hashes to 2d3978bb858c718a42b61ffb1769a4329c2a1bb7, it is the blob that Irmin tried to compress.
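
To make the link between the hash and the file on disk concrete, here is a minimal OCaml sketch of how Git lays out a loose blob. It uses only the standard library; the helper names `blob_payload` and `loose_object_path` are hypothetical and are not part of Irmin or ocaml-git. It assumes the object is a blob (the default for git hash-object) and a Unix path separator. The SHA-1 and zlib steps themselves are left out.

```ocaml
(* Sketch only: [blob_payload] and [loose_object_path] are hypothetical
   helpers, not functions from Irmin or ocaml-git. *)

(* The bytes Git hashes with SHA-1, and then zlib-compresses, for a blob:
   "blob <size>", a NUL byte, then the raw contents. *)
let blob_payload (contents : string) : string =
  "blob " ^ string_of_int (String.length contents) ^ "\000" ^ contents

(* A loose object is stored under
   .git/objects/<first 2 hex chars>/<remaining 38 hex chars>. *)
let loose_object_path (hex : string) : string =
  if String.length hex <> 40 then
    invalid_arg "loose_object_path: expected a 40-character SHA-1";
  Filename.concat ".git/objects"
    (Filename.concat (String.sub hex 0 2) (String.sub hex 2 38))

let () =
  print_endline (loose_object_path "2d3978bb858c718a42b61ffb1769a4329c2a1bb7")
```

Hashing `blob_payload` of the original file with SHA-1 should yield the object name if the file is indeed the blob in question. The file at the computed path holds the zlib-compressed form of that same payload, which is what git fails to inflate here.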

Thanks for your report.

intermet commented 2 weeks ago

Thank you for the fast reply.

  1. I use decompress 1.5.3.
  2. Attached the hexdump: 3978bb858c718a42b61ffb1769a4329c2a1bb7.txt
  3. Attached the hexdump of the initial blob: blob.txt

I am not sure exactly what you mean in your last comment: I could not check out the main branch, so the repository is effectively empty.

Thanks for your support.