ocaml / opam-repository

Main public package repository for opam, the source package manager of OCaml.
https://opam.ocaml.org
Creative Commons Zero v1.0 Universal

gforge.inria.fr was shut down #19757

Open mseri opened 2 years ago

mseri commented 2 years ago

Many packages, some of them also used by opam itself, are now unavailable. For example, mccs is uninstallable because cudf can no longer be downloaded.

I think they may have moved to gitlab.inria.fr, but I didn’t check. We should probably prioritise this issue.

xavierleroy commented 2 years ago

I sent some e-mails concerning CUDF. AFAIK projects were not automatically migrated from gforge.inria.fr to gitlab.inria.fr, but ample early notice was given and tools to help with the migration were provided.

avsm commented 2 years ago

I've created a new repository, https://github.com/ocaml/opam-source-archives, to host selected archives that are important and prone to disappearing from elsewhere. This should tide us over until the Software Heritage archive is available next year for use with opam 2.2.

xavierleroy commented 2 years ago

@zacchiro just moved CUDF to its new home: https://gitlab.com/irill/cudf

Note that dose3 also resides in the same organization: https://gitlab.com/irill/dose3

Hope this helps!

smorimoto commented 2 years ago

The dose3 archives seem to have different hash values (https://github.com/ocaml/opam-repository/pull/19807), and we found a few cases during this transition where the actual code content was slightly different (e.g. https://github.com/ocaml/opam-repository/pull/19602). Are they really the same?

smorimoto commented 2 years ago

If it's not clear, I got a tool from a friend on a different language team that makes it easy to find the actual differences, so I can use it to investigate all of them.

xavierleroy commented 2 years ago

Re: dose3, I was just repeating what @treinen (one of the main developers of dose3) mentioned to me by e-mail. I'm pretty sure https://gitlab.com/irill/dose3 is the new home and the place where maintenance and development take place. I don't know about reproducing the tarballs for versions released before the move to gitlab.com.

zapashcanon commented 2 years ago

Note that everything that was in gforge.inria.fr and in opam has already been archived in swh, e.g. cudf.

smorimoto commented 2 years ago

I see. And it's good to know that swh already has them!

abate commented 2 years ago

Regarding dose3: a related problem with GitLab is that, from time to time, GitLab regenerates the tarball of the package. I've seen this problem in the past as well. A workaround that I'll try to implement quickly is to generate a tarball myself and make it available on GitLab. This should avoid hash mismatches in the future. Not sure if this is what happened this time.
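For illustration only, here is a minimal Python sketch of what "generating the tarball myself" could look like if the goal is a stable hash. It is not the procedure used for dose3; the function name `deterministic_targz` and the sample file names are hypothetical. The idea is to pin everything that normally varies between runs: member order, timestamps, ownership, and the gzip header fields.

```python
import gzip
import hashlib
import io
import tarfile


def deterministic_targz(files: dict[str, bytes]) -> bytes:
    """Build a .tar.gz whose bytes depend only on file names and contents."""
    tar_buf = io.BytesIO()
    # Pin the tar format so the header layout does not follow library defaults.
    with tarfile.open(fileobj=tar_buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name in sorted(files):          # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0                  # no build-time timestamp
            info.mode = 0o644
            info.uid = info.gid = 0         # no local user or group ids
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    gz_buf = io.BytesIO()
    # An empty filename and mtime=0 keep variable metadata out of the gzip header.
    with gzip.GzipFile(filename="", fileobj=gz_buf, mode="wb", mtime=0) as gz:
        gz.write(tar_buf.getvalue())
    return gz_buf.getvalue()


# Hypothetical package contents, given in two different insertion orders.
first = deterministic_targz({"b.ml": b"let b = 2\n", "a.ml": b"let a = 1\n"})
second = deterministic_targz({"a.ml": b"let a = 1\n", "b.ml": b"let b = 2\n"})
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
```

This only guarantees identical bytes when the same compression library and settings are used; a different zlib version could still produce a different compressed stream, so the generated file would need to be uploaded once and kept, not regenerated.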

If somebody has other ideas on how to solve this annoying problem with GitLab, I'm all ears.

smorimoto commented 2 years ago

Hmm? Does GitLab actually generate files that are not the same? That is quite weird.

abate commented 2 years ago

For some reason (that I don't understand), the hash of the tarball generated by GitLab changes, and this clearly is a problem with opam. But I think this is not related to this issue.

zacchiro commented 2 years ago

Neither tar-ing nor gzip-ing is a reproducible operation out of the box. (They of course roundtrip, but there is no guarantee that re-tar-ing or re-gzip-ing the same content will always give you a bit-by-bit identical file.) This is a problem that reproducible builds have encountered many times, and fixed in specific versions of zip and tar, but it depends on what gitlab uses. Why doesn't opam use git clone --depth 1, instead of generated tarballs? But anyway, @abate is right that this looks like an entirely separate issue.
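A small Python sketch of the point about gzip, using only the standard library. The payload and the two timestamps are made up for the example, and this says nothing about what GitLab actually does internally. It only shows that the gzip header stores a modification time, so compressing the same bytes twice can give different files that still decompress to identical content.

```python
import gzip
import hashlib

# Made-up payload standing in for an uncompressed archive.
payload = b'let () = print_endline "hello"\n' * 100

# Same content, compressed as if at two different times.
first = gzip.compress(payload, mtime=1_600_000_000)
second = gzip.compress(payload, mtime=1_700_000_000)

same_bytes = first == second
same_hash = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
same_content = gzip.decompress(first) == gzip.decompress(second) == payload

print(same_bytes, same_hash, same_content)
```

The two compressed files differ, and so do their checksums, while the roundtrip still returns the original payload in both cases.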

avsm commented 2 years ago

> Why doesn't opam use git clone --depth 1, instead of generated tarballs

opam is simply following the instructions of the maintainers who generate the package description. They claim in the package description that a given tarball has a given checksum, and that claim then becomes false when gitlab/github regenerates the archive.

opam thus cannot distinguish a malicious attack from a simple recompression. In the current model, we'll need package maintainers to ensure their upstream archives don't change. In the medium term, we will likely start hosting the package archives ourselves as it's too much trouble to expect 30,000 packages to remain unchanging upstream.
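To make the ambiguity concrete, here is a hedged Python sketch. It is not opam's implementation: `archive_matches` is a hypothetical helper, and the sources and timestamps are invented. It compares a declared checksum against the raw archive bytes, which is all a checksum in a package description lets a client do.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def archive_matches(archive: bytes, declared: str) -> bool:
    """Hypothetical check: compare the raw archive bytes to the declared checksum."""
    return sha256_hex(archive) == declared


source = b'let version = "1.0"\n'
original = gzip.compress(source, mtime=1_600_000_000)
declared = sha256_hex(original)  # what the maintainer recorded at release time

# Upstream regenerates the archive: same content, different bytes.
recompressed = gzip.compress(source, mtime=1_700_000_000)
# An attacker changes the content.
tampered = gzip.compress(source + b"let () = exit 1\n", mtime=1_600_000_000)

results = {
    "original": archive_matches(original, declared),
    "recompressed": archive_matches(recompressed, declared),
    "tampered": archive_matches(tampered, declared),
}
print(results)
```

Both the recompressed and the tampered archive fail in exactly the same way, so the check alone carries no information about which of the two happened.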