ocaml / opam

opam is a source-based package manager. It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.
https://opam.ocaml.org

Use ocamlpro.com archive only if upstream tarball not available #39

Closed gildor478 closed 11 years ago

gildor478 commented 12 years ago

It seems the .tar.gz archives are stored on the ocamlpro server. This is a good backup solution, but it should only be a backup; it is better to download directly from the upstream website.

Downloading from the upstream website allows upstream to count downloads, which is quite an important metric for various reasons.

lefessan commented 12 years ago

Yes, archives are preferably downloaded from ocamlpro.com, and only on failure downloaded from the upstream website.

The download count is important for everybody: for the upstream website, but also for us, to get a better understanding of the community's needs. And clearly, it is easier for us to publish these statistics so that they are available to everybody than to depend on every upstream website's willingness to publish its statistics and then gather them afterwards.

gildor478 commented 12 years ago

Why not issue a 302 or 301 (redirection) on ocamlpro.com to the upstream website and count downloads from those 301/302 responses on ocamlpro.com?

This way you keep both the ocamlpro.com download count and the upstream website download count. And you don't force upstream websites to rely on the OCamlPro website to gather their statistics.
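
To make the proposal concrete, here is a minimal Python sketch of a redirect that is counted before the client is sent upstream. It is not OPAM or ocamlpro.com code: the `RedirectCounter` class, the archive name and the upstream URL are all hypothetical, and a real server would persist the counts rather than keep them in memory.

```python
from collections import Counter


class RedirectCounter:
    """Count requests per archive, then redirect the client to upstream."""

    def __init__(self, upstream_urls):
        # Hypothetical mapping: archive file name -> upstream download URL.
        self.upstream_urls = dict(upstream_urls)
        self.hits = Counter()

    def handle(self, archive):
        """Return (status, headers) for a request for `archive`."""
        url = self.upstream_urls.get(archive)
        if url is None:
            return 404, {}
        # The hit is recorded locally; upstream then sees the real download.
        self.hits[archive] += 1
        return 302, {"Location": url}


counter = RedirectCounter(
    {"foo-1.0.tar.gz": "http://upstream.example/foo-1.0.tar.gz"}
)
status, headers = counter.handle("foo-1.0.tar.gz")
print(status, headers["Location"], counter.hits["foo-1.0.tar.gz"])
```

Both sides get a number this way: the redirecting server counts the request, and the upstream server counts the download that follows.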

lefessan commented 12 years ago

Then, (1) statistics would be harder to understand, ocamlpro.com statistics would include both direct downloads and indirect downloads, and (2) each archive would need two URLs, one for redirection and another one for direct download in the case where the upstream site is down.

So, this would make everything more complex, without providing anything new, since the devs will have access to their statistics on ocamlpro.com.

gildor478 commented 12 years ago

Well, on the other hand, 1) it will make the ocamlpro.com statistics self-standing and 2) it is not like you'll be the first to do that (GODI and odb do it after all, so I suppose it is not that big a challenge for you).

I don't really buy the "more complex" argument. Using the upstream website is how it is done in most "ports"-like systems I know, for good reasons (IMHO, it is a question of being fair to upstream and giving credit to the right people). Have a look here: http://www.freebsd.org/cgi/cvsweb.cgi/ports/net/6tunnel/Makefile?rev=1.20

You can also have a look at MacPorts...

I consider that being fair to the community means helping them be effective, not having them spend their time setting up a new counter for each new package manager.

lefessan commented 12 years ago

Modifying OPAM is not a challenge for anybody, but modifying our webserver configuration is ;-)

My main problem right now is making opam as usable as possible for the users, not the package maintainers. When there are more than 100 users of opam, we will start thinking about how to please package maintainers too, but there is no point spending time on them if there are no users...

lefessan commented 12 years ago

By the way, I am still not convinced by these arguments. "Other people do it" is not a good reason (I am pretty sure I can find other people who do the opposite), and providing a centralized site with all statistics on OCaml packages is actually the best way to be "fair to the community".

gildor478 commented 12 years ago

Let's put it the other way: as upstream of 7 packages (fileutils, gettext, oasis, ocaml-data-notation, ocamlify, ocamlmod, ounit, ~11% of opam), can you please link directly to the upstream website?

gildor478 commented 12 years ago

Concerning configuration of the webserver, if it's apache, you just have to generate a .htaccess that contains the 301/302 in archives/ and put the actual content of archives in archives/backup/.
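
A rough sketch of generating such a .htaccess file, in Python. It assumes the archives are served under the URL path /archives/ and uses the `Redirect` directive from Apache's mod_alias; the `htaccess_rules` function, the archive names and the upstream URLs are made up for illustration.

```python
def htaccess_rules(upstream_urls, status=302):
    """Build .htaccess text that redirects each archive to its upstream URL.

    `upstream_urls` is a hypothetical mapping from archive file name to the
    upstream download URL. The real files would live in archives/backup/,
    which these rules do not touch.
    """
    if status not in (301, 302):
        raise ValueError("status must be 301 or 302")
    lines = []
    for archive, url in sorted(upstream_urls.items()):
        if any(ch.isspace() for ch in archive + url):
            raise ValueError(f"whitespace not allowed: {archive!r} -> {url!r}")
        # mod_alias syntax: Redirect [status] URL-path URL
        lines.append(f"Redirect {status} /archives/{archive} {url}")
    return "\n".join(lines) + "\n"


print(htaccess_rules({
    "foo-1.0.tar.gz": "http://upstream.example/foo-1.0.tar.gz",
    "bar-2.1.tar.gz": "http://other.example/dist/bar-2.1.tar.gz",
}), end="")
```

The output would be written to archives/.htaccess. Requests for archives/backup/ are not redirected, because none of the redirected paths is a prefix of the backup paths.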

All in all, coding + testing of this should take you 0.5 day.

gildor478 commented 12 years ago

BTW, I think the number of LoComment in this issue is bigger than the LoCode required to implement this feature.

avsm commented 12 years ago

Statistics from either an OPAM hit count or the authors are going to be very suspect, and skewed towards source package managers (such as Homebrew or OPAM) and away from binary ones (Debian). I'm more concerned by issue #42, since OPAM currently modifies upstream packages without a version bump.

Having control over whether files are fetched from a local repository, ocamlpro.com, or the origin site would be useful for corporate installations. In the XenServer build system, for example, all upstream distfiles are locally cached, since the build farm doesn't have external network access for security reasons. If OPAM cannot optionally fetch distfiles from a local web server, it cannot be integrated into this setup.

avsm commented 12 years ago

I'm also working on OpenBSD support for OPAM (#38), and that supports systrace policies which can restrict the network access privileges of a process. It would be quite nice to install a default policy that only grants access to (e.g.) ocamlpro.com instead of arbitrary websites.

gildor478 commented 12 years ago

To be honest, I don't like the fact that Debian/Fedora/whatever package managers don't give you clear statistics. It doesn't help you to know your users.

And I 100% agree that any "download" statistics are biased. Still, it cheers me up as upstream to see this number grow and helps me to stay motivated... (a psychological factor, but we are still human).

gildor478 commented 12 years ago

If the system has restrictions, I think falling back to a local archive or whatever is not a big deal. I just think that OPAM should try to do its best by default and allow a fallback.

Fabrice, BTW your first users will probably ALSO be upstream developers. Most of the packages I see on OPAM are not end-user products (such as unison, mldonkey or coq). So doing the best for both upstream and end users is probably a good way to start.

samoht commented 12 years ago

"Having control over whether files are fetched from a local repository, ocamlpro.com, or the origin site would be useful for corporate installations."

Actually, it is already possible to do that with OPAM. If the server doesn't serve the archive file, the client will download it from upstream. So if an organization wants to have a local repo with no links to the outside world, it just provides an archive for each package. If it wants the clients to download only specific packages from upstream, it just doesn't put the corresponding archives on the server. If it wants the clients to always download from upstream, it just has to have an empty archives/ directory.
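
The lookup order described above (repository archive first, upstream second) can be sketched in Python as follows. This is only an illustration of the logic, not OPAM's implementation: the `fetch_archive` function, the archives/ URL layout and the example names are assumptions.

```python
import urllib.request


def _http_get(url):
    """Download `url` and return its bytes (raises OSError on failure)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_archive(name, repo_url, upstream_url, fetch=_http_get):
    """Try the repository's archives/ directory first, then upstream.

    Returns (url_used, data). `fetch` is injectable so the logic can be
    exercised without network access.
    """
    candidates = [f"{repo_url.rstrip('/')}/archives/{name}", upstream_url]
    last_error = None
    for url in candidates:
        try:
            return url, fetch(url)
        except OSError as err:  # urllib's URLError and HTTPError are OSErrors
            last_error = err
    raise RuntimeError(f"could not fetch {name}") from last_error


def empty_repo_fetch(url):
    # Simulates a repository with an empty archives/ directory.
    if "/archives/" in url:
        raise OSError("404 Not Found")
    return b"tarball bytes"


used, _ = fetch_archive(
    "foo-1.0.tar.gz",
    "http://repo.example",
    "http://upstream.example/foo-1.0.tar.gz",
    fetch=empty_repo_fetch,
)
print(used)
```

With this ordering, the repository administrator decides per package where downloads come from, simply by choosing which files to put in archives/.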

avsm commented 12 years ago

@samoht : thanks for the clarification about local builds. I really like that aspect of the OPAM design.

gildor478 commented 12 years ago

Is this bug still tagged "won't fix", or do I have enough points to at least make this a feature that some users/upstreams would like to have in the mid/long-term future?

lefessan commented 12 years ago

As Thomas pointed out, it is the admin configuration that decides whether packages are locally cached or not. So, your feature is already granted.

gildor478 commented 12 years ago

What about turning it on by default? One line that will make me happy. If I understand @samoht correctly, it means leaving archives/ empty on opam.ocamlpro.com...

avsm commented 12 years ago

If you leave archives/ empty on opam.ocamlpro.com, then how can it fall back if the origin site is down?

It also makes more sense (to me anyway) to keep the current setup and have reliable statistics on OPAM package use, rather than lose that information. There's no way to usefully reconstruct it from all the hit counts of thousands of external packages. This is also consistent with the Hackage model in Haskell, except authors upload a package to the site there.

gildor478 commented 12 years ago

@avsm I would 100% prefer to have a reliable way to get the best of both worlds (i.e. reliable stats on both OPAM and the upstream website). Hence the proposal of 301/302, which is what is implemented for odb.ml through oasis-db.

Since OPAM is not the main way to distribute my tarballs, and the stats page that @lefessan is talking about doesn't even exist (at least I am not aware of it), there is no way for me to see these stats. And if this page exists, I need an API or some easy way to get stats out of it.

From my POV, the current setup doesn't give any stats at all about what is happening.

samoht commented 12 years ago

The logic I described above (i.e. download from the server first and from upstream next) is fully implemented (and tested) in 0.4.0.

We don't (yet) have a way to gather statistics about package downloads, so I'll leave the issue open for now.

samoht commented 11 years ago

We now have a nice statistics page at http://opam.ocamlpro.com.

The problem that I can now see is that, indeed, the statistics are not very meaningful. Not surprisingly, the most popular package is ocamlfind (followed by oasis and its dependencies). The really interesting stats will come later, when we are able to know the number of opam install $package.

UnixJunkie commented 11 years ago

People could be asked when they install OPAM if they want to take part in the package popularity contest, à la Debian.

gildor478 commented 11 years ago

Thanks for the numbers on the main page. Would it be possible to publish this data as a CSV file on the website (http://opam.ocamlpro.com/stats.csv, you could also add other stats) or through whatever public REST API?

It is great to see OASIS in the top ten (which is consistent with what you can see on http://forge.ocamlcore.org).

Although OASIS should not have that many reverse deps is a really old version (i.e. ANSITerminal should not depend on OASIS).

gildor478 commented 11 years ago

Although OASIS should not have that many reverse deps is a really old version (i.e. ANSITerminal should not depend on OASIS).

->

Although OASIS should not have that many reverse deps (i.e. ANSITerminal should not depend on OASIS).

samoht commented 11 years ago

ANSITerminal has a broken setup.ml (I don't know exactly why), so we need oasis to regenerate it.

Also, I'm now exporting the download stats as CSV and JSON: http://opam.ocamlpro.com/stats.csv http://opam.ocamlpro.com/stats.json

gildor478 commented 11 years ago

Just build oasis using opam ;-) and run "oasis setup" in the topdir of a vanilla ANSITerminal, diff the result with the vanilla archive, create a patch and use the patch in opam (I think opam supports patches).

You can also ask ANSITerminal upstream to generate a new tarball (Christophe AFAIR).