hvr opened 8 years ago
Note that the preference mechanism can be used for this kind of beta release.
Fine by me, as long as there are clear instructions on how to upload experimental packages.
Here's another experimental package with a short dictionary word, jump, clearly marked as placeholder ("synopsis: Nothing to see here, move along") polluting the Hackage index.
This shows that even experienced users tend to mistake Hackage for their personal testing ground and upload dummy versions of packages that do not pass the basic threshold of even being intended for use by others.
I agree with everything, including "A more drastic way would be to require approval when new package names are being created", but I have no good ideas on how to make the approval process fair and responsive.
jump is a bad example. The initial upload is clearly a name squat, since if you look at the GitHub repository they clearly intend to release an alternative base under this name. You can see in the repository that they are developing the package in good faith and I think it's OK for them to do this.
What should our policy be on name squatting in general? Like, how do we distinguish a "good faith" one from one that isn't?
BTW, other package managers do very poorly with this. http://incolumitas.com/2016/06/08/typosquatting-package-managers/ https://phpsec.xyz/composer-typosquatting-vulnerability-877d263509ec#.xuw039sz6
CPAN doesn't have strict policies, it appears. But it has some nice author guidelines we may want to rip off:
npm has some stricter actual policies: https://www.npmjs.com/policies/disputes
(see also: https://www.npmjs.com/policies/conduct)
@ezyang it doesn't matter whether it's done in good faith or not. It doesn't change the fact that such dummy releases don't help anybody, and therefore shouldn't needlessly bloat the published Hackage index tarball. If it's about reserving a name, there's a different mechanism to do that. If you publish a package to the public package index, it's supposed to be useful to people other than the package author, which jump-0.0.0.0 clearly is not.
UPDATE: jump was officially deprecated a few months after the "name squat"; its README on GitHub now states:
This project has been deprecated in favor of two new projects: ...
- haskell-lang (live website) is the new destination for Haskell documentation
- Foundation is a more active and innovative standard library rethinking in Haskell
Both are active and welcoming community projects, please get involved!
So at this point jump is just a dead corpse which never had any useful release, and yet everyone has to download, store, and process it during index traversals, as it is forever enshrined in the package index.
I think we resolved the immediate issue here with the statement on the upload page that "your package should strive to provide value for the community by being intended to be useful to others." Other aspects of this are being discussed with the uncurated ecosystem proposal. I suggest closing this particular ticket as superseded?
@gbaz let's close this only after the proposal is accepted.
FWIW, the proposal doesn't address the name squatting issue.
I think we still want some restrictions in the uncurated index. At the least, there should be some procedure (lighter than the current package takeover process) to reclaim a package name if a new maintainer wants their new package to be curated and the old one never had any curated versions.
Yet, I don't think we should complicate your proposal with such detail at the moment, so I'd leave this issue open for now.
Even now, while we have "your package should strive to provide value for the community by being intended to be useful to others", we have no "what would happen if you don't" clause.
Just today I noticed https://hackage.haskell.org/package/wsdl-0.1.0.0 which comes with a big "DO NOT USE, UNSTABLE AND INCOMPLETE." disclaimer in its description.
IMO, such packages don't belong in the main 00-index.tar, as they're clearly not meant for public consumption yet. And such uploads add to the self-fulfilling prophecy (cf. broken windows theory) that Hackage has no quality standards and anything goes.
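As a rough illustration of how such self-declared placeholders could be flagged before they reach the main index, here is a minimal sketch in Python. It is not part of Hackage or cabal; the function name `looks_like_placeholder` and the pattern list are hypothetical, and the two phrases are simply the ones quoted in this thread. A real check would need a curated list and human review.

```python
import re

# Hypothetical pattern list, seeded only with the two phrases quoted in this
# thread; it is an assumption, not an existing Hackage rule.
PLACEHOLDER_PATTERNS = [
    re.compile(r"\bdo not use\b", re.IGNORECASE),
    re.compile(r"\bnothing to see here\b", re.IGNORECASE),
]


def looks_like_placeholder(synopsis: str, description: str = "") -> bool:
    """Return True if the package's own metadata says it is not meant for use."""
    text = f"{synopsis}\n{description}"
    return any(pattern.search(text) for pattern in PLACEHOLDER_PATTERNS)
```

A check like this could only ever be a hint for a reviewer (for example, suggesting a candidate upload instead), since it inspects nothing beyond the synopsis and description text.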
I'm not sure what the motivation/assumption for uploading a package to Hackage is, but I've noticed in the past that uploaders often didn't know about the Hackage candidate feature, and just wanted to try out the workflow.
In any case, as soon as a package becomes part of the 00-index.tar, it becomes a package that causes overhead for several entities (including us Hackage Trustees ;-) ). It gets picked up by search engines, is considered by Hackage's own package search, gets picked up by matrix.h.h.o eventually, etc.
Also, experiments that end up in a dead end and effectively use up precious names from the package namespace seem troublesome to me (sure, the package names could be reclaimed in theory, but it's very confusing if a package changes its scope/purpose completely depending on the version, so this should rather be the exception). A name like wsdl is certainly one of the premium names which deserve to be handled with more responsibility, as such a principal name suggests that the package is the blessed "go to" package for a given task.
So a package added to 00-index.tar should ideally satisfy a few baseline requirements, IMO.
More specifically, a package uploaded as non-candidate ought to come with a bit more responsibility, to improve the overall quality of Hackage packages (and keep the Trustee workload manageable). So, for non-candidate uploads I suggest something along the lines of:
- cabal upload ought to upload as candidate by default (unless a --no-candidate flag is used). Or some other mechanism that increases the threshold of uploading packages straight to the index without consideration.
- A more drastic way would be to require approval when new package names are being created (i.e. you'd still be able to upload candidates for new package names w/o approval, but publishing a new package name to the main index for the first time would require such an approval). We'd need to make sure that the approval process takes at most 24h or so, by having a large enough group of people able to approve a new package name.
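To make the two suggestions concrete, here is a minimal sketch in Python of how an upload might be routed under them. It is only a model of the proposal, not existing Hackage or cabal behaviour. The names `Upload`, `Outcome` and `decide` are hypothetical, and `no_candidate` just mirrors the proposed --no-candidate flag.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    CANDIDATE = "uploaded as candidate"
    PUBLISHED = "published to the main index"
    PENDING_APPROVAL = "held until the new package name is approved"


@dataclass(frozen=True)
class Upload:
    name: str
    no_candidate: bool = False  # mirrors the proposed --no-candidate flag


def decide(upload: Upload, published_names: set[str], approved_names: set[str]) -> Outcome:
    """Apply the two proposed rules to a single upload (illustrative only)."""
    # Rule 1: uploads are candidates unless the uploader opts out explicitly.
    if not upload.no_candidate:
        return Outcome.CANDIDATE
    # Rule 2 (the "more drastic" variant): the first publication of a new
    # name needs approval; names already in the main index are unaffected.
    if upload.name not in published_names and upload.name not in approved_names:
        return Outcome.PENDING_APPROVAL
    return Outcome.PUBLISHED
```

Under this model, `decide(Upload("wsdl", no_candidate=True), {"base"}, set())` would be held for approval, while the same upload without the flag would simply become a candidate. The 24h turnaround is a property of the approver group and is not modelled here.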
/cc @bergmark @dcoutts @gbaz
Related, there's also the issue of trivial packages using up short package names, but failing the equivalent of the Fairbairn-threshold for packages:
- sort: https://hackage.haskell.org/package/sort-0.0.0.1/docs/src/Data-Sort.html
- sorting: http://hackage.haskell.org/package/sorting-1.0.0.1/docs/src/Data-Ord-Sorting.html
- elo: https://hackage.haskell.org/package/elo-0.1.0/docs/src/Statistics-Elo.html
- conf-json: https://hackage.haskell.org/package/conf-json-1.1/docs/src/Data-Conf-Json.html
Other premium names taken (although maybe with a less clear verdict whether they fall below the threshold):
- functor: http://hackage.haskell.org/package/functor
- category: http://hackage.haskell.org/package/category
- product: https://hackage.haskell.org/package/product-0.1.0.0/src/Control/Category/Product.hs
A different class of questionable packages are "personal" packages which appear to have an audience of one, the author himself:
- rfc: https://hackage.haskell.org/package/rfc
- util: https://hackage.haskell.org/package/util-0.1.0.0/docs/src/Util.html
(TODO: add more examples)