pkgcore / pkgdev

collection of tools for Gentoo development
https://pkgcore.github.io/pkgdev/
BSD 3-Clause "New" or "Revised" License

[Undesirable Behaviour] Why does pkgdev care about the syncer URI? #182

Open Kangie opened 6 months ago

Kangie commented 6 months ago

https://github.com/pkgcore/pkgdev/issues/43 and https://github.com/pkgcore/pkgcore/pull/434 exist to band-aid a 'pkgdev does not like the sync uri' problem in a given git repo.

A better question here is 'WHY are we checking this at all?' As I understand it, pkgdev relies on git (or another VCS) to do the pushing, and a VCS is perfectly capable of informing users that it thinks they're smoking crack if they provide an invalid sync uri.

A related question is 'why is this check invoked for simple tasks like pkgdev manifest?' Generating a package manifest should work regardless of whether I have the sync uri set to git@gentoo.org/... or 'a carrier pigeon released at summer solstice at an altitude of 15,000 m over the Simpson Desert.'

Please rework pkgdev to avoid superfluous input validation. It's not up to pkgdev to say 'I don't like your sync uri, you cannot generate a manifest.'
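To illustrate the request, here is a minimal sketch of deferring the check until a sync is actually attempted. None of this is pkgdev or pkgcore code: `Repo`, `SyncError`, and `KNOWN_SCHEMES` are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of URI prefixes a syncer would understand.
KNOWN_SCHEMES = ("git+ssh://", "git://", "https://", "rsync://")


class SyncError(Exception):
    """Raised only when a sync is attempted with an unusable URI."""


@dataclass
class Repo:
    repo_id: str
    sync_uri: Optional[str] = None

    def manifest(self) -> str:
        # Manifest generation never looks at sync_uri.
        return f"manifest for {self.repo_id}"

    def sync(self) -> str:
        # The URI is validated here, at the point where it is needed.
        if self.sync_uri is None or not self.sync_uri.startswith(KNOWN_SCHEMES):
            raise SyncError(
                f"repo {self.repo_id!r}: unsupported sync uri {self.sync_uri!r}"
            )
        return f"syncing {self.repo_id} from {self.sync_uri}"
```

With this shape, `Repo("gentoo", "carrier-pigeon://simpson-desert").manifest()` succeeds, and only `.sync()` complains.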

ferringb commented 3 months ago

My general view, and the industry view, is that input sanitization should be done at the point of deserialization. The underlying pkgcore internals are what validate this, because it's part of the portage configuration. That could be relaxed, but then pkgcore would have to drop the syncer entirely (since it doesn't know how to create it), which is a crappy user experience.

The way pkgcore's config internals are built, it is possible to discard a syncer it can't parse. Doing so would make this user happy at the cost of other users wondering why their repo didn't sync after all the syncs have run.
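The trade-off can be sketched roughly as follows. This is not pkgcore's config code: `make_syncer`, `load_syncers`, and the `strict` flag are hypothetical, and the set of accepted prefixes is an assumption for illustration.

```python
import warnings

# Hypothetical prefixes the example parser accepts.
SUPPORTED_PREFIXES = ("git+ssh://", "git://", "https://", "rsync://")


def make_syncer(uri: str) -> dict:
    """Stand-in for syncer construction; rejects URIs it cannot parse."""
    if not uri.startswith(SUPPORTED_PREFIXES):
        raise ValueError(f"unparseable sync uri {uri!r}")
    return {"uri": uri}


def load_syncers(config: dict, strict: bool = True) -> dict:
    """Build syncers for every repo in config.

    strict=True  fails at load time (validation at deserialization).
    strict=False drops the bad syncer and warns, so unrelated commands
                 keep working but that repo silently never syncs.
    """
    syncers = {}
    for repo_id, uri in config.items():
        try:
            syncers[repo_id] = make_syncer(uri)
        except ValueError as exc:
            if strict:
                raise
            warnings.warn(f"repo {repo_id!r}: dropping syncer: {exc}")
    return syncers
```

In the lenient mode the only trace of the problem is a warning at load time, which is the "why won't my repo sync" confusion described above.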

ferringb commented 3 months ago

A potential compromise here is enhancing the error message to 1) include the repo_id of the repo whose syncer failed to parse, and 2) list the known syncers.

It'll still be annoying if some syntax has changed (that git@ form wasn't around in the early days), but it would reduce user confusion imo, including by clarifying workarounds so users can do their thing and nudge upstream to improve things.
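As a rough idea of what such a message could look like, here is a small sketch. The function name, the wording, and the example syncer names are all assumptions for illustration, not existing pkgcore output.

```python
def format_syncer_error(repo_id: str, uri: str, known_syncers) -> str:
    """Build an error that names the repo and lists the known syncers."""
    known = ", ".join(sorted(known_syncers))
    return (
        f"repo {repo_id!r}: cannot parse sync uri {uri!r}; "
        f"known syncers: {known}"
    )


# Example with made-up syncer names.
message = format_syncer_error(
    "gentoo", "pigeon://desert", ["rsync", "git", "tar"]
)
print(message)
```

Naming the repo tells the user which config entry to fix, and the list of known syncers hints at what a working value looks like.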

What are others' thoughts here?