andk / pause

Perl authors upload server
http://pause.perl.org/

new feature: heuristics for informing the user of a possibly-unintentional "duplicate" upload #53

Open karenetheridge opened 11 years ago

karenetheridge commented 11 years ago

context below.

In summary, user error between two maintainers of a dist resulted in two distributions being uploaded with the same version. While this is allowed by current PAUSE rules, and it is not on the table for this to change immediately, I think it would be useful to the user if PAUSE identified potential erroneous uploads with an informational message about repeated uploads with the same version number. This is sometimes intentional, but often not, so providing this information lets the user correct the situation if there indeed was an error.

original problem report: http://www.nntp.perl.org/group/perl.modules/2013/05/msg86181.html

09:46 <xdg> What's up?
09:54  * xdg is going to lunch.  But email me whatever you need and I'll check 
          later.
11:16 <ether> sorry, very busy here too.  I was just wanting to touch base 
              briefly about the PAUSE indexing issue since it seemed we were 
              talking past each other
11:16 <ether> you seemed to be saying both "yes it's a good idea (to allow 
              different authors to upload the same dist name), and you're crazy 
              if you don't see that", and also "this is the way it is now, but 
              we'd like to change it"
11:17 <ether> which is somewhat contradictory
11:17 <ether> the question I was actually asking was "is this behaviour 
              intentional"
11:17 <ether> to which it sounds like the answer is "regrettably yes (but this 
              might change in the future)"
11:18 <ether> I guess the corollary to my question would be "is this behaviour 
              documented, and where"
11:30 <ether> but I mind the lack of docs less if there is a genuine desire, 
              coupled with tuit allocation, to fix the crazy
11:31 <ether> (I can help, too, if I understand what is needed)
12:19 <xdg> I'm back.  You around?
12:20 <xdg> I think I was trying to explain both "here's what it does now" and 
            "here's why".
12:24 <xdg> the "can't upload same file" is documented in "About PAUSE".  I 
            don't see the "versions must be non-decreasing" part documented 
            anywhere, so that might be just tribal knowledge.
12:26 <xdg> Back to "policy", I can't think of a good way to allow different 
            authors to upload the same file in some cases but not others.
12:27 <xdg> In your particular local::lib case, you had a maintainer failure, 
            and I think that needs to be dealt in human terms.
12:28 <xdg> My old company talked about "culture vs controls".  I think you (or 
            mst, I guess) need to consider how to treat a cultural breakdown.
12:30 <xdg> Separately, if local::lib used dzil for managing releases (even 
            with its own custom Makefile.PL) then it would be easier to 
            automate checking for dirty files or ensuring a tag/push happening 
            on release
12:47 <ether> here
12:47 <ether> yes, the local::lib problem was clearly human error
12:48 <ether> (I blame that we were using M::I for that dist, rather than dzil, 
              as my normal set of plugins both push after release, and also 
              check the remote for unpulled commits before release) :)
12:48 <ether> Distar also has both things in its flow too
12:49 <ether> but that aside, both apeiron and I were surprised that the 
              "duplicate" upload was allowed without error
12:49 <ether> it might have been useful if PAUSE had said something about this 
              in its email response, just as a heads up
12:50 <ether> "indexed local::lib 1.23; NOTE previous indexed version (uploaded 
              2013-02-xx via APEIRON/local-lib-1.23.tar.gz) also at version 
              1.23"
12:50 <ether> that would have clued me into there being an error
12:51 <ether> i.e. "we allow this, but maybe it's not what you wanted"
12:52 <ether> with your permission, I'd like to share this discussion with 
              apeiron, and whoever else expresses interest in the current PAUSE 
              situation?
12:55 <ether> which might just mean "pasted into a PAUSE ticket I haven't yet 
              filed"
13:21 <xdg> I see no problem with a "whoa, maybe this was an error" warning 
            from PAUSE, but I'm not sure the right heuristics.
13:22 <xdg> I'd open a ticket and hilight @dagolden and @rjbs and we can debate 
            heuristics there
13:26 <ether> may I include this conversation as context?
13:27 <xdg> sure

@xdg @rjbs @apeiron

dagolden commented 11 years ago

To clarify, here are some cases to consider:

Which of these should be detected as problematic? And which are easy to detect as problematic?

karenetheridge commented 11 years ago

I'd handle all these cases the same way: PAUSE could simply warn that one or more contained modules in the new upload have entered the index at the same $VERSION as was indexed previously. The user should be able to apply his own heuristics for his version numbering scheme and distname to quickly determine whether this was a mistake of some kind -- this obviates having to compare .pm content or uploaded dist names.
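As an illustration of the check proposed here, a minimal sketch follows. It is written in Python purely for readability; PAUSE itself is a Perl codebase, and nothing below comes from it. The function name `same_version_notes` and the two dictionaries (package name mapped to version string, before and after the upload) are hypothetical stand-ins for whatever the real indexer tracks.

```python
def same_version_notes(previous_index, new_upload):
    """Return one informational line per package re-indexed at an unchanged version.

    Both arguments are hypothetical dicts of package name -> version string.
    Versions are compared as exact strings, which is a simplification.
    """
    notes = []
    for package, version in sorted(new_upload.items()):
        old = previous_index.get(package)
        if old is not None and old == version:
            notes.append(
                f"NOTE: {package} {version} was previously indexed "
                f"at the same version ({old})"
            )
    return notes


# local::lib 1.23 is the case from the report; Example::New is made up.
previous_index = {"local::lib": "1.23"}
new_upload = {"local::lib": "1.23", "Example::New": "0.01"}

for line in same_version_notes(previous_index, new_upload):
    print(line)
```

The sketch only produces informational text and never rejects the upload, matching the "we allow this, but maybe it's not what you wanted" intent. A real implementation would probably compare parsed versions rather than strings, since Perl treats "1.23" and "1.230" as equal.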

ghost commented 11 years ago

On Tue, May 28, 2013 at 14:19:35 -0700, Karen Etheridge wrote:

> I'd handle all these cases the same way: PAUSE could simply warn that one or more contained modules in the new upload have entered the index at the same $VERSION as was indexed previously. The should be able to apply his own heuristics for his version numbering scheme and distname to quickly determine whether this was a mistake of some kind -- this obviates having to compare .pm content or uploaded dist names.

+1

karenetheridge commented 11 years ago

> The should

"The user should" -- I accidentally a word :)

dagolden commented 11 years ago

I'd be careful about that -- many people do that intentionally, I think. E.g. Foo-Bar-1.23 contains Foo::Bar at 1.23 and Foo::Baz at 1.23. Bar.pm gets modified and bumped to 1.24, but Baz stays at 1.23. Then Foo-Bar-1.24 gets shipped with Foo::Bar 1.24 and Foo::Baz 1.23.

This is one of the ways that people who manage $VERSION manually do it -- they only bump $VERSION in some files. A "warning" is likely to either piss people off or get ignored.

I'm not opposed to adding some information. ("Foo::Baz 1.23 entered the index. Previous was 1.23") But I wonder whether you would have even noticed. (I tend to ignore the text of those emails.)

I think better heuristics would make more sense if you really were going to do this. E.g. capturing an MD5 of the .pm files and warning if the MD5 changes but the version stays the same.
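The MD5 idea can be sketched roughly as follows. This is an illustrative Python sketch, not PAUSE code (PAUSE is written in Perl). The helper names `fingerprint` and `changed_without_bump`, and the idea of storing a (version, digest) pair per package, are assumptions made for the example. The module sources are toy strings built around the Foo::Bar / Foo::Baz scenario described above.

```python
import hashlib


def fingerprint(version, source):
    """Pair a module's declared version with the MD5 hex digest of its .pm bytes."""
    return (version, hashlib.md5(source).hexdigest())


def changed_without_bump(previous, current):
    """List packages whose content digest changed while the version stayed the same.

    Both arguments are hypothetical dicts: package name -> (version, md5 hex digest).
    """
    flagged = []
    for package, (version, digest) in sorted(current.items()):
        if package not in previous:
            continue  # first appearance in the index, nothing to compare
        old_version, old_digest = previous[package]
        if old_version == version and old_digest != digest:
            flagged.append(package)
    return flagged


bar_123 = b"package Foo::Bar; our $VERSION = '1.23'; 1;\n"
bar_124 = b"package Foo::Bar; our $VERSION = '1.24'; sub extra { } 1;\n"
baz_123 = b"package Foo::Baz; our $VERSION = '1.23'; 1;\n"
baz_edited = b"package Foo::Baz; our $VERSION = '1.23'; sub extra { } 1;\n"

previous = {
    "Foo::Bar": fingerprint("1.23", bar_123),
    "Foo::Baz": fingerprint("1.23", baz_123),
}
# Intentional case: Bar bumped to 1.24, Baz shipped unchanged at 1.23.
intentional = {
    "Foo::Bar": fingerprint("1.24", bar_124),
    "Foo::Baz": fingerprint("1.23", baz_123),
}
# Suspicious case: Baz was edited but still claims 1.23.
suspicious = {
    "Foo::Bar": fingerprint("1.24", bar_124),
    "Foo::Baz": fingerprint("1.23", baz_edited),
}

print(changed_without_bump(previous, intentional))  # []
print(changed_without_bump(previous, suspicious))   # ['Foo::Baz']
```

Under these assumptions, the manual-versioning workflow (an untouched module kept at its old version) produces no warning, while a module whose bytes changed under an unchanged version is flagged. MD5 serves only as a change detector here, not as a security measure.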

karenetheridge commented 11 years ago

On Tue, May 28, 2013 at 03:44:55PM -0700, David Golden wrote:

> I'd be careful about that -- many people do that intentionally, I think. E.g. Foo-Bar-1.23 contains Foo::Bar at 1.23 and Foo::Baz at 1.23. Bar.pm gets modified and bumped to 1.24, but Baz stays at 1.23. Then Foo-Bar-1.24 gets shipped with Foo::Bar 1.24 and Foo::Baz 1.23.

Agreed. I have no idea how many authors or dists use this process though -- I wonder if anyone has run any stats on it? I suspect it's less common than only bumping the version of the "main" module and leaving everything else unversioned.

> This is one of the ways that people who manage $VERSION manually do it -- they only bump $VERSION in some files. A "warning" is likely to either piss people off or get ignored.
>
> I'm not opposed to adding some information. ("Foo::Baz 1.23 entered the index. Previous was 1.23") But I wonder whether you would have even noticed. (I tend to ignore the text of those emails.)

I certainly would - I check the content of all my PAUSE receipts, as it's not uncommon for me to have been "gifted" with comaint on one module in a dist but the others were missed, so I have to keep an eye out for indexing errors. But again, I have no idea how normal this is :)

> I think better heuristics would make more sense if you really were going to do this. E.g. capturing an MD5 of the .pm files and warning if the MD5 changes but the version stays the same.

Fair enough, and that's a very easy mechanism to implement.