Perl / perl5

Carp is missing a dot #11814

Closed p5pRT closed 12 years ago

p5pRT commented 12 years ago

Migrated from rt.perl.org#106538 (status was 'resolved')

Searchable as RT106538$

p5pRT commented 12 years ago

From alex.hartmaier@gmail.com

Maybe I wasn't clear enough: if blead breaks some module on CPAN that's 'fine'. But the problem wasn't blead but releasing a new version of Carp to CPAN which broke all stable Perl version installations.

I don't see how a blead smoke prevents that from happening for a dual-lifed module. A warning in the Carp changelog would have helped (I read all changelogs before upgrading modules) if it was known at the time of release. This kind of problem could only be solved by rerunning the tests of all downstream dependencies of a module after it has been updated, which would require the test suites to be installed or redownloaded from CPAN.

-Alex (abraxxa)

On Wed, Feb 29, 2012 at 4:46 PM, George Greer <perl@greerga.m-l.org> wrote:

On Wed, 29 Feb 2012, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If I can get a copy of the "build all of CPAN" script that the comparisons from a while ago were done with, I can try to set it up on a rolling basis. That is, it would compare blead from the last pass with whatever was blead at the start of the next pass. So it wouldn't be every change but it would at least be more than never.

Should then *all* changes go through a smoke-me change? I treat all of

I'd actually be happy if (nearly) all non-doc changes *did*. But it wouldn't have caught this one.

(We'd need to improve the smoke-me reporting infrastructure to make this useful though)

What's on your wishlist?

-- George Greer

p5pRT commented 12 years ago

From zefram@fysh.org

Alexander Hartmaier wrote:

But the problem wasn't blead but releasing a new version of Carp to CPAN which broke all stable Perl version installations.

Interesting. It doesn't break *all* installations, but only those that upgrade Carp from CPAN. Since nothing at all declares a recent Carp as a versioned dependency (because Carp was only recently dual-lifed), you'll only get a Carp upgrade if you specifically choose to upgrade it or you blanket upgrade everything. Presumably that's what you're doing to have run into it. We have some blanket rebuilding of dependencies at my work, and ran into Carp breaking tests for an internal module (that I'd written). I wonder how common a practice it is to systematically upgrade everything.

Some of the recent dual-life Carp versions have had more or less subtle portability problems for Perl 5.6. I didn't get any bug reports about this -- just ran into the problems myself in my routine CPAN testing -- so I presume that 5.6 users, at least, are not blanket upgrading.

I don't see how a blead smoke prevents that from happening for a dual-lifed module.

A smoke test of blead *against CPAN* (not just against the blead test suite) detects exactly this situation. Indeed, we did come up with a list of affected CPAN modules, before perl-5.15.8 and Carp-1.25 went out. If it had been a CPAN-only module then it could not have been tackled by any kind of blead smoke.

A warning in the Carp changelog would have helped

The change was noted, but not explicitly marked as significant. Would a "significant change:" label on that entry have caught your attention sufficiently? I've used "incompatible change:" a couple of times, but that doesn't seem an appropriate categorisation for this change.

-zefram

p5pRT commented 12 years ago

From @tsee

On 02/29/2012 11:57 AM, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

It's not just that. While I did expend the effort to write tools for CPAN smoking (hacky, yeah, I'll admit that) which have step-by-step instructions, and are all set up on a machine that all committers have access to, I'm still the only one who's ever volunteered to use it to do CPAN smokes.

I suppose I could fill the gap between "run the following couple of commands, wait, look at HTML report" to get "click on the following button on a web page, wait, look at HTML report". But I doubt THAT level of automation is going to help very much. :(

I think the key thing that was missing here was realising that this change broke the CPAN smokers *reporting* setup. Which would have cut off reports, and hence hidden the problems.

Regular automated smoke testing to be sure that the reporting setup isn't broken seems like the first thing to fix. Unlike "all of CPAN", doing that on a fast cycle seems tractable.

Yes, but it requires a fair amount of effort to set up and maintain. That's not free, particularly since it'll be done by one of the people who are otherwise spending their time on *different* useful stuff.

--Steffen

p5pRT commented 12 years ago

From @tsee

On 02/29/2012 03:26 PM, Ricardo Signes wrote:

* Zefram [2012-02-29T06:40:12]

Not quite the right criterion. We *did* know that this change would break some proportion of CPAN, and that a full-CPAN smoke test would be useful. This is not the case for most core changes, contentious or otherwise. We could (and I won't dispute that we should) have run the full-CPAN smoke test with the change parked on a branch, rather than putting it into blead first.

Well, we *did* run it through a CPAN smoke. The problem was not looking at the list of affected dists, noting which were dependencies for other libraries (especially significant trees), fixing those, and then re-smoking.

The CPAN smoke report has this data. In all fairness, it doesn't scream at you about it. :)
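The triage step described above (take the list of affected dists and see which ones other libraries depend on) could be sketched roughly as follows. This is only an illustration under stated assumptions: `rank_by_dependents`, the dist names, and the shape of the dependency map are all hypothetical, and only direct prerequisites are counted. It is not how the CPAN smoke report itself presents the data.

```perl
use strict;
use warnings;

# Rank broken distributions by how many other distributions list them as a
# direct prerequisite. $deps maps each dist name to an arrayref of prereqs.
sub rank_by_dependents {
    my ($broken, $deps) = @_;
    my %count = map { $_ => 0 } @$broken;
    for my $dist (keys %$deps) {
        for my $prereq (@{ $deps->{$dist} }) {
            $count{$prereq}++ if exists $count{$prereq};
        }
    }
    # Most depended-upon first; ties broken alphabetically.
    return sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
}

# Made-up data, for illustration only.
my @broken = ('Dist-Leaf', 'Dist-Core');
my %deps   = (
    'Dist-App'  => ['Dist-Core'],
    'Dist-Web'  => ['Dist-Core', 'Dist-Leaf'],
    'Dist-Tool' => ['Dist-Core'],
);
print join(', ', rank_by_dependents(\@broken, \%deps)), "\n";
# prints "Dist-Core, Dist-Leaf"
```

Fixing the dists at the top of such a list first, then re-smoking, would unblock the largest dependency trees soonest.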

* Zefram [2012-02-29T06:40:12]

What would have made a difference here, and in similar situations, is an easy way to invoke the full-CPAN smoke test. We have smoke-me branches for testing a change across architectures and build options; I'd like to see a cpan-smoke-me mechanism for testing a change across CPAN distros.

I agree wholeheartedly. It's a lot of work to make this happen, but it would be phenomenally useful, even if it was limited to one architecture and build. Especially if the mechanism was easy to re-use as more computing resources became available.

I think I'd estimate this automation to be two full days of work to get something hacked up. AFTER understanding what's there. I'm not going to do it. And this all doesn't include the rabbit hole of understanding and working with the issues of properly cleaning up after the smokes, dealing with limited CPU time, dealing with limited scratch disk, etc.

Doesn't take a rocket scientist, but a fair bit of toolchain clue, persistence and access to the system in question.

* Zefram [2012-02-29T06:40:12]

Overall, I don't think we totally bungled this, as Andreas surmised. We anticipated pain, and we took steps to measure and manage that pain, albeit imperfectly. It's a could-do-better.

This is my feeling, too. I made a judgment call, and in hindsight, I would have made it differently. Frustratingly, I believe I had all the information I needed to make the right one, and simply didn't put it all together at the time.

But I don't plan to fall on my sword over this. I just plan to keep it in mind at similar junctures in the future.

Agreed.

--Steffen

p5pRT commented 12 years ago

From @tsee

On 02/29/2012 12:11 PM, Chris Prather wrote:

You yourself have pointed out in other threads that even *doc* changes can affect things like diagnostics.pm. Going down this road leads to a place where we'll get lost in the intractable problems which is why I wanted to cut it off early. I would love to have an infrastructure where we could smoke every commit against all of CPAN. The things we could do with such a tool would be amazing! Sadly we're not there yet.

It takes a Google-alike infrastructure to do that. Not just a couple of machines.

So no, not going to get there. It'll need a bit of a judgement call.

Also, we'd not only want to smoke each commit against its predecessor, but possibly also across wider ranges. Ouch. Combinatorial hell. :)

--Steffen

p5pRT commented 12 years ago

From @doy

On Wed, Feb 29, 2012 at 09:41:40PM +0100, Steffen Mueller wrote:

On 02/29/2012 03:26 PM, Ricardo Signes wrote:

* Zefram [2012-02-29T06:40:12]

What would have made a difference here, and in similar situations, is an easy way to invoke the full-CPAN smoke test. We have smoke-me branches for testing a change across architectures and build options; I'd like to see a cpan-smoke-me mechanism for testing a change across CPAN distros.

I agree wholeheartedly. It's a lot of work to make this happen, but it would be phenomenally useful, even if it was limited to one architecture and build. Especially if the mechanism was easy to re-use as more computing resources became available.

I think I'd estimate this automation to be two full days of work to get something hacked up. AFTER understanding what's there. I'm not going to do it. And this all doesn't include the rabbit hole of understanding and working with the issues of properly cleaning up after the smokes, dealing with limited CPU time, dealing with limited scratch disk, etc.

Doesn't take a rocket scientist, but a fair bit of toolchain clue, persistence and access to the system in question.

How much effort would it be to set this up on another machine? How much resources does it take?

-doy

p5pRT commented 12 years ago

From @tsee

On 02/29/2012 09:44 PM, Jesse Luehrs wrote:

On Wed, Feb 29, 2012 at 09:41:40PM +0100, Steffen Mueller wrote:

On 02/29/2012 03:26 PM, Ricardo Signes wrote:

* Zefram [2012-02-29T06:40:12]

What would have made a difference here, and in similar situations, is an easy way to invoke the full-CPAN smoke test. We have smoke-me branches for testing a change across architectures and build options; I'd like to see a cpan-smoke-me mechanism for testing a change across CPAN distros.

I agree wholeheartedly. It's a lot of work to make this happen, but it would be phenomenally useful, even if it was limited to one architecture and build. Especially if the mechanism was easy to re-use as more computing resources became available.

I think I'd estimate this automation to be two full days of work to get something hacked up. AFTER understanding what's there. I'm not going to do it. And this all doesn't include the rabbit hole of understanding and working with the issues of properly cleaning up after the smokes, dealing with limited CPU time, dealing with limited scratch disk, etc.

Doesn't take a rocket scientist, but a fair bit of toolchain clue, persistence and access to the system in question.

How much effort would it be to set this up on another machine? How much resources does it take?

Effort: It's a bit fiddly, but I'd say setting up another takes no more than a couple of hours once you've got the idea. Making a single smoke span across multiple machines (if you don't want to smoke multiple commits separately)? Lots of effort in getting identical machines and quite some engineering effort to make the logic work. But at least the latter is fun.

CPU? However much you can throw at it. Disk? Proportional to how many CPUs we have, since there's a fair bit of useless re-smoking that could be optimized in the underlying tools. I didn't have time and clue to do that, though.

The current CPAN smoker runs for a couple of days (not quite a week, IIRC). It's running on a 100GB NFS share, taking a guesstimate of 2/3 of that for the active smoke. No, it's not disk-space efficient. The machine is an "Intel(R) Xeon(R) CPU E5420 @ 2.50GHz", which is a quad-core-eight-thread single-socket system. RAM is not an issue, but this one has 32GB. Disk cache probably helps a fair bit. So this isn't top of the line, but neither is it your mum's virtual server.

This all being said, I'm perfectly willing to spend some time with you to get you started in improving my Jenga of hacks AS WELL AS giving hints about the practical setup. I believe you already have commit privileges, so you can log in to "llama" from camel or dromedary. I think we (p5p) have more machines available for use by smokers. But for maximum benefit, that'll require some thought in scaling beyond the one.

Cheers, Steffen

p5pRT commented 12 years ago

From @tsee

On 02/29/2012 09:44 PM, Jesse Luehrs wrote:

I think I'd estimate this automation to be two full days of work to get something hacked up. AFTER understanding what's there. I'm not going to do it. And this all doesn't include the rabbit hole of understanding and working with the issues of properly cleaning up after the smokes, dealing with limited CPU time, dealing with limited scratch disk, etc.

Doesn't take a rocket scientist, but a fair bit of toolchain clue, persistence and access to the system in question.

How much effort would it be to set this up on another machine? How much resources does it take?

Rats, forgot the link:

https://github.com/tsee/cpan_perl_branch_smoke

--Steffen

p5pRT commented 12 years ago

From nick@nickandperla.net

On Wed, 29 Feb 2012 11:27:46 +0000 Nicholas Clark <nick@ccl4.org> wrote:

On Wed, Feb 29, 2012 at 06:11:49AM -0500, Chris Prather wrote:

On Wed, Feb 29, 2012 at 5:57 AM, Nicholas Clark <nick@ccl4.org> wrote:

[ignoring all other details, one point below]

[ditto]

Putting it in a smoke-me branch would not have helped. That tests building the core with different configurations on different platforms.

All tests passed for all core modules. That's all it can tell.

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If it's not clear from context, I meant "smoke-me" as a substitute for 'Enhanced Smoking Techniques' we can apply to a branch. It's my fault for mis-using a term in an imprecise fashion.

It wasn't clear. I think that's obvious in hindsight :-)

smoke-me specifically means platform testing from branches of git://perl5.git.perl.org/perl.git

BBC specifically means "Bleadperl Breaks CPAN" - Andreas' testing setup for trying to build CPAN against a current(ish) build, and then (*very* usefully) trying to bisect down to the particular commit that broke things (which I think is only on x86_64 Linux, but I might be wrong)

keep those terms for those things.

everything else doesn't exist yet.

Should then *all* changes go through a smoke-me change? I treat all of

I'd actually be happy if (nearly) all non-doc changes *did*. But it wouldn't have caught this one.

(We'd need to improve the smoke-me reporting infrastructure to make this useful though)

You yourself have pointed out in other threads that even *doc* changes can affect things like diagnostics.pm. Going down this road leads to a

Sure, but they are not platform or configuration dependent. If people run the tests before they commit, such problems are spotted early. (Otherwise Jenkins will waggle a finger at them in public. The shame...)

place where we'll get lost in the intractable problems which is why I wanted to cut it off early. I would love to have an infrastructure where we could smoke every commit against all of CPAN. The things we could do with such a tool would be amazing! Sadly we're not there yet.

Which I think means that it's not as bad as that.

I think the key thing that was missing here was realising that this change broke the CPAN smokers *reporting* setup. Which would have cut off reports, and hence hidden the problems.

Having a failure mode that hides the complexity of the problem is a double whammy. Having a process to define where we need more canaries would be useful too.

Regular automated smoke testing to be sure that the reporting setup isn't broken seems like the first thing to fix. Unlike "all of CPAN", doing that on a fast cycle seems tractable.

I'm certainly willing to help figure out how to make it happen.

That would be cool. I don't really think that I have the time to do more than this brain dump:

It (first) struck me as something that could be automated with Jenkins. But I'm not *sure*. Does

a) Jenkins maintain state?
b) does it have a concept of a skip?

In that, we need to avoid spamming with false positives. What *we* here are interested in is whether a change to blead broke the reporter modules. We don't want to confuse that with the modules themselves breaking. And it's not *that* easy to control the local CPAN mirror updates. So, probably one wants to do this:

0) snapshot CPAN (e.g., private local copy of CPAN, updated by this build script)
1) build blead with *parent of current commit* (5 minutes on fast hardware in parallel)
2) build reporter toolchain
   a) if it fails at this point, bail out as a "skip". (Warn about this?)
   b) otherwise continue
3) clean everything
4) build blead with current commit
5) build reporter toolchain (identical CPAN)

and that's then "pass" or "fail", and it tells us conclusively whether *this* blead commit broke things

(for the tested platform, which likely is the easy 90% to get right for starters)
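The skip/pass/fail decision in the steps above can be sketched as a small function. This is a minimal illustration under assumptions: `classify_commit` and the `$build_ok` callback are hypothetical names, and the callback stands in for "build blead at this commit, then build the reporter toolchain against the same CPAN snapshot". No real build or Jenkins API is invoked here.

```perl
use strict;
use warnings;

# $build_ok->($commit) should return true when blead at $commit builds and
# the reporter toolchain then builds against the fixed CPAN snapshot.
sub classify_commit {
    my ($parent, $commit, $build_ok) = @_;

    # Steps 1-2: if the parent is already broken, this commit cannot be
    # blamed, so report a "skip" rather than a false positive.
    return 'skip' unless $build_ok->($parent);

    # Steps 3-5: same snapshot, current commit.
    return $build_ok->($commit) ? 'pass' : 'fail';
}

# Toy stand-in for real builds: a table of commits known to work.
my %works      = (aaa111 => 1, bbb222 => 0, ccc333 => 0);
my $fake_build = sub { $works{ $_[0] } };

print classify_commit('aaa111', 'bbb222', $fake_build), "\n";   # prints "fail"
print classify_commit('bbb222', 'ccc333', $fake_build), "\n";   # prints "skip"
```

Building the parent first is what keeps an already-broken toolchain from being charged to every later commit.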

I also wonder whether anyone at the QA hackathon currently projectless would find this one interesting.

Color me interest-piqued. I am not familiar enough with the entire toolchain but with a mentor\, I could probably spend some time hacking on this.

Nicholas Clark

Nicholas Perez XMPP/Email: nick@nickandperla.net https://metacpan.org/author/NPEREZ http://github.com/nperez

p5pRT commented 12 years ago

From @nwc10

On Wed, Feb 29, 2012 at 10:46:43AM -0500, George Greer wrote:

On Wed, 29 Feb 2012, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If I can get a copy of the "build all of CPAN" script that the comparisons from a while ago were done with, I can try to set it up on a rolling basis. That is, it would compare blead from the last pass with whatever was blead at the start of the next pass. So it wouldn't be every change but it would at least be more than never.

I think you'd need to ask Steffen where that is

Should then *all* changes go through a smoke-me change? I treat all of

I'd actually be happy if (nearly) all non-doc changes *did*. But it wouldn't have caught this one.

(We'd need to improve the smoke-me reporting infrastructure to make this useful though)

What's on your wishlist?

I fear that this isn't complete, as I think I've forgotten something.

It's partly that (as best I can tell) the code you're running locally has diverged from the "upstream" code, so it's unclear whether bug fixes and other improvements are getting made in more than one place, which is a duplication of effort.

In particular, I'd like everyone else to run your code, because of a couple of minor things:

* the subject using the branch name is terser
    Smoke [blead] v5.15.9-20-g15d94df
  vs the tautological
    Smoke [5.15.9] v5.15.9-20-g15d94df
  Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

* the smoke-me code mails me directly if the branch fails

* I can get the logs

but also I'd like a couple of visibility bugs in your setup to be fixed:

8 results vs 6 annotations, or 4 results vs 2 annotations:

O F F F
O F F F -Duseithreads
| +--------- -DDEBUGGING
+----------- no debugging

after which I guess that there are more general skimming issues with the smoke output. I'm familiar with it, and I find it easy to read, but others are not paying attention to the smoke output because they perceive it as impenetrable noise. It would be useful to force those people to explain what they find most obnoxious about it, fix that, iterate until they run out of complaints.

First off, I'm not sure whether the line "Summary: PASS" (or FAIL...) should be the first line, with 2 (or 3) blank lines beneath it. But I'm not the target for such improvements - really the monthly release managers are the people whose input we should be getting.

Nicholas Clark

p5pRT commented 12 years ago

From @nwc10

On Mon, Mar 26, 2012 at 05:12:55PM +0100, Nicholas Clark wrote:

On Wed, Feb 29, 2012 at 10:46:43AM -0500, George Greer wrote:

On Wed, 29 Feb 2012, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If I can get a copy of the "build all of CPAN" script that the comparisons from a while ago were done with, I can try to set it up on a rolling basis. That is, it would compare blead from the last pass with whatever was blead at the start of the next pass. So it wouldn't be every change but it would at least be more than never.

I think you'd need to ask Steffen where that is

Oops, I don't have an alias for Steffen as "steffen". Correct cc now there.

Clearly "spell checker" is not the only check I need before hitting "send"

Nicholas Clark

p5pRT commented 12 years ago

From @nwc10

On Wed, Feb 29, 2012 at 09:26:37AM -0500, Ricardo Signes wrote:

As I said much earlier, I thought we'd be able to revert the change if we determined it was a bigger problem than we'd expected. Now I'm concerned that this is not true, because somehow this got propagated as a way to fix the problem:

$pattern .= $Carp::VERSION gt "1.24" ? "." :"";

...which means that a reversion in 1.30 would require all the libraries that were fixed in this way to be fixed *again*!

Sigh.

Yes, encoding a new golden result is the short-term easiest fix. But it's not a *good* idea.
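One alternative to encoding a version-specific golden result, sketched here as an assumption rather than as the fix anyone in this thread actually shipped, is for a downstream test to accept the location suffix both with and without the trailing dot. The helper name `ends_with_location` and the sample messages are hypothetical.

```perl
use strict;
use warnings;

# True if $message ends with " at FILE line N", with or without a trailing
# dot, so the check does not depend on which Carp version is installed.
sub ends_with_location {
    my ($message, $file, $line) = @_;
    return $message =~ m{ at \Q$file\E line \Q$line\E\.?\s*\z} ? 1 : 0;
}

# Sample messages written by hand for illustration.
my $without_dot = "boom at t/basic.t line 12\n";
my $with_dot    = "boom at t/basic.t line 12.\n";

print ends_with_location($without_dot, 't/basic.t', 12), "\n";   # prints 1
print ends_with_location($with_dot,    't/basic.t', 12), "\n";   # prints 1
```

A test written this way would keep passing if the dot were later reverted, which is exactly what the `$Carp::VERSION gt "1.24"` check cannot do.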

* Zefram [2012-02-29T06:40:12]

Overall, I don't think we totally bungled this, as Andreas surmised. We anticipated pain, and we took steps to measure and manage that pain, albeit imperfectly. It's a could-do-better.

This is my feeling, too. I made a judgment call, and in hindsight, I would have made it differently. Frustratingly, I believe I had all the information I needed to make the right one, and simply didn't put it all together at the time.

But I don't plan to fall on my sword over this. I just plan to keep it in mind at similar junctures in the future.

Including being in-your-face clear about what the right and wrong ways to fix problems are, so that our path to reversing changes is clear and up front and if people are, um, *still* unwise enough to ignore it, "your problem"?

It can be really quite frustrating how our maneuverability can be thwarted by poor decisions in CPAN code. I'm sticking to "poor" because the implications of such a choice *should* be apparent to anyone with time to think.

Nicholas Clark

p5pRT commented 12 years ago

From @tsee

On 03/26/2012 06:15 PM, Nicholas Clark wrote:

On Mon, Mar 26, 2012 at 05:12:55PM +0100, Nicholas Clark wrote:

On Wed, Feb 29, 2012 at 10:46:43AM -0500, George Greer wrote:

On Wed, 29 Feb 2012, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If I can get a copy of the "build all of CPAN" script that the comparisons from a while ago were done with, I can try to set it up on a rolling basis. That is, it would compare blead from the last pass with whatever was blead at the start of the next pass. So it wouldn't be every change but it would at least be more than never.

I think you'd need to ask Steffen where that is

Oops, I don't have an alias for Steffen as "steffen". Correct cc now there.

Clearly "spell checker" is not the only check I need before hitting "send"

It's here:

https://github.com/tsee/cpan_perl_branch_smoke

The README should mostly have step-by-step instructions. There's a whole bunch of gotchas such as requiring a fair amount of disk (in particular when running multiple processes for each commit). The other one is that the tempdir setting seems to be ignored by CPANPLUS. Whether that's due to CPANPLUS or the surrounding script I did not bother to find out. I just used TMPDIR=/foo/bar when kicking off the actual smokers. ...

Let me know if you have questions.

Best regards, Steffen

p5pRT commented 12 years ago

From @doy

On Mon, Mar 26, 2012 at 05:12:55PM +0100, Nicholas Clark wrote:

Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

Then wouldn't it be even more useful to have the PASS(...)/FAIL(...) bit first?

-doy

p5pRT commented 12 years ago

From @tux

On Mon, 26 Mar 2012 17:12:55 +0100, Nicholas Clark <nick@ccl4.org> wrote:

On Wed, Feb 29, 2012 at 10:46:43AM -0500, George Greer wrote:

On Wed, 29 Feb 2012, Nicholas Clark wrote:

The thing that *would* have got it is a "build all of CPAN", which we don't have the resources to do for every change. So it gets down to - how do we spot which changes need that sort of thing?

If I can get a copy of the "build all of CPAN" script that the comparisons from a while ago were done with, I can try to set it up on a rolling basis. That is, it would compare blead from the last pass with whatever was blead at the start of the next pass. So it wouldn't be every change but it would at least be more than never.

I think you'd need to ask Steffen where that is

Should then *all* changes go through a smoke-me change? I treat all of

I'd actually be happy if (nearly) all non-doc changes *did*. But it wouldn't have caught this one.

(We'd need to improve the smoke-me reporting infrastructure to make this useful though)

What's on your wishlist?

I fear that this isn't complete, as I think I've forgotten something.

It's partly that (as best I can tell) the code you're running locally has diverged from the "upstream" code, so it's unclear whether bug fixes and other improvements are getting made in more than one place, which is a duplication of effort.

In particular, I'd like everyone else to run your code, because of a couple of minor things:

* the subject using the branch name is terser
    Smoke [blead] v5.15.9-20-g15d94df
  vs the tautological
    Smoke [5.15.9] v5.15.9-20-g15d94df
  Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

We can change that (too)

* the smoke-me code mails me directly if the branch fails

That might be amongst several new options

* I can get the logs

That will probably be possible too in the new setup. The QAH is just a few days away

--
H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/
using perl5.00307 .. 5.14 porting perl5 on HP-UX, AIX, and openSUSE
http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/
http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/

p5pRT commented 12 years ago

From @nwc10

On Mon, Mar 26, 2012 at 11:39:52AM -0500, Jesse Luehrs wrote:

On Mon, Mar 26, 2012 at 05:12:55PM +0100, Nicholas Clark wrote:

Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

Then wouldn't it be even more useful to have the PASS(...)/FAIL(...) bit first?

I thought about that, but forgot to say that that makes it hard to sort by subject in a mail client to get all results for the same platform together. Right now, sorting by subject is a fast way to determine when a platform started to fail (or pass)

Nicholas Clark

p5pRT commented 12 years ago

From @nwc10

On Mon, Mar 26, 2012 at 06:52:12PM +0200, H.Merijn Brand wrote:

On Mon, 26 Mar 2012 17:12:55 +0100, Nicholas Clark <nick@ccl4.org> wrote:

* the subject using the branch name is terser
    Smoke [blead] v5.15.9-20-g15d94df
  vs the tautological
    Smoke [5.15.9] v5.15.9-20-g15d94df
  Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

We can change that (too)

* the smoke-me code mails me directly if the branch fails

That might be amongst several new options

* I can get the logs

That will probably be possible too in the new setup. The QAH is just a few days away

This would all be cool. It's just (probably) faster to integrate his code than to re-implement it.

It's your valuable Parisian drinking time I care for. Honest :-)

Nicholas Clark

p5pRT commented 12 years ago

From @greerga

(due to an ADSL outage this is copy/pasted from the web archives)

On Wed, Feb 29, 2012 at 10:46:43AM -0500, George Greer wrote:

What's on your wishlist?

I fear that this isn't complete, as I think I've forgotten something.

It's partly that (as best I can tell) the code you're running locally has diverged from the "upstream" code, so it's unclear whether bug fixes and other improvements are getting made in more than one place, which is a duplication of effort.

My Test::Smoke customizations are:

1. Run 'make minitest' in addition to other tests. (Although I never did investigate how to get the report matrix to add that as a column...)

2. I moved the "user_note" to the very top of the reports instead of the bottom. (Currently used for URL to reports but also soon a disclaimer about my Win32 VM's propensity to have timing issues.)

3. Also allow ".config" suffix on configurations. (Previously "_config".)

4. Get version (what is between the brackets in smoke email subject line) from $ENV{TEST_SMOKE_BRANCH}, if present, instead of repeating Perl version. (This is what makes the "Smoke [blead]" in my reports.)
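Customization 4 could look roughly like the sketch below. This is not Test::Smoke's actual code: `smoke_subject` and its arguments are hypothetical, and only the `TEST_SMOKE_BRANCH` environment variable and the `Smoke [...]` subject shape come from this thread.

```perl
use strict;
use warnings;

# Build the subject line, preferring the branch name from the environment
# over the (tautological) Perl version for the bracketed label.
sub smoke_subject {
    my ($perl_version, $git_describe, $summary) = @_;
    my $branch = $ENV{TEST_SMOKE_BRANCH};
    my $label  = (defined $branch && length $branch) ? $branch : $perl_version;
    return "Smoke [$label] $git_describe $summary";
}

{
    local $ENV{TEST_SMOKE_BRANCH} = 'blead';
    print smoke_subject('5.15.9', 'v5.15.9-20-g15d94df', 'PASS'), "\n";
    # prints "Smoke [blead] v5.15.9-20-g15d94df PASS"
}
```

Falling back to the Perl version when the variable is unset or empty keeps the subject usable for smokers that do not go through a smoke-me wrapper.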

In particular, I'd like everyone else to run your code, because of a couple of minor things:

* the subject using the branch name is terser
    Smoke [blead] v5.15.9-20-g15d94df
  vs the tautological
    Smoke [5.15.9] v5.15.9-20-g15d94df
  Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

I thought that was rather silly too. I didn't add that to Test::Smoke in the best way though (via parsing ".patch") since my "smoke-me" script already adds the branch name as an environment variable and so that was fastest for tuit use.

* the smoke-me code mails me directly if the branch fails

That's a function of the 'smoke-me' script, not Test::Smoke, although I agree it is a necessity for a 'smoke-me' service to do so. It's an unconditional email though, not just failures.

My configuration for that is:

driver/smoke-me_clang_quick.config.template:
    'to' => 'smokers-reports@perl.org,%COMMITTER_EMAIL%',

and the 'smoke-me' script does some variable replacements before making the .config file for Test::Smoke.
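The kind of variable replacement described here can be sketched as below. This is an assumption about the general technique, not the smoke-me script's real implementation: `expand_template` is a hypothetical helper and the replacement address is made up. Only the `%COMMITTER_EMAIL%` placeholder and the `'to'` line come from the message above.

```perl
use strict;
use warnings;

# Replace %NAME% placeholders with values from %vars. Unknown placeholders
# are left untouched so a missing value is easy to spot in the output.
sub expand_template {
    my ($template, %vars) = @_;
    $template =~ s{%([A-Z_]+)%}{ exists $vars{$1} ? $vars{$1} : "%$1%" }ge;
    return $template;
}

my $line = q{'to' => 'smokers-reports@perl.org,%COMMITTER_EMAIL%',};
print expand_template($line, COMMITTER_EMAIL => 'committer@example.org'), "\n";
# prints: 'to' => 'smokers-reports@perl.org,committer@example.org',
```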

(For those who may not know\, the smoke-me script: https://github.com/greerga/smoke-me/ )

* I can get the logs

True, that's necessary for any 'smoke-me' service, although the dashboard Tony Cook has is even better. (http://perl.develop-help.com/reports/)

I wonder if he has that on github...

but also I'd like a couple of visibility bugs in your setup to be fixed:

X X O X X X O X -Uusenm -Duseithreads -Dmad
| | | | | +- LC_ALL = en_US.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = en_US.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio

8 results vs 6 annotations, or 4 results vs 2 annotations:

O F F F
O F F F -Duseithreads
| +--------- -DDEBUGGING
+----------- no debugging

Yes\, my adding 'minitest' to all runs did that. I do need to fix that.

after which I guess that there are more general skimming issues with the smoke output. I'm familiar with it, and I find it easy to read, but others are not paying attention to the smoke output because they perceive it as impenetrable noise. It would be useful to force those people to explain what they find most obnoxious about it, fix that, iterate until they run out of complaints.

I agree. My biggest wishlist for Test::Smoke is (automatically) keeping track of when a test first started failing so it can differentiate between "failure" and "known failure"\, and while we're at it "sporadic failure" would be nice. I find the Test::Smoke code...dense...but then I've mostly only had time for drive-by changes to it and not been able to sit down and follow its flow.
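To make the wishlist item concrete, here is a minimal Perl sketch of classifying one test file's history. The function name `classify_history`, the category labels, and the thresholds (three or more PASS/FAIL flips counts as sporadic) are all assumptions for illustration; nothing like this is claimed to exist in Test::Smoke.

```perl
use strict;
use warnings;

# Hypothetical classifier. Input: results for one test file, oldest first,
# each either 'PASS' or 'FAIL'. Thresholds are arbitrary illustrations.
sub classify_history {
    my @results = @_;
    return 'no data' unless @results;

    my $fails = grep { $_ eq 'FAIL' } @results;
    return 'pass' if $fails == 0;

    # Count how often the result flipped between PASS and FAIL.
    my $flips = 0;
    for my $i (1 .. $#results) {
        $flips++ if $results[$i] ne $results[$i - 1];
    }
    return 'sporadic failure' if $flips >= 3;

    return 'recovered' if $results[-1] eq 'PASS';

    # Length of the current run of failures at the end of the history.
    my $streak = 0;
    for my $r (reverse @results) {
        last if $r ne 'FAIL';
        $streak++;
    }
    return $streak == 1 ? 'failure' : 'known failure';
}

print classify_history(qw(PASS PASS FAIL)), "\n";
print classify_history(qw(PASS FAIL FAIL FAIL)), "\n";
print classify_history(qw(PASS FAIL PASS FAIL PASS)), "\n";
```

The point of the sketch is only that a stored per-test history is enough to separate a fresh failure from one that has been failing for several runs or that comes and goes.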

First off\, I'm not sure whether the line "Summary: PASS" (or FAIL...) should be the first line\, with 2 (or 3) blank lines beneath it. But I'm not the target for such improvements - really the monthly release managers are the people whose input we should be getting.

There was a paucity of replies to this particular thread so we may never know.

-- George Greer

p5pRT commented 12 years ago

From @tux

On Wed, 4 Apr 2012 21:02:07 -0400 (EDT), George Greer <perl@greerga.m-l.org> wrote:

(due to an ADSL outage this is copy/pasted from the web archives)

On Wed\, Feb 29\, 2012 at 10:46:43AM -0500\, George Greer wrote:

What's on your wishlist?

I fear that this isn't complete\, as I think I've forgotten something.

It's partly that (as best I can tell) the code you're running locally has diverged from the "upstream" code, so it's unclear whether bug fixes and other improvements are getting made in more than one place, which is a duplication of effort.

My Test::Smoke customizations are:

1. Run 'make minitest' in addition to other tests.

If others find that useful too\, we should make that optional.

(Although I never did investigate how to get the report matrix to add that as a column...)

That should have been solved in the new setup

2. I moved the "user_note" to the very top of the reports instead of the bottom. (Currently used for URL to reports but also soon a disclaimer about my Win32 VM's propensity to have timing issues.)

That is in a template in the new version\, moving it to other places is easy\, but maybe not even necessary in the new database face. Our aim is to not send reports to a list at all (you can still send mail to yourself to get a copy).

3. Also allow ".config" suffix on configurations. (Previously "_config".)

Abe?

4. Get version (what is between the brackets in smoke email subject line) from $ENV{TEST_SMOKE_BRANCH}\, if present\, instead of repeating Perl version. (This is what makes the "Smoke [blead]" in my reports.)

As no mails will be sent, this is moot

In particular\, I'd like everyone else to run your code\, because of a couple of minor things:

* the subject using the branch name is terser

Smoke [blead] v5.15.9-20-g15d94df

vs the tautological

Smoke [5.15.9] v5.15.9-20-g15d94df

Also those 1 or 2 characters can make a difference when the most important bit is actually the detail of PASS(...) or FAIL(...), which can fall off the right

I thought that was rather silly too. I didn't add that to Test::Smoke in the best way though (via parsing ".patch") since my "smoke-me" script already adds the branch name as an environment variable and so that was fastest for tuit use.

* the smoke-me code mails me directly if the branch fails

That's a function of the 'smoke-me' script\, not Test::Smoke\, although I agree it is a necessity for a 'smoke-me' service to do so. It's an unconditional email though\, not just failures.

My configuration for that is:

driver/smoke-me_clang_quick.config.template: 'to' => 'smokers-reports@perl.org,%COMMITTER_EMAIL%',

and the 'smoke-me' script does some variable replacements before making the .config file for Test::Smoke.

(For those who may not know\, the smoke-me script: https://github.com/greerga/smoke-me/ )

* I can get the logs

In the new setup, logs are sent to the database if the final status was not "PASS". This is configurable.

True, that's necessary for any 'smoke-me' service, although the dashboard Tony Cook has is even better. (http://perl.develop-help.com/reports/)

I wonder if he has that on github...

but also I'd like a couple of visibility bugs in your setup to be fixed:

X X O X X X O X -Uusenm -Duseithreads -Dmad
| | | | | +- LC_ALL = en_US.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = en_US.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio

8 results vs 6 annotations, or 4 results vs 2 annotations:

O F F F O F F F -Duseithreads
| +--------- -DDEBUGGING
+----------- no debugging

Yes\, my adding 'minitest' to all runs did that. I do need to fix that.

after which I guess that there are more general skimming issues with the smoke output. I'm familiar with it, and I find it easy to read, but others are not paying attention to the smoke output because they perceive it as impenetrable noise. It would be useful to force those people to explain what they find most obnoxious about it, fix that, iterate until they run out of complaints.

I agree. My biggest wishlist for Test::Smoke is (automatically) keeping track of when a test first started failing so it can differentiate between

The new setup registers start time of every smoke-configuration subset and the duration thereof

"failure" and "known failure"\, and while we're at it "sporadic failure" would be nice. I find the Test::Smoke code...dense...but then I've mostly

Sporadic failures should now be detectable\, as we store failures per test file\, so you could make a trendline for e.g. op/read.t

only had time for drive-by changes to it and not been able to sit down and follow its flow.

First off\, I'm not sure whether the line "Summary: PASS" (or FAIL...) should be the first line\, with 2 (or 3) blank lines beneath it. But I'm not the target for such improvements - really the monthly release managers are the people whose input we should be getting.

There was a paucity of replies to this particular thread so we may never know.

We will be on #smoke in irc.perl.org to discuss wishes (when possible)

p5pRT commented 12 years ago

From @nwc10

On Thu\, Apr 05\, 2012 at 09:38:15AM +0200\, H.Merijn Brand wrote:

On Wed, 4 Apr 2012 21:02:07 -0400 (EDT), George Greer <perl@greerga.m-l.org> wrote:

(due to an ADSL outage this is copy/pasted from the web archives)

On Wed\, Feb 29\, 2012 at 10:46:43AM -0500\, George Greer wrote:

What's on your wishlist?

I fear that this isn't complete\, as I think I've forgotten something.

It's partly that (as best I can tell) the code you're running locally has diverged from the "upstream" code, so it's unclear whether bug fixes and other improvements are getting made in more than one place, which is a duplication of effort.

My Test::Smoke customizations are:

1. Run 'make minitest' in addition to other tests.

If others find that useful too\, we should make that optional.

Well\, I find it useful that at least one smoker is running it. I don't know what the right way to customise it is.

(Although I never did investigate how to get the report matrix to add that as a column...)

3. Also allow ".config" suffix on configurations. (Previously "_config".)

Abe?

4. Get version (what is between the brackets in smoke email subject line) from $ENV{TEST_SMOKE_BRANCH}\, if present\, instead of repeating Perl version. (This is what makes the "Smoke [blead]" in my reports.)

As no mails will be sent, this is moot

Not totally. I'd still like to be getting mails for my failed smoke-me branches\, and it's nicer with the branch name. Also\, I'm not *sure* if I'd prefer to also opt-in to get e-mail for failures on blead (or any branch tested) if it was my fault - ie approximated by "the head commit was me"\, in which case it's also useful for it. But that's starting to sound more complex than just "convention is that smoke-me smokers notify on failure" with the opt-out being "well\, don't create a smoke-me branch then"

* I can get the logs

In the new setup, logs are sent to the database if the final status was not "PASS". This is configurable.

oooh. nice.

True, that's necessary for any 'smoke-me' service, although the dashboard Tony Cook has is even better. (http://perl.develop-help.com/reports/)

I wonder if he has that on github...

Good question.

but also I'd like a couple of visibility bugs in your setup to be fixed:

X X O X X X O X -Uusenm -Duseithreads -Dmad
| | | | | +- LC_ALL = en_US.utf8 -DDEBUGGING
| | | | +--- PERLIO = perlio -DDEBUGGING
| | | +----- PERLIO = stdio -DDEBUGGING
| | +------- LC_ALL = en_US.utf8
| +--------- PERLIO = perlio
+----------- PERLIO = stdio

8 results vs 6 annotations, or 4 results vs 2 annotations:

O F F F O F F F -Duseithreads
| +--------- -DDEBUGGING
+----------- no debugging

Yes\, my adding 'minitest' to all runs did that. I do need to fix that.

Yes\, please magically find time to fix it :-)

Although it might be better to spend that same time migrating to the new Test::Smoke code\, if it makes it easier to add custom columns.

after which I guess that there are more general skimming issues with the smoke output. I'm familiar with it, and I find it easy to read, but others are not paying attention to the smoke output because they perceive it as impenetrable noise. It would be useful to force those people to explain what they find most obnoxious about it, fix that, iterate until they run out of complaints.

I agree. My biggest wishlist for Test::Smoke is (automatically) keeping track of when a test first started failing so it can differentiate between

The new setup registers start time of every smoke-configuration subset and the duration thereof

"failure" and "known failure"\, and while we're at it "sporadic failure" would be nice. I find the Test::Smoke code...dense...but then I've mostly

Sporadic failures should now be detectable\, as we store failures per test file\, so you could make a trendline for e.g. op/read.t

Nice

First off\, I'm not sure whether the line "Summary: PASS" (or FAIL...) should be the first line\, with 2 (or 3) blank lines beneath it. But I'm not the target for such improvements - really the monthly release managers are the people whose input we should be getting.

There was a paucity of replies to this particular thread so we may never know.

We will be on #smoke in irc.perl.org to discuss wishes (when possible)

I suspect we (also)(for some value of we) need to be more systematic in asking monthly release managers what confused them in the smoke reports.

Easiest way to make that happen seems to be to make a question about smoke reports a regular fixture on the Onionsketch agenda.

Nicholas Clark

p5pRT commented 12 years ago

From @greerga

On Thu\, 5 Apr 2012\, H.Merijn Brand wrote:

On Wed\, 4 Apr 2012 21:02:07 -0400 (EDT)\, George Greer

4. Get version (what is between the brackets in smoke email subject line) from $ENV{TEST_SMOKE_BRANCH}\, if present\, instead of repeating Perl version. (This is what makes the "Smoke [blead]" in my reports.)

As no mails will be sent, this is moot

It does keep track of the git branch in the database though\, correct?

"failure" and "known failure"\, and while we're at it "sporadic failure" would be nice. I find the Test::Smoke code...dense...but then I've mostly

Sporadic failures should now be detectable\, as we store failures per test file\, so you could make a trendline for e.g. op/read.t

Is there functionality to ignore certain tests failing on particular servers? For example\, my Win32 smoker fails various timing tests due to it being in a VM with a bouncy clock which aren't otherwise interesting. My Linux smoker though wouldn't have any clock problems (other than tests that make assumptions about load average).

-- George Greer
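For illustration only, the kind of per-server filtering asked about above could be sketched in Perl like this. The `%ignored_on` table, the `interesting_failures` helper, the host name, and the test file names are all hypothetical; the thread does not say whether the new setup offers anything similar.

```perl
use strict;
use warnings;

# Hypothetical per-host ignore list: host name => test files whose
# failures are considered uninteresting on that host.
my %ignored_on = (
    'win32-vm' => [ 't/example_timing.t' ],
);

# Return only the failures that are not ignored for the given host.
sub interesting_failures {
    my ($host, @failed) = @_;
    my %skip = map { $_ => 1 } @{ $ignored_on{$host} || [] };
    return grep { !$skip{$_} } @failed;
}

my @failed = ('t/example_timing.t', 't/example_other.t');
print join(', ', interesting_failures('win32-vm', @failed)), "\n";
print join(', ', interesting_failures('linux-box', @failed)), "\n";
```

Keeping the list keyed by host means a timing failure is still reported on machines where it is not expected.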