speced / bikeshed

A preprocessor for anyone writing specifications that converts source files into actual specs.
https://speced.github.io/bikeshed
Creative Commons Zero v1.0 Universal

Release Notes #1773

tabatkins opened 4 years ago

tabatkins commented 4 years ago

(This thread documents the updates in comments; this original comment always reflects the latest version.)

Version 4.2.1

tabatkins commented 4 years ago

Version 2.0.0

tabatkins commented 4 years ago

Version 2.1.0

tabatkins commented 3 years ago

Version 2.3.0

Whoops, didn't write release notes for the 2.2 versions! Oh well!

tabatkins commented 3 years ago

Version 2.3.1

Version 2.4.0

tabatkins commented 3 years ago

Version 2.4.1

tabatkins commented 3 years ago

Version 2.4.2

Version 2.4.3

tabatkins commented 2 years ago

lol been skipping release notes for a while!

Version 2.4.4

Version 2.4.5

Version 2.4.6

Version 2.4.7

Version 3.0.0

Version 3.0.1

Version 3.0.2

Version 3.0.3

Version 3.0.4

Version 3.0.5

Version 3.0.6

Version 3.1.0

Version 3.2.0

Version 3.3.0

Big new WPT features - inline test results!

And some other random stuff:

Version 3.3.1

Version 3.3.2

Version 3.3.3

Version 3.3.4

Version 3.4.0

tabatkins commented 2 years ago

Version 3.4.1 & 3.4.2

(I accidentally forgot to pull on my release box before I cut the 3.4.1 release, so 3.4.2 actually contains all the stuff.)

tabatkins commented 2 years ago

Version 3.5.0

Version 3.5.1

Version 3.5.2

tabatkins commented 2 years ago

Version 3.6

Version 3.7

tabatkins commented 2 years ago

Been a while since my last batch of release notes!

Version 3.7.1

Version 3.7.2

Version 3.7.3

Version 3.7.4

Version 3.7.5

Version 3.7.6

Version 3.7.7

Version 3.8.0

Version 3.8.1

Version 3.8.2

Version 3.9.0

tabatkins commented 1 year ago

New big batch of release notes:

Version 3.10.0

Version 3.10.1

Version 3.11.0

Version 3.11.1

Version 3.11.2 & 3.11.3

Version 3.11.4

Version 3.11.5

Version 3.11.6

Version 3.11.7

Version 3.11.8

Version 3.11.9

Version 3.11.10

Version 3.11.11

Version 3.11.12

Version 3.11.13

Version 3.11.14

Version 3.11.15

Version 3.11.16

tabatkins commented 1 year ago

Version 3.11.17

Turns out using just a little of a new HTML parser causes problems when my old hacky line-based code is, well, doing things line-by-line, and thus only feeding the parser part of a tag when a tag is split across multiple lines. So now I just use the HTML parser for the entire pre-Markdown parsing pipeline. (Then Markdown runs, generates some HTML, and feeds its result straight to the lxml parser. Both of these will be rewritten in time as well.)

It looks like this doesn't cause any issues with the testsuite, and hopefully should resolve the spurious fatal errors that 3.11.16 was throwing. Lmk if you see any issues!

tabatkins commented 1 year ago

Version 3.11.18 - 3.11.21

Post-Mortem

So, 3.11.14 caused some problems. I was throwing my new HTML parser at my datablocks.py code, so I could reliably parse the start tag and preserve all of its attributes. Problem is that the datablocks code was processing the spec line-by-line, so if you had a multi-line tag (such as, for example, having a newline in an attribute value), the parser would complain that it hit EOF without the tag being closed. This didn't have an effect on any of my tests, because nobody had a multiline tag for one of their datablocks, so it didn't actually matter for the spec itself. It did matter that it was suddenly throwing spurious fatal errors, however, as that breaks people's builds. (First mistake.)
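The failure mode described above can be reproduced with a small sketch. This is not Bikeshed's parser: `parse_start_tag`, `parse_line_by_line`, `parse_whole`, and the sample `spec` string are all hypothetical stand-ins, assuming only that the parser reports an error when input ends before a start tag's closing `>`.

```python
def parse_start_tag(text: str) -> str:
    """Return the tag name of the start tag at the beginning of `text`.

    Hypothetical, simplified scanner (not Bikeshed's real parser).
    Raises ValueError if the input ends before the closing '>'.
    """
    if not text.startswith("<"):
        raise ValueError("not a start tag")
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(text[1:], start=1):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ">":
            parts = text[1:i].split()
            if not parts:
                raise ValueError("empty tag")
            return parts[0]
    raise ValueError("hit EOF before the start tag was closed")


# A start tag with a newline inside an attribute value (made-up sample).
spec = '<pre class="example"\n     title="a multi-line\n tag">\n'


def parse_line_by_line(source: str) -> list[str]:
    # Old-style processing: each line is handed to the parser in isolation,
    # so the parser only ever sees the first part of a multi-line tag.
    return [parse_start_tag(line) for line in source.splitlines()
            if line.startswith("<")]


def parse_whole(source: str) -> str:
    # Feeding the entire text lets the parser find the closing '>'.
    return parse_start_tag(source)
```

In this sketch, `parse_whole(spec)` finds the `pre` tag, while `parse_line_by_line(spec)` raises the "hit EOF" error on the first line, even though the markup itself is fine.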

At this point I could have reverted the code and worked on it more in a branch, and should have. (Second mistake.) Instead I dug in - my new code was nicer, and I wanted to replace all my hacky text munging anyway, so I might as well just rip it out entirely and use my new parser.

This took a few days, but it was productive work, and I was happy with it. I did this via a big chain of "wip" commits, with test rebasing going on arbitrarily in commits, so when I thought I was done, I did a big squash commit - git reset --soft to the first commit, commit all the code, rebuild tests and verify, commit them, done. The tests looked good! Surprisingly few tests changed, and those that did were almost entirely catching errors, which was great! Problem is I did the squash wrong - I needed to reset to the commit before my first in the chain. Due to my mistake, I didn't see all the test changes my code had made, only the changes my commits after the first commit made. So I ended up missing a ton of problems that would have been fairly obvious. (Third mistake.)
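The off-by-one in the squash can be modeled in a few lines. This is a deliberately simplified model and does not run git: the commit chain, file names, and `visible_test_changes` helper are all hypothetical, and a "commit" is reduced to the set of test files it changed.

```python
# Hypothetical data: each entry is the set of test files one "wip" commit changed.
wip_chain = [
    {"tests/a.html", "tests/b.html", "tests/c.html"},  # first commit in the chain
    {"tests/d.html"},
    {"tests/e.html"},
]


def visible_test_changes(chain, reset_target):
    """Tests whose changes show up in the squashed commit's diff.

    `reset_target` is the index of the commit a soft reset moves to;
    -1 stands for the parent of the first commit in the chain.
    Everything up to and including the target stays in history, so only
    the later commits contribute to the squashed diff.
    """
    changed = set()
    for commit in chain[reset_target + 1:]:
        changed |= commit
    return changed


# Resetting to the first commit hides that commit's own test changes.
mistaken = visible_test_changes(wip_chain, 0)
# Resetting to the commit before the chain shows every change.
intended = visible_test_changes(wip_chain, -1)
```

Under these assumptions, `mistaken` contains only the changes from the second and third commits, which is how a large set of test changes can go unreviewed.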

Takeaway Fixes

domenic commented 1 year ago

Thanks Tab for the quick response and the postmortem. I'm really excited about work on tracking build messages, and also only making changes as PRs on GitHub. The way in which ~half of the commits on https://github.com/speced/bikeshed/commits/main have red Xs next to them has made me very nervous about using Bikeshed in the past, and moving to a PRs-with-CI-checks model would be a huge boost to my confidence in the software. Especially once those CI checks include build messages.

tabatkins commented 1 year ago

> only making changes as PRs on GitHub

That's not what I said, and definitely not what I intend. The problem here was that I thought I fixed my branch appropriately, then merged it into main; had I pushed my branch to github and merged via the PR mechanism, I'd have seen that rather than the ~20 tests I thought had changed, several hundred had, and realized I'd done something wrong.

What I decided was to use PRs to merge "whenever I do this kind of significant work in a branch". Working in main directly is still fine for smaller things.

> The way in which ~half of the commits on https://github.com/speced/bikeshed/commits/main have red Xs next to them

A large % of those are linting failures (mostly black complaining it wants to reformat something), and a large % of the remaining are from me doing test updates as a separate commit from the code updates for convenience. For the remaining, I'm happy with my work mode being to treat main as my trunk, an occasionally messy place, and letting CI remind me when I make a mistake - that's what it's for.

The "stable" channel is meant to be the releases via pip, which I'm much more conservative with. (The failure noted here notwithstanding; as documented, it was a result of two overlapping failures that caused my test suite to be green and me to think that the test changes were minimal and desirable.)

domenic commented 1 year ago

Oh. That's disappointing. I wish you would treat main with the same care as you treat releases, and used CI to validate your changes, like modern software best practice requires (e.g. on the Chromium project we both contribute to).

Now that Bikeshed is part of SpecEd, is there a way to ask "SpecEd" for that sort of working mode?

tabatkins commented 1 year ago

> I wish you would treat main with the same care as you treat releases, and used CI to validate your changes, like modern software best practice requires (e.g. on the Chromium project we both contribute to).

Why? This isn't a rhetorical question - what is the reasoning behind you wanting this?

Chromium is a massive multi-contributor project; it needs a core that is stable/correct for all the contributors to work from. Bikeshed is largely a single-contributor (me) project, and I don't believe my working mode is what's stopping more contributions. Bureaucracy exists for a good reason, but it's also a significant drag on its host; this is more than worthwhile when it's cutting thru the N^2 growth of personal connections between contributors, but not when that N is very low (again, approximately 1 here).

The group of people that do need stability in reference to Bikeshed are, currently, just the users, and they're served well by my release process. You are also in that group; as WHATWG uses the API server, which now follows releases rather than main, you're exposed solely to what I do in releases. What happens in main is irrelevant in practice for you.

> Now that Bikeshed is part of SpecEd, is there a way to ask "SpecEd" for that sort of working mode?

No. I moved Bikeshed to SpecEd for continuity reasons; my getting hit by a bus will no longer mean that the development has to fork to continue. I did not move it to enable non-contributors to impose currently-unneeded bureaucracy on the project.

domenic commented 1 year ago

It's sad to me that you see proper software engineering practices (which are used by teams of all sizes, including one) as bureaucracy. If there's such an ideological gap at this point, I don't think we'll be able to make further progress through discussion, at least not until more incidents of this sort (or others we've seen throughout Bikeshed's history) accumulate.

dlaliberte commented 1 year ago

I wonder if there is just a misunderstanding going on here. Given my limited knowledge of git, I may be missing something, but a few things are clear to me, and as one of the contributors of a small fraction of changes to Bikeshed, I may have a few useful insights.

One thing we should be able to agree on is that the error status of commits should be considered totally irrelevant as long as the last commit before squashing and merging is "green". The commits are just intermediate states that are steps along the path toward a final PR submission, which is true both for main and any branches. In fact, I find the extra steps of staging changes, then committing changes, and then syncing changes, almost entirely annoyingly extraneous.

Doing independent work on branches separate from origin/main, such that once we merge a PR from any branch into main, it will typically require only a single commit that very likely will have no errors, is yet another step that I am happy to avoid, as long as we can avoid problems. I believe, with small teams, this is common practice, and it works fine as long as there are sufficient tests to ensure PRs won't cause unexpected problems. I don't know what the dividing line between small and large teams is, but it probably depends a lot on the nature of the project as well, and how the work is typically distributed.

The Bikeshed repo already does have a moderately large test suite that is applied after each commit. Running these same tests locally, before commit, is also possible, but not essential to ensure a reasonable level of integrity. Sharing intermediate state of a draft PR with co-workers by way of committing changes that might still be very incomplete and full of errors is a very useful technique which we should not sacrifice.

Whether origin/main itself with all the current accumulation of PRs has errors is another question. I would expect that we should never merge in PRs that result in known errors, unless we have high confidence that they are spurious, or that we will quickly fix the errors in a subsequent PR. Then the next boundary condition happens when origin/main is released to the public, which should only happen once more rigorous testing and sufficient reflection result in even higher confidence that everything should be OK.

So, is the discussion about whether PRs to origin/main should ever result in known errors, or whether we have sufficient tests to have confidence about whether known errors are relevant, or something else?

tabatkins commented 1 year ago

@domenic I'm pretty sad to see this level of condescension, yes. I asked specific questions and gave specific reasoning for my decisions, and I'm getting back generic platitudes about "good software engineering", implying strongly that I am being a bad software engineer by not immediately adopting your suggestions.

Bureaucracy is not a bad word: it is both an essential part of any organization as it grows (to linearize communication pathways as the network of participants grows), and a significant cost to all participants (as they have to service the bureaucracy rather than working directly on the task or with each other). Adding bureaucracy needs justification.

> Whether origin/main itself with all the current accumulation of PRs has errors is another question. I would expect that we should never merge in PRs that result in known errors, unless we have high confidence that they are spurious, or that we will quickly fix the errors in a subsequent PR. Then the next boundary condition happens when origin/main is released to the public, which should only happen once more rigorous testing and sufficient reflection result in even higher confidence that everything should be OK.

Right. Bikeshed started as a single-developer project (with frequent but relatively minor contributions from others) with a small community of users (the CSSWG). At that time, working directly in main, and serving directly from main, made sense; the community was small and close-knit, so problems could be immediately communicated and fixed. Keeping main in a working condition was important, but short-term failures, quickly detected and fixed, were okay. Branches were used when doing larger projects that needed the checkpointing of commits, but might leave the codebase in a broken state until it's finished.

Bikeshed has grown since then. Currently the developer situation is still fairly similar - it's mostly just me, with occasional contributions from others. (In fact, spot contributions have mostly dried up as it's become larger and more complex.) So working in main, with occasional branches when necessary, is still pretty reasonable.

But the community has massively expanded, to encompass most of the W3C and a large and diverse set of disparate users elsewhere. I can no longer count on people reporting issues to me; it's more likely that they work around the issues instead. (I do still get plenty of bug reports, which I of course love.) And due to this size, it's more likely that every exposed version is used by a non-trivial number of people, so short-lived failures can still cause significant pain. Thus moving to a release model made a lot of sense, and this has been largely successful. This also means that users are insulated from whatever happens before the point of "cutting a release".

If Bikeshed ever starts getting more community contributions, more stringently maintaining a known-good main would make sense, so short-lived failures accidentally introduced by one dev don't overly affect the other devs. Until that happens (and I'm not confident it ever will), the current main shepherding continues to work fine. We could also change it if we believe that the current main shepherding is preventing people from feeling comfortable contributing, but I don't believe this is the case currently.

tabatkins commented 1 year ago

Version 3.11.22

Version 3.11.23

Version 3.12.0

Version 3.12.1

Version 3.13.0

tabatkins commented 1 year ago

Version 3.14.0

Let's try that HTML parser rewrite again! Coming at it more gradually now; I do one early pass over the document with the new parser and then just reserialize back to a string. I haven't yet replaced the datablock, markdown, or shorthand parsing, and I still pass it to the pre-existing HTML parser to actually build a tree, so mostly this is just some extra work being done, but those'll all be absorbed into this work in later phases.

As I say in https://github.com/speced/bikeshed/pull/2602, this brings a few benefits immediately:

It then still runs the datablock, markdown, and finally the existing HTML parser over the spec, so this is probably slightly slower at the moment, but those will be eaten by the parser in later phases.

Known issues with existing specs' markup:

Code blocks (which shouldn't have any parsing done inside) no longer have any parsing done inside, so if you were working around the hacky ` parsing by using \` in one, the workaround is no longer needed. Fix: Remove the spare \ so you just have valid JS again. If you weren't working around this, your spec was probably broken, and now it's fixed!

The HTML parser runs before the Markdown parser right now (fully integrating the two together is the next project), so a tag broken across a line inside a blockquote will parse incorrectly (it's closed prematurely by the blockquote's > at the start of the next line). Fix: just put the whole tag on one line for now.

Previously, ''&lt;foo>'' would make a maybe autolink to <foo> as a value. Now it's equivalent to <css>&amp;lt;foo></css>, which is broken, but arguably it was always broken in the first place. Fix: change to ''<foo>''.

The Markdown behavior of "if you have spaces at both the start and end of a code span, remove one from each side" is now properly implemented. A few specs relied on it removing any amount of any whitespace, so now there's an extra space sometimes if you did a linebreak inside your code span for some reason. Fix: Put the code span all on one line, or at least don't linebreak between the end of the content and the closing ticks. (The serializer still collapses starting/ending whitespace down to a single character to make the content look better, so that might still be doing more stripping than you want too.)
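The code-span rule mentioned above can be sketched as follows. This follows the CommonMark description of code spans as I understand it (line endings become spaces, then one space is stripped from each side if both sides have one and the content isn't all spaces). It is not Bikeshed's implementation, `normalize_code_span` is a hypothetical name, and the serializer's later whitespace collapsing is not modeled.

```python
def normalize_code_span(content: str) -> str:
    """Normalize the raw text between a code span's backticks.

    Sketch of the CommonMark rule, not Bikeshed's actual code.
    """
    # Line endings inside a code span are treated as spaces.
    content = content.replace("\r\n", " ").replace("\n", " ").replace("\r", " ")
    # Strip exactly one space from each side, but only if both sides
    # have one and the span is not made up entirely of spaces.
    if (
        content.startswith(" ")
        and content.endswith(" ")
        and content.strip(" ") != ""
    ):
        return content[1:-1]
    return content
```

With this rule, a span whose content is `" foo "` becomes `"foo"`, but a span that starts right after the opening ticks and has a linebreak before the closing ticks keeps a trailing space, because there is no matching space at the start to pair it with. That matches the "extra space sometimes" behavior described in the note.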

tabatkins commented 1 year ago

A few small updates in the last few weeks.

Version 3.14.1

Version 3.14.2

Version 3.14.3

Version 3.14.4

tabatkins commented 1 year ago

Version 3.14.5

tabatkins commented 11 months ago

Version 4.0.0

Woo, major version bump! Content-wise this isn't a break-worthy update, but Bikeshed has officially deprecated support for Python 3.7 (and 3.8), and now requires a minimum version of 3.9. As far as I can tell, this is the highest widely-supported version among default installs now; the old OSes that bundled 3.7 by default are all now out of long-term support.

Most of my time was spent on making Bikeshed's new HTML parser more powerful and more correct (mostly in https://github.com/speced/bikeshed/commit/5c6f7af73bbe345046023dc3c6ba4704044ff07a). This shouldn't have any big effect on you (some caveats, below), but every step forward I make here makes me so much happier.

I did a lot of careful test review to make sure things are as correct as possible. Aside from the bits I listed above, and the possibility of some new markup errors being flagged,

tabatkins commented 11 months ago

Version 4.0.1

tabatkins commented 11 months ago

Version 4.0.2

Version 4.0.3

Neither of these should actually affect specs in a user-observable fashion; let me know if anything seems to be newly wrong.

tabatkins commented 10 months ago

Version 4.1.0

Several miscellaneous fixes; I wanna get these out before I dive back into the parser.

tabatkins commented 10 months ago

Version 4.1.1

A few small fixes.

tabatkins commented 10 months ago

Version 4.1.3

tabatkins commented 10 months ago

Version 4.1.4

tabatkins commented 8 months ago

Version 4.1.5

Tiny release.

Version 4.1.6

tabatkins commented 6 months ago

Version 4.1.7 & 4.1.8

(4.1.7 was an accidental no-op release.) Tiny update!

tabatkins commented 1 month ago

Whoops, left a few versions off the changelist. They're all minor except the last.

Version 4.1.9

Version 4.1.10

Version 4.1.11

Version 4.1.12

Version 4.2.0

tabatkins commented 1 month ago

Version 4.2.1