whatwg / meta

Discussions and issues without a logical home
Creative Commons Zero v1.0 Universal

Shutting down the WHATWG mailing lists #153

Status: Closed (foolip closed 4 years ago)

foolip commented 4 years ago

This is a tracking issue for the plan proposed in https://github.com/whatwg/misc-server/issues/75#issuecomment-561625392 to close the WHATWG mailing lists:

Since the volume on these lists is so low and the effort to keep them working indefinitely non-trivial, the new plan I'd like to propose is:

This issue is filed in case someone is following this repo but not the mailing list, and yet has feedback on the plan.

The pull requests that implement the suggestion are https://github.com/whatwg/whatwg.org/pull/269 and https://github.com/whatwg/misc-server/pull/120.

foolip commented 4 years ago

I've found that although http://lists.whatwg.org/ says "The list overview page has been disabled temporarily", the archive pages are accessible and I'd like to make some effort to save them: http://lists.whatwg.org/pipermail/commit-watchers-whatwg.org/ http://lists.whatwg.org/pipermail/help-whatwg.org/ http://lists.whatwg.org/pipermail/implementors-whatwg.org/ http://lists.whatwg.org/pipermail/whatwg-whatwg.org/ (partial because of some mishap in 2017)

domenic commented 4 years ago

Those are not accessible on HSTS-enabled browsers though, right?

foolip commented 4 years ago

They're not, but they appear to be the full archives of the help and implementors lists, so scraping them like we did with the forums is probably low enough effort that it's worth saving them.

foolip commented 4 years ago

I announced the proposed change on the list itself and pointed to this issue here: http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2019-December/000148.html https://lists.w3.org/Archives/Public/public-whatwg-archive/2019Dec/0000.html

Can anyone confirm they got that email, just to be sure?

zcorpan commented 4 years ago

I got the email.

foolip commented 4 years ago

I've been digging into what archives exist and what could be reconstructed. Current understanding:

With significant effort the original state of lists.whatwg.org can probably be reconstructed apart from September 2014 through July 2017. In that period https://lists.w3.org/Archives/Public/public-whatwg-archive/ is the only copy, so at best one could recreate a listing with other numbers that are plausible, but it wouldn't match the original URLs, whatever they were.

Doing some wget scraping now to figure out how complete the web.archive.org copy is.

zcorpan commented 4 years ago

Does anyone have the full archives of whatwg list locally? If I remember correctly, there was a header Archived-At with the URL of that email in the archive.

Edit: I don't see such a header in the latest email. Maybe it's only for lists.w3.org emails?

kosek commented 4 years ago

I have all emails from that period locally in my Thunderbird, so I can export them and make them available to someone who is willing to recreate the archive. But there is no Archived-At header.

foolip commented 4 years ago

@zcorpan the version of mailman and configuration probably changed a fair bit over time, so it's possible the headers changed. Or possibly you're thinking of the Message-ID header from which one could construct permalinks?

zcorpan commented 4 years ago

No it contained a URL. But it was probably only for W3C lists...

foolip commented 4 years ago

Yeah, I see Archived-At: <CAOOOkFcWW97r8yg=SsWg7GgCmp4suVX9o85y8BvNRqMjuc5PXg@mail.gmail.com> as a header on the email I sent about closing the lists.

Looks like this might actually be useful in an unexpected way! Pulling a Message-ID from https://web.archive.org/web/20140706121731/http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-December.txt.gz and creating the URL https://www.w3.org/mid/op.u4kq0w0oidj3kv@zcorpandell.linkoping.osa works, I end up at https://lists.w3.org/Archives/Public/public-whatwg-archive/2009Dec/0103.html. With that I think a mapping can be created.
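A minimal sketch of how such a mapping could be started, assuming the archive text contains one `Message-ID: <...>` header line per message and that the ID can be appended to `https://www.w3.org/mid/` unescaped, as in the example above. The function name `mid_urls` is hypothetical, and IDs with unusual characters may need escaping that this does not attempt.

```python
import re

# Matches header lines such as "Message-ID: <abc@example.org>".
# Assumption: each message in the archive text has one such header line.
MESSAGE_ID_RE = re.compile(r"^Message-ID:\s*<([^>]+)>", re.IGNORECASE | re.MULTILINE)

# URL prefix taken from the example in this thread.
W3C_MID_PREFIX = "https://www.w3.org/mid/"


def mid_urls(archive_text):
    """Return one w3.org/mid lookup URL per Message-ID found in the text.

    No escaping is applied to the ID (an assumption that holds for the
    example above but is not verified for every possible ID).
    """
    return [W3C_MID_PREFIX + mid for mid in MESSAGE_ID_RE.findall(archive_text)]


if __name__ == "__main__":
    sample = (
        "From zcorpan at example.invalid  Tue Dec  1 00:00:00 2009\n"
        "Message-ID: <op.u4kq0w0oidj3kv@zcorpandell.linkoping.osa>\n"
        "Subject: example\n"
        "\n"
        "body\n"
    )
    for url in mid_urls(sample):
        print(url)
```

Following each generated URL and recording where it ends up would give the pairs needed for the mapping, though that step requires network access and is left out here.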

foolip commented 4 years ago

So clearly I've gone down the path of trying to keep whatwg@whatwg.org archive URLs from before 2017 working where possible, or rather to revive them, as they're currently 404.

Some notes:

Based on all this it should be possible to create redirect rules or a 404 page that checks all possible URLs in web.archive.org for a match.

However, since the pre-2017 archives aren't available on lists.whatwg.org, restoring them shouldn't block turning off the mailing lists, but would be nice to get done.
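As a rough sketch of the "check possible URLs in web.archive.org" idea, the helper below only builds candidate Wayback Machine URLs for a requested path; it does not fetch them. The function name `wayback_candidates`, the choice of trying both `http` and `https` origins, and the default timestamp (borrowed from the snapshot URL quoted earlier in this thread) are all assumptions for illustration.

```python
# Prefix of Wayback Machine snapshot URLs, as seen in the link quoted earlier.
WAYBACK_PREFIX = "https://web.archive.org/web/"

# Assumption: reuse the timestamp of the one snapshot mentioned in this thread.
ASSUMED_TIMESTAMP = "20140706121731"

# Assumption: the original pages may have been captured under either scheme.
ORIGINS = ["http://lists.whatwg.org", "https://lists.whatwg.org"]


def wayback_candidates(path, timestamp=ASSUMED_TIMESTAMP):
    """Return candidate snapshot URLs for a lists.whatwg.org path."""
    path = "/" + path.lstrip("/")  # normalize to exactly one leading slash
    return [f"{WAYBACK_PREFIX}{timestamp}/{origin}{path}" for origin in ORIGINS]


if __name__ == "__main__":
    for url in wayback_candidates("/pipermail/whatwg-whatwg.org/2009-December.txt.gz"):
        print(url)
```

A 404 page or redirect rule would still need to decide which candidate, if any, actually exists.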

gosko commented 4 years ago

@foolip I'm curious about this:

https://lists.w3.org/Archives/Public/public-whatwg-archive/ has all of whatwg@whatwg.org but with different message IDs assigned, and a mapping is non-trivial

The message-ids in W3C's copy seem to be the originals, as far as I can tell.

I'm not sure if this would be useful (and maybe you already know) but W3C's archives are available in mbox format, e.g. https://lists.w3.org/Archives/Public/public-whatwg-archive/mboxes/ (restricted to W3C Members to limit bulk harvesting by spammers)

foolip commented 4 years ago

@gosko in that context by "message IDs" I mean the numbers in the URLs like in https://github.com/whatwg/meta/issues/153#issuecomment-566980200, not the long identifiers in Archived-At headers as in https://github.com/whatwg/meta/issues/153#issuecomment-566787010.

I had noticed that the mbox archives are available, and think those could come in handy for reconstruction. It's just a lot of work to do this well and with confidence in the results, given that you could at best sample the results to look for problems.

foolip commented 4 years ago

The email lists have been shut down now. lists.whatwg.org now has no DNS records, and I'm looking at bringing up a static copy of what I've scraped and what can still be scraped from web.archive.org.

foolip commented 4 years ago

lists.whatwg.org has been restored as well as it can be from web.archive.org now. If people can try using it and report issues that'd be great. Known issues:

foolip commented 4 years ago

I see there's one more issue. Looking for broken links from the monthly listings, I see that May and June of 2014 don't have the individual messages: https://lists.whatwg.org/pipermail/whatwg-whatwg.org/2014-May/thread.html https://lists.whatwg.org/pipermail/whatwg-whatwg.org/2014-June/thread.html

https://lists.whatwg.org/pipermail/whatwg-whatwg.org/2014-May/254200.html is the only message.

These may be possible to restore from "Gzip'd Text", but I'll just link to lists.w3.org for these two months, as I've already done for July 2014 through June 2017.

Edit: fixed
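For illustration, linking a whole month to lists.w3.org can be sketched as a small name translation, assuming the month directories follow the patterns visible in the URLs in this thread (`2014-May` and `2009-December` on lists.whatwg.org, `2009Dec` and `2019Dec` on lists.w3.org). The function name `w3c_month_url` is hypothetical, and this is not necessarily how the actual fix was implemented.

```python
import re

W3C_BASE = "https://lists.w3.org/Archives/Public/public-whatwg-archive/"

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Pipermail-style month directory, e.g. "2014-May" or "2009-December".
PIPERMAIL_MONTH_RE = re.compile(r"^(\d{4})-([A-Za-z]+)$")


def w3c_month_url(pipermail_month):
    """Map a pipermail month directory to the assumed lists.w3.org month URL."""
    match = PIPERMAIL_MONTH_RE.match(pipermail_month)
    if match is None or match.group(2) not in MONTHS:
        raise ValueError(f"not a pipermail month directory: {pipermail_month!r}")
    year, month = match.groups()
    # Assumption: lists.w3.org uses the year plus a three-letter month abbreviation.
    return f"{W3C_BASE}{year}{month[:3]}/"


if __name__ == "__main__":
    print(w3c_month_url("2014-May"))
    print(w3c_month_url("2014-June"))
```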

foolip commented 4 years ago

With https://github.com/whatwg/whatwg.org/pull/285 this has been resolved.