w3c / feedvalidator

W3C-customized version of the feedvalidator (forked from https://github.com/rubys/feedvalidator/)

Out-of-date hyperlink: beta.feedvalidator.org #100

Closed: jayaddison closed this issue 1 year ago

jayaddison commented 1 year ago

Describe the bug: The output provided by feedvalidator includes contextual hyperlinks for learning more about reported errors.

In the case of an XML syntax parsing error, the following appeared:

XML parsing error: <unknown>:18:0: syntax error [[help](http://beta.feedvalidator.org/docs/error/SAXError.html)]

To Reproduce: This occurred when feedvalidator ran on a site that contains XML comments (<!-- ... -->) before a doctype declaration. That's a maybe-valid but possibly-discouraged document format; either way, it produced the parser error shown above, along with the outdated hyperlink. Clicking through on that link does not currently (as of 2023-02-27) appear to open a W3C website.
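
For anyone trying to reproduce the message format, here is a minimal sketch. It assumes the error text comes from Python's standard `xml.sax` module, whose parse exceptions print as `systemId:line:column: message` and fall back to `<unknown>` when the input has no system ID. This is not feedvalidator's own code, and `sax_error_text` and `BROKEN_FEED` are hypothetical names and input chosen for illustration; the original site is not reproduced here.

```python
import xml.sax
from typing import Optional
from xml.sax.handler import ContentHandler


def sax_error_text(document: bytes) -> Optional[str]:
    """Return the SAX parse error as text, or None if the document parses."""
    try:
        # parseString supplies no system ID, so errors are reported
        # against "<unknown>" rather than a file name or URL.
        xml.sax.parseString(document, ContentHandler())
    except xml.sax.SAXParseException as exc:
        return str(exc)
    return None


# Hypothetical input (an assumption, not the site from this report):
# a comment followed by stray text before the root element.
BROKEN_FEED = b"<!-- generated page -->\nnot markup\n<rss version='2.0'/>\n"

if __name__ == "__main__":
    print(sax_error_text(BROKEN_FEED))
```

The exact line, column, and message depend on the input, so the real site's `18:0` will not match this sample.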

Expected behavior: It'd be nice for the hyperlink to open the intended W3C help text about SAX parsing errors.

Additional context: None currently; please let me know if I can provide more information.

jayaddison commented 1 year ago

Sorta weird idea, and not the practical resolution for this bug, but:

If the Internet Archive created integrity hashes from the content of many/most/all URLs it encountered at given points in time, then it could create a mapping from (hash + timestamp-of-page) to an archived webpage.

A web browser that viewed, for example, a beta.feedvalidator.org URL could check with the IA to discover the time-of-writing of that URL (example response: OK, that beta feedvalidator URL was added on 2023-03-11 at timestamp <ts>), and then the client could request the contents of the destination URL from that previous point-in-time.

A drawback of that would be that any corrections and fixes applied to the destination URL might not be visible to the user. There's some kind of analogy with software semver there: minor version updates are acceptable and probably desirable to the user, up to the point where a change becomes incompatible, and almost certainly not across changes of ownership that aren't a deliberate transfer. For example, a company selling their website to another company may be OK, although the user might want to decide about that, whereas a domain expiring and being picked up by another organization is almost certainly not what the user wants to follow.
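
A rough sketch of the point-in-time part of that idea, using only the Wayback Machine's existing timestamp-addressed URLs. The hash-based index described above is hypothetical and is not modelled here, and the function names are made up for illustration. The sketch only builds URLs; it makes no network requests.

```python
from urllib.parse import urlencode

# Public Wayback Machine endpoint that reports the snapshot closest to a timestamp.
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def availability_query(url: str, when: str) -> str:
    """Build a query asking which snapshot of `url` is closest to `when` (YYYYMMDD)."""
    return WAYBACK_AVAILABLE + "?" + urlencode({"url": url, "timestamp": when})


def snapshot_url(url: str, when: str) -> str:
    """Build a Wayback URL that resolves to the capture of `url` nearest `when`."""
    return f"https://web.archive.org/web/{when}/{url}"


if __name__ == "__main__":
    link = "http://beta.feedvalidator.org/docs/error/SAXError.html"
    # 20230311 mirrors the example date in the comment above.
    print(availability_query(link, "20230311"))
    print(snapshot_url(link, "20230311"))
```

A client following the idea would take the date the link was written, however it discovers that, and pass it as `when`. Whether a capture of this particular page exists is not something the sketch checks.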

dontcallmedom commented 1 year ago

Can you clarify whether the output you're referring to is from the user-visible HTML interface or something else? The only place where I see a mention of beta.feedvalidator.org in the codebase is in the Unicorn output (which I was about to remove following the deprecation of the online service).

jayaddison commented 1 year ago

Yep, it was the Unicorn HTML interface in this case.

(And now I think I understand: the _ucn suffix in part of the filename refers to Unicorn? Either way, we can close this, by the sounds of it.)