Discussion: Project/Repository Ideas

jgarber623 commented 4 years ago

Hello, all!

Spurred on by @aimee-gm's #118 and the ensuing conversation in #microformats chat, I'd like to open this discussion with some thoughts on how we might improve this project and repo for its users.

At a high level, this project provides a set of input (HTML) and output (JSON) files for microformats parser developers to test their projects against. The core focus of development on this project should be in improving the quality, breadth, and depth of those inputs and outputs. They should reflect the consensus of the community and its documentation (the wiki, mostly).
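To make the input/output idea concrete, here is a minimal sketch of how a parser developer might consume the files. It assumes each HTML input sits next to a JSON output with the same file stem, which is an assumption about layout rather than a documented rule. `find_fixture_pairs`, `check_fixture`, and the `parse` callable are hypothetical names for illustration.

```python
import json
from pathlib import Path


def find_fixture_pairs(root):
    """Return (html_path, json_path) pairs that share a file stem under root.

    Assumption: an input and its expected output differ only in extension.
    HTML files without a JSON counterpart are skipped.
    """
    pairs = []
    for html_path in sorted(Path(root).rglob("*.html")):
        json_path = html_path.with_suffix(".json")
        if json_path.exists():
            pairs.append((html_path, json_path))
    return pairs


def check_fixture(parse, html_path, json_path):
    """Run a parser (any callable taking an HTML string) against one pair."""
    expected = json.loads(json_path.read_text(encoding="utf-8"))
    actual = parse(html_path.read_text(encoding="utf-8"))
    return actual == expected
```

A parser project in any language would do the equivalent: walk the files, parse each input, and compare the result with the stored output.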

My strongest opinion is that anything beyond those inputs and outputs is superfluous to the primary function of this project and should be removed from the code. This would include the Node.js-specific code, CSS, etc.

Now, that's a very strong opinion, so I'll propose a few alternatives as well. 😄

Continuous Integration

Configuring continuous integration has been brought up before (#88) and came up again in chat. In my mind, we'd use CI for a few things:

Formatting conformance (linting of HTML and JSON)
Tagging a commit to master as a "release" (maybe using the datetime YYYYMMDDHHMM?) and pushing that tag to GitHub

The latter is more useful if we create language-specific packages in this repo. @aimee-gm noted this in #microformats chat:

I would only suggest going the route of maintaining packages in the various package managers if they were automatically published in ci
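As a rough illustration of the datetime-based release tag floated above, a CI job could derive the tag name along these lines. Using UTC is an assumption (the proposal only names the YYYYMMDDHHMM shape), and `release_tag` is a hypothetical helper.

```python
from datetime import datetime, timezone


def release_tag(moment=None):
    """Format a moment as a YYYYMMDDHHMM tag name.

    Assumption: tags are expressed in UTC so they do not depend on where
    the CI runner happens to be located.
    """
    if moment is None:
        moment = datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M")


print(release_tag(datetime(2020, 6, 1, 9, 5, tzinfo=timezone.utc)))  # 202006010905
```

Because every field is zero-padded and ordered from year down to minute, tags of this shape sort chronologically when sorted as plain strings.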

Which brings us to…

Language-Specific Packages

We could use this repository to build and distribute language-specific packages (Node.js packages, Ruby gems, Python packages, PHP… things). That'd get us into mono-repo territory and would introduce some organizational challenges, etc. But… it's possible.

Google does this with their Protocol Buffers repo. It's not impossible, but would def. add some overhead. A CODEOWNERS file would come in super handy in this case. We'd then use CI to generate, tag, and push those various packages to their distribution sites.

Meta

A grab bag of things we could add to this project:

A CODEOWNERS file (helpful for PR reviewers, etc.)
Issue and Pull Request templates (generally following this documentation)
Ditching the hand-rolled change-log.html files in favor of relying on the Git commit history (which truthfully does the exact same thing…)
Direct people more clearly to other microformats repositories that discuss the spec, parser updates, etc. and to IndieWeb chat
Document the process by which change requests are reviewed and merged

…that's what I've got right now. I'm excited to hear your feedback. Thanks for reading!

aimee-gm commented 4 years ago

My feedback at the moment (given I'm new here 😄)

:+1: to remove superfluous code. It detracts from the purpose of this project and makes it confusing to add new test cases. The emphasis should be on encouraging good tests, and those files are all that's needed.

:+1: to CI, and linting the JSON/HTML files. Although we could introduce something like prettier that would reduce the need for linting and even be run on a pre-commit hook. Perhaps we would just need to verify that the JSON & HTML are valid, though? But, we may want to add test cases for invalid HTML at some point? 😉
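A validity-only check for the JSON files could be quite small. The sketch below uses Python's standard `json` module; `json_error` is a hypothetical helper name. Python's decoder accepts `NaN` and `Infinity` by default even though they are not valid JSON, so the sketch rejects them explicitly.

```python
import json


def _reject_constant(name):
    # json.loads calls this hook for NaN, Infinity and -Infinity.
    raise ValueError(f"{name} is not valid JSON")


def json_error(text):
    """Return None if text is valid JSON, else a description of the problem."""
    try:
        json.loads(text, parse_constant=_reject_constant)
    except ValueError as error:  # json.JSONDecodeError is a subclass of ValueError
        return str(error)
    return None
```

A CI step could run this over every `.json` file and fail the build if any call returns a message. It says nothing about formatting, only whether the file parses.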

:-1: to publishing to each individual package manager. All it takes is for one auth token to become revoked accidentally (or intentionally) on one of those platforms and it'll become broken, and unlikely to be fixed unless you have maintainers active (and in an organisation) on those platforms. I also don't see any demand for it at the moment.

:+1: to issue and PR templates, would be massively useful in helping people contribute. I would rather that updates were made here than my own parser, for example. Making this repo friendly to new people would benefit other parsers, too.

:+1: to ditching the change-log, it's another barrier to contributing.

Thanks for this write up and ideas!

Zegnat commented 4 years ago

At a high level, this project provides a set of input (HTML) and output (JSON) files for microformats parser developers to test their projects against.

The use you mention is what parser developers have grown to use this repository for, but it was not its original purpose. I think true clean-up and removal of code should only be done once we know whether the original purpose (being included within Firefox for testing) can be kept, or whether that original purpose is no longer valid.

This probably needs some sort of resolution to #74.

That aside, I agree that only having inputs and outputs for testing in a repository is the way to go!

Formatting conformance (linting of HTML and JSON)

I would be hesitant to implement any form of HTML linting. A lot of documents are valid HTML; even more are parsable by spec-compliant parsers into a valid HTML tree regardless of validity. Sometimes test cases may want to specifically test the outcome of somewhat weird-looking HTML that may not pass whatever linting you are thinking of applying.

I have no opinions on JSON linting. JSON documents are either valid or not, IIRC, so how we choose to render them is not something I have strong opinions on.
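To illustrate that rendering is separate from validity: two differently formatted documents can parse to the same data, and a formatting lint would only compare a file against one chosen rendering. This is a sketch, with two-space indentation as an arbitrary assumption and `normalize` and `is_normalized` as hypothetical names.

```python
import json


def normalize(text):
    """Re-render valid JSON with two-space indentation and a trailing newline."""
    return json.dumps(json.loads(text), indent=2, ensure_ascii=False) + "\n"


def is_normalized(text):
    """True when the text already matches the chosen rendering."""
    return text == normalize(text)


compact = '{"items":[],"rels":{}}'
spaced = '{ "items": [ ],\n  "rels": { } }'
print(json.loads(compact) == json.loads(spaced))  # True
```

Both strings are equally valid JSON; only `normalize(compact)` would pass `is_normalized`, which is a style decision rather than a correctness one.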

Anything else I haven’t specifically mentioned here I am either also neutral on, or would love to see. Big 👍 to everyone who has been breathing new life into this repository!
