conda-forge / admin-requests

Add static hosted HTML page with PR builder forms #968

Open bollwyvl opened 8 months ago

bollwyvl commented 8 months ago

elevator pitch

Provide a low-barrier way to make precise, pre-validated admin requests.

motivation

After looking at the GitHub PR templates feature (suggested in #535), I was unsatisfied with the specificity of its language (as usual, not quite JSON Schema).

design ideas

So I wrote a thing that:

Here's a demo for an outrageously long schema:

https://deathbeds.github.io/jupyak/shaver.html

challenges

The downside: to get the nice UI (dropdown/autocomplete), all the feedstock names would need to be embedded in the schema, e.g.

"feedstocks": {
  "type": "array",
  "items": {
    "type": "string",
    "enum": ["aalto-boss", "a-few-others", "zziplib"]
  }
}

But this might be something that could be generated in one place...

{
  "$id": "https://conda-forge.org/schema/feedstocks.schema.json",
  "type": "string",
  "enum": ["aalto-boss", "a-few-others", "zziplib"]
}

And then referenced here:

{
  "feedstocks": {
    "type": "array",
    "items": {
      "$ref": "https://conda-forge.org/schema/feedstocks.schema.json"
    }
  }
}
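
A minimal sketch of the "generated in one place" step, assuming the feedstock names are already available as a plain list of strings. The function name `build_feedstock_schema` is hypothetical; the `$id` is the one used in the fragment above.

```python
import json


def build_feedstock_schema(names):
    """Return a reusable JSON Schema fragment listing every feedstock name."""
    return {
        "$id": "https://conda-forge.org/schema/feedstocks.schema.json",
        "type": "string",
        # Sorted and de-duplicated so that regenerated files diff cleanly.
        "enum": sorted(set(names)),
    }


# Illustrative input only; a real run would read the full list of feedstocks.
schema = build_feedstock_schema(["zziplib", "aalto-boss", "a-few-others", "zziplib"])
print(json.dumps(schema, indent=2))
```

Any schema that needs the names can then point at the `$id` with `$ref`, as in the fragment above, instead of carrying its own copy of the enum.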

implementation ideas

After the... experience... with pydantic over on conda-smithy, it seems like schema-first design (but perhaps authored in YAML) to get to a well-typed TypedDict might be easier and give strictly better validation.
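
As a rough illustration of what "schema first, TypedDict second" could look like, the sketch below derives a `TypedDict` from a tiny subset of JSON Schema. `REQUEST_SCHEMA`, its `action` field, and both helper functions are hypothetical; `required`, `$ref`, and nested objects are deliberately left out.

```python
from typing import TypedDict

# Hypothetical mapping from JSON Schema primitive types to Python types.
_PRIMITIVES = {"string": str, "integer": int, "number": float, "boolean": bool}


def python_type(schema):
    """Translate a very small subset of JSON Schema into a Python type."""
    kind = schema["type"]
    if kind == "array":
        return list[python_type(schema["items"])]
    return _PRIMITIVES[kind]


def typed_dict_from_schema(name, schema):
    """Build a TypedDict class from the top-level `properties` of a schema."""
    fields = {key: python_type(sub) for key, sub in schema["properties"].items()}
    return TypedDict(name, fields)


# Hypothetical request schema, for illustration only.
REQUEST_SCHEMA = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "feedstocks": {"type": "array", "items": {"type": "string"}},
    },
}

Request = typed_dict_from_schema("Request", REQUEST_SCHEMA)
print(Request.__annotations__)
```

The `TypedDict` only helps static type checkers; runtime validation would still come from a JSON Schema validator run against the same schema, which is what keeps the two in agreement.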

alternatives

jaimergp commented 8 months ago

If I understood correctly:

If this works, it could also be a nice workflow for conda-forge-repodata-patches-feedstock (assuming we can start with existing files).

Also, this doesn't need any changes in conda-forge.github.io either, right?

PS: We do have a single JSON with all packages and feedstocks at https://github.com/conda-forge/feedstock-outputs/tree/single-file. Would that be a good starting point for the autocompletion schema?

jaimergp commented 8 months ago

Alternatively, provided we do generate the schemas here (so they are close to the code that ingests them), we can also add a couple pages to the website so everything "JavaScript-y" is defined in the same repo. And if we go down that route for the repodata patches repo, I'd assume there are several bits of code that would be shared.

bollwyvl commented 8 months ago

Well, the nice thing about JSON schema is it's not just JavaScript-y (or GitHub-y, pydantic-y, or whatever) and pretty much gives the same results in all the implementations.

But yes, the workflow would be entirely offline, static, running in the browser, but supported by all the engineering rigour we can throw at the process.

The "end product" is using pre-existing HTML URL features, so the user would be prompted in a non-scary way to do a standard GitHub login when they click on the link to propose the file, which then immediately suggests the PR workflow.
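
A minimal sketch of building such a link, assuming GitHub's "new file" page accepts `filename` and `value` query parameters to pre-fill the editor. The branch name `main`, the path `requests/example.yml`, and the file content are placeholders, not the repository's actual layout.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def new_file_url(owner, repo, branch, filename, value):
    """Return a GitHub URL that opens the new-file editor with content pre-filled."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"filename": filename, "value": value}, quote_via=quote)
    return f"https://github.com/{owner}/{repo}/new/{branch}?{query}"


url = new_file_url(
    "conda-forge",
    "admin-requests",
    "main",                  # assumed default branch
    "requests/example.yml",  # hypothetical path
    "action: example\n",     # hypothetical, already-validated content
)
print(url)

# Round-trip check: the query string decodes back to the original values.
decoded = parse_qs(urlsplit(url).query)
```

Because the whole request travels in the URL, nothing has to be stored server-side; the form only needs to produce the validated text and the link.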

> repodata patches repo

This would be possible as well, but that generated schema is gnaaarly, and would probably take some work to clean it up for human consumption. Both pydantic and msgspec support JSON Schema, but only the parts they want to.

Of note: that repo is slooooow to work with, and might benefit from a performance-focused implementation. Indeed, going from pydantic to msgspec would probably cut the runtime of that repo in half, if not more.

> https://github.com/conda-forge/feedstock-outputs/blob/single-file/

At rest, no, that is just... some unlabeled JSON, which would require special JavaScript-y parsing to deal with effectively. If it was instead stored as a reusable, self-describing schema fragment (as described), and hosted at a generally-resolvable URL it could help in a number of places.

Of course, this falls down when one wants to use the values from the enum as the keys of a dict... schema only offers pattern, which would be a mighty regex indeed:

"$defs": {
  "a-valid-conda-forge-feedstock-name": {
    "type": "string",
    "pattern": "^(aalto-boss|a-few-more|zziplib)$"
  }
}
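
Generating that regex is mechanical. The sketch below is one hedged way to do it. Both helper names are hypothetical. It uses a hand-written escaper rather than `re.escape`, because `re.escape` also escapes `-`, and `\-` outside a character class is rejected by ECMAScript regex engines running in Unicode mode, which some JSON Schema validators use.

```python
import re

# ECMAScript syntax characters (plus "/"), the only ones escaped here.
_SYNTAX_CHARS = set("^$\\.*+?()[]{}|/")


def escape_for_schema(name):
    """Escape only the characters that are special in an ECMAScript regex."""
    return "".join("\\" + ch if ch in _SYNTAX_CHARS else ch for ch in name)


def feedstock_pattern(names):
    """Build one anchored alternation that matches exactly the given names."""
    alternatives = "|".join(escape_for_schema(n) for n in sorted(set(names)))
    return f"^({alternatives})$"


pattern = feedstock_pattern(["aalto-boss", "a-few-more", "zziplib"])
print(pattern)

# The same pattern also works with Python's re module for a quick sanity check.
print(bool(re.search(pattern, "zziplib")), bool(re.search(pattern, "zziplib-extra")))
```

Depending on the JSON Schema draft and the tooling in use, `propertyNames` (draft-06 and later) may be another way to constrain dict keys with an existing enum schema; whether a given form UI understands it is a separate question.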

As for the outputs, I would probably make that a whole separate enum schema.

bollwyvl commented 1 week ago

Of note, the package I hand-waved at earlier actually exists now: https://urljsf.rtfd.org

I've got a minimal example that shows dropdown fields derived from feedstock-outputs and the SPDX license lists:

https://urljsf.readthedocs.io/en/latest/demos/installer.html

The datalist UI isn't perfect, but 24k of anything makes most non-browser-native things fall over, so I'll take what I can get.

> assuming we can start with existing files

I presently don't have a way to dynamically load starter data by URL (or pasted in, or whatever), though I have been considering what this means in the general case... uncontrolled data sources raise the threat of e.g. XSRF, and can more easily run afoul of CORS, etc.

Also, many tools fall down on the "---"-delimited multi-YAML-document style chosen for repodata-patches unless you're big enough (like k8s) to make special cases everywhere. Going to a patches: top-level element would at least fix that, but doesn't solve "contribute a new patch given no prior knowledge".

As the GitHub PR form can create new files and their parent folders, something that would get around both of those limitations would be a conf.d-style approach, where each feedstock is a folder, with any number of {feedstock}/{some-name}.yml files within it, presumably processed in alphabetical order.
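
A minimal sketch of how such a conf.d-style tree might be walked, assuming one folder per feedstock and `.yml` files processed in alphabetical order within each. The function name `iter_patch_files` and the sample file names are hypothetical, and parsing the YAML itself is left out.

```python
import tempfile
from pathlib import Path


def iter_patch_files(root):
    """Yield (feedstock, path) pairs, alphabetically by folder and then by file."""
    for feedstock_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for patch in sorted(feedstock_dir.glob("*.yml")):
            yield feedstock_dir.name, patch


# Build a throwaway tree to show the resulting processing order.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["zziplib/10-later.yml", "zziplib/00-first.yml", "aalto-boss/fix.yml"]:
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# hypothetical patch\n")
    order = [(name, p.name) for name, p in iter_patch_files(tmp)]

print(order)
```

With this layout a new patch is always a new file, so a contributor never has to edit or even read an existing multi-document YAML file.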