syntax-tree / hast-util-sanitize

utility to sanitize hast nodes
https://unifiedjs.com
MIT License
49 stars 20 forks source link

Docs schema #28

Closed ben519 closed 11 months ago

ben519 commented 1 year ago

Initial checklist

Description of changes

TODO

github-actions[bot] commented 1 year ago

Hi! It seems some of the things asked in the template are missing? Please edit your post to fill out everything.

You won’t get any more notifications from me, but I’ll keep on updating this comment, and remove it when done!

If you need it, here’s the original template ```markdown ### Initial checklist * [ ] I read the support docs * [ ] I read the contributing guide * [ ] I agree to follow the code of conduct * [ ] I searched issues and couldn’t find anything (or linked relevant results below) * [ ] If applicable, I’ve added docs and tests ### Description of changes TODO ```

Thanks, — bb

ben519 commented 1 year ago

Appreciate the feedback. I'll try to fix these items tomorrow.

ben519 commented 1 year ago

Hey, I reviewed this a little more and I thought I'd share my thoughts..

  1. Consider this example from the current docs
// This allows `className` on all elements
const schema = deepmerge(defaultSchema, {attributes: {'*': ['className']}})

This is a little bit misleading. Yes, it allows className on all elements but only if that className does not contradict another rule in the schema. For example, h2 can only have the class name 'sr-only' due to the rule for h2 in the default schema.

I know you mentioned fixing this "bug", but personally I don't see this as a bug. In English, to me this means

Every attribute should be allowed to have the className property

which is distinctly different than saying

The className property should be allowed to have any value

Frankly I think both of these objectives should be supported. (Perhaps they already are, but I think only the first one is.)

  1. The existing schema structure is kind of awkward. For example, input: [['type', 'checkbox', 'radio']] allows type equal to 'checkbox' or 'radio'. This would be more clear with a data structure like input: { type: ['checkbox', 'radio'] } and this data structure would be easier to extend.

Let me know how you want me to proceed. I meant to spend ~10 mutes on this but I'm already a few hours in haha.

wooorm commented 11 months ago

I know you mentioned fixing this "bug"

Which bug did you think I meant?

For me this is about solving what these mean:

const a = {
  attributes: {
    '*': ['className', 'x'],
    h2: ['className', 'y']
  }
}

const b = {
  attributes: {
    '*': ['className'],
    h2: ['className', 'x']
  }
}
wooorm commented 11 months ago

Yeah so, trying to solve this, runs into two actual problems.

GH allows disabled on all elements. It’s useless but you can have a <p disabled="x"></p> and it will remain. So we allow it with any value: https://github.com/syntax-tree/hast-util-sanitize/blob/fafff8ac1073414ef7babb6063ed9bec9b2f2cb6/lib/schema.js#L92.

However, GH normally doesn’t allow inputs at all. Only if they themselves generate them from tasklists (- [ ] to do). To get as close as we can, we also only allow disabled checkboxes: https://github.com/syntax-tree/hast-util-sanitize/blob/fafff8ac1073414ef7babb6063ed9bec9b2f2cb6/lib/schema.js#L46-L49.

So this behavior conflicts: if disabled is normally allowed to have any value, your wish also affects inputs.

There’s one more case: id, which is now forced by h2 to be 'footnote-label', and generally allowed.

disabled is useless outside of input, so IMO we can drop that. And if we by default allow id already, it’s not needed to lock it down on h2. Also, maybe this was a mistake on my part because id should indeed be on headings by default and clobbering solves them

github-actions[bot] commented 11 months ago

Hi! This was closed. Team: If this was merged, please describe when this is likely to be released. Otherwise, please add one of the no/* labels.

ben519 commented 11 months ago

Hey, saw your latest commit. I'll test it out and let you know if I run into any problems. In regards to your question, I would assume the answers below

const a = {
  attributes: {
    '*': ['className', 'x'],  // every element can have a className of 'x'
    h2: ['className', 'y']  // h2 can only have a className of 'y'. This supersedes the more general rule above
  }
}

const b = {
  attributes: {
    '*': ['className'],         // every element can have a className with any value
    h2: ['className', 'x']   // h2 can only have a className of 'x'. This supersedes the more general rule above
  }
}

In other words, my assumption is that granular rules supersede (overwrite) generic rules.