Open digitalmoksha opened 5 months ago
If I remember correctly, at the time, I thought that this seemed unnecessary, since as you said the same action can be accomplished with node filters. As I look at this again, though, I could see a use case for some kind of config to simplify node addition/removal. Something like (spitballing):
remove_unsemantic: ["li", Config::TABLE_CHILD_ELEMENTS]
Since I can imagine some people would use this gem to create well-formed markup, perhaps the gem should just make it much easier to indicate whether you want parents or children of a certain node moved/replaced/deleted/etc.
Does that answer your question, or did I misunderstand?
I guess initially I was curious if you thought those conditions still needed to be sanitized. If so, it might make sense to include those in the filter, as handlers passed into Selma::Rewriter.new(sanitizer: sanitization_config)
.
Since I can imagine some people would use this gem to create well-formed markup, perhaps the gem should just make it much easier to indicate whether you want parents or children of a certain node moved/replaced/deleted/etc.
I can see the usefulness of that. Though currently my transformers are more along the lines of remove the class unless the class == 'anchor'
, or remove the id unless it's a footnote or other parser generated id
. I think I will need handlers for those.
For now I plan on using Selma directly for sanitization until I can do the much heavier lift of upgrading html-pipeline
In the original SanitizationFilter, which uses
Sanitize
, you had two transformersI don't see these in the new version based on Selma. Are these transformations no longer needed?
I assume that any transformations would need to be written as Selma handlers. I don't see a way to add those to the sanitization config, so I'm guessing these would need to be node filters.