Closed regexident closed 3 years ago
The reason all func visit(…: …)
methods on protocol Visitor
return -> VisitorContinueKind
, regardless of whether the passed-in type actually has children
to visit, is to make the API more resilient against future changes.
@regexident What a great addition to this library! I was in such a rush to jump to HTML-rendered output and XPath queries for swift-doc
, that I neglected to appreciate the beauty of the CommonMark AST for what it was. Really inspiring stuff, and I'm a big fan of the API naming consistency with SwiftSyntax.
Is there anything left to do before this is ready for review?
I made it into a draft for now as I wanted to try getting support for subtype/override semantics, which needed me to change some stuff.
This would allow for visiting super-types, which can be super useful:
func visit(node: Node) -> …
func visit(block: Block) -> …
func visit(containerBlock: ContainerBlock) -> …
func visit(leafBlock: LeafBlock) -> …
func visit(inline: Inline) -> …
But with that one would either have to gave these higher-level visit methods have no say in controlling the walk via VisitorContinueKind
, or extend VisitorContinueKind
with an case .inherit
and implement some kind of overriding.
I am currently working on the latter as that would further more allow for really nice (i.e. need for very little boiler-plate when implementing Visitor
) support for building visitors that, e.g. with the exception of Paragraph
skip all Block
s:
final class TestVisitor: Visitor {
var numberOfTexts: Int = 0
func visit(block: Block) -> VisitorContinueKind {
return .skipChildren
}
func visit(paragraph: Paragraph) -> VisitorContinueKind {
return .visitChildren
}
func visit(text: Text) -> VisitorContinueKind {
self.numberOfTexts += 1
return .visitChildren
}
}
protocol Visitor
's default implementations of func visit(…:)
would default to .inherit
, instead of .visitChildren
, with self.defaultContinueKind
being the implicit root.
The order of precedence would be (a > b
meaning a can override b
):
T > Inline > LeafBlock > ContainerClock > Block > Node
Inline
, LeafBlock
and ContainerClock
should be disjoint sets, but given that we cannot guarantee this from a type-system perspective I defined an order that should reflect the nesting in a document, akin to "cascading" style sheets.
All of this complexity should remain fairly hidden in the package and require minimal configuration.
@regexident Just a quick heads-up: We're in the process of changing how containers are implemented (#11). I'll try to integrate that in the next day or two so that it doesn't block your work here.
We might also want to have at least a generic "end" event for Visitor
:
public protocol Visitor {
// ...
func didEndVisit()
}
So that a Visitor
could maintain an internal stack during traversal such as for inserting closing tags while building HTML from a document.
We might also want to have least add a generic "end" event for
Visitor
[...] so that aVisitor
could maintain an internal stack during traversal such as for inserting closing tags while building HTML from a document.
The way this is done in SwiftSyntax
is with visitPost(_:)
complements to visit(_:)
. So the last method that a visitor calls would be visitPost(_:)
for the Document
.
Yeah. My only concern would be that it would double the API-surface.
My thought was that if you needed info about the node post visit you could just keep a stack of nodes. That would certainly not be as efficient as having visitPost(…:)
, but due to nodes being classes should be reasonably efficient memory-wise.
That being said visitPost(…:)
would certainly be cleaner.
Anyway, I would be fine with either. I'll happily leave that decision up to you. 🙂
Done.
Lemme know what kind of visitPost(…:)
API you would prefer and I'll quickly add it. 🙂
Also, rather than merge now I'd say let's wait for https://github.com/SwiftDocOrg/CommonMark/issues/11 first, and I'll rebase once that's been merged.
Lemme know what kind of
visitPost(…:)
API you would prefer and I'll quickly add it. 🙂
While it would offer a smaller API surface area, I think using a single method for all "did visit" delegate methods has a strong likelihood of causing confusion. For example, a user may mistakenly see visit
for text elements as a specialization for visit
for inline elements, and expect only one of them to be called; so one "did visit" callback instead of two.
Given the strong example set by SwiftSyntax and the clarity of their adopted approach, I think visitPost
is a better option — in spite of its API surface area.
If you're particularly concerned about API surface area for consumers, one option might be to borrow the approach used by SwiftSyntax with its AnyVisitor
class
Also, rather than merge now I'd say let's wait for #11 first, and I'll rebase once that's been merged.
Agreed!
I just added the func visitPost
s. 🙂
With https://github.com/SwiftDocOrg/CommonMark/pull/14 closed (🙁) this PR should be ready to merge now.
@regexident Thanks for the reminder. Working to get this merged now!
Should be resolved. 🙂
Hi!
I'd love to try this, is there anything else needed to get this merged? Happy to be of any help that I can.
@mattt I don't expect you to remember 😀, but we met briefly over dinner with some folks when you gave a presentation at Airbnb! I'm now at Primer (where we almost crossed paths I believe!), and the Education team writes projects in Markdown.
I'm exploring using CommonMark to traverse the AST (for some custom components) + render these projects natively in a UICollectionView
.
Hey @mattt, is there anything else needed to get this PR merged?
Hate to bother you, @mattt, but any chance to get this merged at some point? (I have two projects where this would be tremendously helpful, but I'd rather not maintain a fork.)
@regexident Thank you for the final nudge on this and your extraordinary patience in waiting for this to get merged. I'll cut a new minor release with this (and your wonderful memory leak fixes) ASAP.
Edit: This is now live in 0.5.0
Thank you so much @mattt!
While accessing certain nodes in a document via
node.children
works alright for shallow document structures it becomes cumbersome as soon as you are interested in some more interesting queries such as "collect all links, from anywhere in the document".(Writing a visitor pattern in time of a pandemic might not have been the most responsible thing, but I did wear a :mask:, I promise.)