CXuesong / MwParserFromScratch

A basic .NET Library for parsing wikitext into AST.
Apache License 2.0
18 stars 5 forks source link

Implemented interface for text-links #16

Closed FaFre closed 3 years ago

FaFre commented 3 years ago

Just added a small interface to make it more comfortable to work with text links from outside.

CXuesong commented 3 years ago

Thanks for your PR! I got your point, but when thinking about your goal twice, I'm wondering if it makes much contribution. Before going through my justifications, could you please share with me a bit with your actual usage scenario? Perhaps we can find another way to make your days better 😀

The major issue here is that

For Target property, suppose you have an arbitrary instance of ITextLink, you cannot really tell what target this Target is. Consider members of some other interfaces like IServiceProvider.GetService or ICollection.Add. They have self-contained definition, i.e. you can declare that certain class has certain trait by implementing some interface. But for ITextLink, I can see no such a trait except that "the implementation of ITextLink has a Target property that returns Run".

Then what you are proposing is basically a "Type Class" (c.f. dotnet/csharplang#110, or interface in TypeScript). Such kind of feature is useful especially because it does not require the classes to explicit implement an interface. As mentioned at the very beginning of Interfaces section of TS handbook

One of TypeScript’s core principles is that type checking focuses on the shape that values have. This is sometimes called “duck typing” or “structural subtyping”. In TypeScript, interfaces fill the role of naming these types, and are a powerful way of defining contracts within your code as well as contracts with code outside of your project. [1]

But please also note that C# originated from the concept of OOP. I suggest that for now we do not add the interface until we can see there is enough common traits that we can abstract based on, since OOP is representing objects with abstrations (classes and interfaces).

FaFre commented 3 years ago

Thank you for the detailed response! You made me overthink some stuff. I'm extracting all links from a page and store them inside a collection. To make it easier I introduced ITextLink after reading over IInlineContainer. Now, I refactored my code to handle links in a wrapper-class, which also generates the full URL's for the wiki-internal links. I reverted the changes from this PR locally, the benefits of this contribution are probably quite low as you said.

CXuesong commented 3 years ago

Sometimes the line is pretty blurred. This is why I thought it could be helpful if you can share your usage scenario.

I have been wondering all the time "how are you consuming this ITextLink.Target without knowing the type of the ITextLink implementation?" I guess you should be aware of whether the link is wikilink or external link...

Now, I refactored my code to handle links in a wrapper-class, which also generates the full URL's for the wiki-internal links.

Okay so with some C# 8 language features (basic pattern matching), you can write something like this

(string text, string target) ParseLink(InlineNode linkNode) {
    switch (someNode) {
        case WikiLink wikilink:
            return (wikilink.Text.ToString(), GetFullUrl(wikilink.Target.ToString());
        case ExternalLink externalLink:
            return (externalLink.Text.ToString(), externalLink.Target.ToString());
        default:
            throw new ArgumentException("Expect link node.", nameof(linkNode));
}

Note that I'm using InlineNode for the input parameter, which is the nearest common ancestor of WikiLink and ExternalLink. We can do nothing more than that and that's the inherent limiation of the current C# language. TypeScript supports type union so we can write something like this

parseLink(linkNode: WikiLink | ExternalLink): [text: string, target: string]

But no, not for C#. We need to wait for dotnet/csharplang#3737, instead of introducing interfaces only in order to represent such type union rather than an abstraction of certain trait. Otherwise things could get very awkward.

As for IInlineContainer, it is an abstract class at the beginning until cc4475a89badf87e4f57b1b77fb18bb3ead5ed3e. When writing InlineContainer class, I wanted something (abstract class / interface) to represent "something that contains InlineNodes", because I do need some methods that can prepend/append InlineNode (e.g. some plain text) to an existing ListItem, Heading, Paragraph, Run, etc. https://github.com/CXuesong/MwParserFromScratch/blob/5f47c6f1ab0170e34248132b230181290b8ad47d/MwParserFromScratch/Nodes/Wikitext.cs#L78-L106

In the implementation of the methods above, I care about what are in the InlineNode collection of the nodes instead of exactly what type the nodes is. It looks very similar to "duck type", but there is a self-contained definition of IInlineContainer.Inlines. Because it's simply "a collection of InlineNodes".

If your usage scenario is similar to this, introducing an interface should help. But then that interface could be IRunContainer, which does not even aware whether the Run is a Target (wikilink / external link) or something else. And it does not make much contribution, either, as there is currently no downstream requirement for such an interface.

FaFre commented 3 years ago

Yes, I also thought about the detection of the actual link-type, since the target is handled completely different. A interface would be not a clear approach here, that's also why I wrote a wrapper class, I'm much more happy with this solution now. Thanks for taking your time to explain all those things!