2.0.0 - Githubissues

Big PR to track everything for planned changes towards 2.0.0—please leave reviews here (preferably by adding a review to the relevant code site), or make PRs targeting the 2.0.x branch if you'd like to help.

I don't plan on making any more changes here directly, rather changes can be merged into this via separate dedicated PRs.

Hey, thanks for the heads up over at the october-repo. I understand that you need reviews and honestly, I haven't looked at the PR because I don't have such a great knowledge of this repository's source, but if it helps, I'll see to managing a review around Christmas, even a short one.

Thanks so much @nathanlesage. Even a review from the prospective of a user of the API would be helpful, sans a full on code review which I realise would be quite a substantial task.

There's sort of a summary of the motivation for changes in a comment here and a second one overviewing some changes here, both these from the same thread which discussed a potential feature release 1.8.

A big problem I'm looking to solve is to draw a clear API boundary for the equivalent of "extensions" to utilise in the context of 2.0. Unfortunately a lot of extensions to Parsedown 1.x ended up depending on protected behaviour, which meant it was very hard to guarantee new features (or even bug fixes in some cases) were non-breaking even with best attempts at adding compatibility layers where internal behaviour had changed. This was made especially problematic because this way of doing things was essentially per-instructions when looking at the Wiki on making extenstions.

There are also a lot of bugfixes either pulled from that 1.8 thread, found while rewriting the library and referencing spec, adding more tests, or stumbled upon by mutation testing (e.g. I added a test to catch a specific mutation and ended up finding a bug).

Also some feature additions I'm quite pleased with, e.g. a recursion limiter which I was able to add using a Configurable, which is how I'd envision others add custom behaviour like setters or preserved parsing state. Or another example: DefinitionBook is a Configurable which inline Links can utilise to lookup link definition data (this was previously an array stored in the core Parsedown class).

Heya,

so I just spent half of the day for an intensive code review, and I'm pleased to tell you that I'm through with it! Please remember that this is my very first code review, so if I did anything wrong or if something is still missing, please tell me! 🙈 Furthermore, I'm not completely familiar with the codebase and conventions you're using, so please, if something I remark contravenes these, just ignore that! :)

Most changes concern semantic sugar for naming conventions, so it ultimately boils down to taste. Some were somewhat bigger, so first a quick rundown:

In General

I really like how the code is built conceptually, with a good AST and clean, separated functionality.
However, some naming conventions, especially for some setters and getters are a little bit confusing. In general, I feel like adding certain expressive keywords such as "has", "get", and "set" to the beginning of certain functions would help readability of the code. But I'm coming from a JavaScript/NodeJS background, where this is extremely common.
Concerning function naming conventions, I think that using continuous forms of verbs (appending instead of append) is difficult, because after all, this is very uncommon and if you call a function, it will complete running before any other expression runs; I've included specific examples below!
All Block and Inline elements reference the StateRenderable class which is neither accessed nor otherwise implemented by these, as the class is already being used upstream in the Component class, which these inherit. While I am in no way an authority on how PHP loads classes, I get the impression that we could remove these references, plus some more (I could imagine that some of the classes in the top simply were forgotten during the extensive refactoring). Plus, I spotted some other lone remnant classes that were not being used by the classes I inspected.
While I understand now your idea of implementing the RecursionLimiter as a Configurable, I have my doubts that this should be like this. Personally, if I understand the RecursionLimiter correctly, it's a bail-out option for the parser to not enter infinite loops, so I feel as if this should be a core class (because in the end its functionality also supersedes the one of other Configurables, it has a really huge priority in the call stack). But I mean as you explicitly stated that you wanted to implement it as a Configurable, I'm not persisting on that issue :)
Additionally, I saw that a lot of classes' functions do not manipulate themselves, but actually re-create themselves with the new option (especially in the State-class, see below for examples). I always have the feeling that this might increase computational burdens because it basically re-parses everything that's already correct, and I think that having two lines (one to apply the change, and one to return $this) is to be preferred over re-instantiating the same object over and over again. Or do I miss the elephant in the room with this one?

Furthermore, I ran the PHPUnit and it said that approx. 192 tests were failing, but if I remember correctly, at some point I saw someone saying that some specific tests still fail, which is fine, right? Just wanted to drop that here!

Now to some internals -- first and foremost, I did not closely inspect EVERYTHING, beause after a while, many of my remarks were recurring (especially naming conventions), so here now all my notes I took during the review, unfiltered:

In file ./Parsedown.php:

text()
- I would recommend to add an Alias "renderMarkdown" for semantic clarity. Calling "Parsedown->text()" is a little bit less expressive. In the end, what text() does is run a full cycle from the input to the final HTML output. Maybe even toHTML as an Alias …? (This is only something I personally wanted to have with Parsedown for years now, so it's not that important, especially given the fact that we need the text-method to persist for backwards-compatibility :D)

In file ./State.php:

state()
- Recommend removing this function, both because it is a State object and to expect $state->state() to only return $state seems a little bit counterintuitive. Furthermore, what does one need a StateBearer-class for, other than encapsulating a State-object? I assume, by removing the state()-function one would need to also remove that class, right?
setting()
- This function name also sounds a little bit inexpressive. If I understand the function correctly, it adds one Configurable to the State, so wouldn't it be much easier to understand if you call State->addConfigurable($myConf)?
- Additionally, wouldn't it be computationally easier to simply array_merge the new configurable to the already existing state array ...?
mergingWith()
- I would strongly recommend to rename it to "mergeWith" (this pattern also exists in other files, and it sounds really weird to me).
- Furthermore, as well as the setting()-function: Wouldn't it be easier to simply array_merge the two states, as they after all are only key-value-arrays of Configurables?

In file ./Parsing/Lines.php:

__construct()
- Same as with the Context::__construct() -- is it necessary to both save the trailing line break characters and their count? I see that this will enable Parsedown to fully reconstruct the input Markdown, but as whitespace on a line has no meaning in the context of Markdown, my neurotic self somehow wants this gone...
fromTextLines()
- The variable $sequentialLines should be renamed to $preceedingEmptyLines (or similar) for clarity
none()
- Maybe rename to createEmpty() for clarification?
isEmpty()
- Wouldn't it be computation-wise better to check for the boolean var "containsBlankLines"?
Contexts()
- Maybe rename to "getContexts"? (But I'm really not sure if getters/setters are semantically used throughout the code! I just stumbled upon the uppercase first letter.)
trailingBlankLines()
- Same (with "has") ...?
appendingBlankLines()
- Strongly recommend to rename it to "append" instead of "appending".
- Signature mismatch: It does not return $this, but a new, cloned Lines-object.
- Line 136: It assumes that the newly created $NextLines object does in fact contain blank lines, but I would recommend to copy the $NextLines->containsBlankLines-property over.
- Also, I have the question why the appending-functions do not append the lines to themselves, and instead return a new Lines-object ...? On a practical level, the only time I would like to append the lines to the already existing object; or is there a use-case I'm missing here?
appendingTextLines()
- Strongly recommend to rename it to "append" instead of "appending".
appendingContext()
- Strongly recommend to rename it to "append" instead of "appending".
- Recommend to simplify the if-statement to set the containsBlankLines with an inline-expression
- The resetting of the trailing blank lines seems a little bit counterintuitive to me, as this does not happen when you append TextLines (which, strictly speaking, are the same with a little bit less information), so if I append TextLines, the trailing blank lines are preserved, while when I add a context, they are omitted.

In file ./Parsing/Context.php:

__construct()
- Is it really necessary to also save the $previousEmptyLinesText? As far as I am concerned, it seems that it can only contain linebreaks, which are more efficiently saved using the count variable.
- Maybe exchange the "previous" in the variable and function names with "preceeding" (or semantically similar), because the functions and variables will only be accessed in context of a Context object, so $myContext->preceedingEmptyLines makes more sense then $myContext->previousEmptyLines (the latter might refer to all previous blank lines, even those from former contexts). But I agree this is maybe borderline hair splitting on my part!

In file ./Configurables/RecursionLimiter.php

increment()
- I've mentioned this before, but is the benefit of only having one line worth it to always re-create objects in memory, or wouldn't it suffice to actually only increment the pointer?

Finally, I tried to set up a fresh install of OctoberCMS, but since Catalina, it won't install on my machine anymore, so I couldn't really try it out :( Also I didn't take notes on all of the classes, but I've had a thorough look at them, and could not spot anything else than other instances of what I've just mentioned with these core classes!

I hope that this information helps, and I would like to apologise for the wall of text!

I have to agree to multiple of @nathanlesag and I strongly agree with changing text() to toHTML() in Parsedown.php

And a little wish from me, if you make these changes would you please write back, so I can begin the development of my plugins to work with 2.0 when you release it

Hi @nathanlesage,

Apologies for not getting to this sooner—time's been a bit scarce recently and I wanted to be able to give your review a proper run though. Thank you so much for taking the time, and going to the level of detail you have!

I'll structure my reply similarly to how you've left your review, to make navigation of the individual points I'm replying to a bit easier :)

In General

Concerning function naming conventions, I think that using continuous forms of verbs (appending instead of append) is difficult, because after all, this is very uncommon and if you call a function, it will complete running before any other expression runs; I've included specific examples below!

This is something I've picked up from Swift, so is sort of intentional:

https://swift.org/documentation/api-design-guidelines/#name-according-to-side-effects

Name functions and methods according to their side-effects

[...] When the operation is naturally described by a verb, use the verb’s imperative for the mutating method and apply the “ed” or “ing” suffix to name its nonmutating counterpart.

Mutating Nonmutating

x.sort() z = x.sorted()

x.append(y) z = x.appending(y)

Mutating	Nonmutating
x.sort()	z = x.sorted()
x.append(y)	z = x.appending(y)

The idea here is to form a distinction between whether a change occurs in-place (mutating) or whether it occurs only in the returned result of the function being called (non-mutating). Most things in the 2.0 branch are immutable (I think I've made an exception for result caching), and so perhaps this distinction could be dropped. I do quite like the idea of giving a hint as to the side-effects of a method through its naming though (especially absent any language-level aids for gauging this).

However, some naming conventions, especially for some setters and getters are a little bit confusing. In general, I feel like adding certain expressive keywords such as "has", "get", and "set" to the beginning of certain functions would help readability of the code. But I'm coming from a JavaScript/NodeJS background, where this is extremely common.

There shouldn't be any (mutating) setter functions in the codebase, and so I think we'd end up in a situation where almost every property associated method is prefixed with "get"? Perhaps I'm missing something here though, is there some ambiguity about what $Image->url() would be expected to do over $Image->getUrl()? In the situation where there are arguments being passed I would tend to agree with adding more detail, but when there are no arguments and the function corresponds directly to the property name, is this actually clearer? In an idea world, properties could be marked as non-mutable and it would be possible to make them public directly without throwing side-effects to the wind :)

Did you find any examples missing a "has"? I'd definitely be interested in fixing those!

All Block and Inline elements reference the StateRenderable class which is neither accessed nor otherwise implemented by these, as the class is already being used upstream in the Component class, which these inherit. While I am in no way an authority on how PHP loads classes, I get the impression that we could remove these references, plus some more (I could imagine that some of the classes in the top simply were forgotten during the extensive refactoring). Plus, I spotted some other lone remnant classes that were not being used by the classes I inspected.

I think you're referring to the use statement at the top of the files? You're correct here, I have a fixer running meant to remove stuff like this but I think it has avoided removing that line due to the function stateRenderable sharing the name :)

That said, I'm thinking it might be a good idea to push the return types back up to being StateRenderable, rather than the specific thing being returned. The idea here is that this allows the return type to be to be changed without breaking the API (due to an unnecessarily specific return type being declared).

While I understand now your idea of implementing the RecursionLimiter as a Configurable, I have my doubts that this should be like this. Personally, if I understand the RecursionLimiter correctly, it's a bail-out option for the parser to not enter infinite loops, so I feel as if this should be a core class (because in the end its functionality also supersedes the one of other Configurables, it has a really huge priority in the call stack). But I mean as you explicitly stated that you wanted to implement it as a Configurable, I'm not persisting on that issue :)

The idea with implementing it as a configurable is that it uses the "condoned" method of preserving parsing state. Rather than accessing special "magic" properties in the core class (or worse, global ones), everything now has the ability to add and use configurables as they please (and in a way where mutation of those configurables has a clear scope).

Additionally, I saw that a lot of classes' functions do not manipulate themselves, but actually re-create themselves with the new option (especially in the State-class, see below for examples). I always have the feeling that this might increase computational burdens because it basically re-parses everything that's already correct, and I think that having two lines (one to apply the change, and one to return $this) is to be preferred over re-instantiating the same object over and over again. Or do I miss the elephant in the room with this one?

The elephant in the room here is mutability (and localisation of it) :) By returning a new version, the caller always has a clear copy of both the original and the modified copy (with no unexpected surprises). If the objects were to mutate themselves, it's not always obvious when a change has occurred (and to the extent of the scope). Without immutable objects, and because objects are passed around quite a lot, there might be situations where context A passes an object to context B, which modifies it for its own working purposes. But because objects in PHP are passed by handle, this also modifies context A's copy which might result in unexpected behaviour.

Furthermore, I ran the PHPUnit and it said that approx. 192 tests were failing, but if I remember correctly, at some point I saw someone saying that some specific tests still fail, which is fine, right? Just wanted to drop that here!

Hm, that is odd actually—I'm not getting any failing tests. There is a mode where we run against all CommonMark tests (of which some fail due to not being fully compliant), we then cache and commit the passing ones to prevent regression. Are you using the version of phpunit download by the project's composer settings (vendor/bin/phpunit), or are you using a different one?

In file ./Parsedown.php:

I would recommend to add an Alias "renderMarkdown" for semantic clarity. Calling "Parsedown->text()" is a little bit less expressive. In the end, what text() does is run a full cycle from the input to the final HTML output. Maybe even toHTML as an Alias …? (This is only something I personally wanted to have with Parsedown for years now, so it's not that important, especially given the fact that we need the text-method to persist for backwards-compatibility :D)

(a72455c) I like this idea, text is a bit unclear. I think I prefer toHTML (or maybe toHtml?) to renderMarkdown (maybe I'm just being a lazy typist here :) ).

In file ./State.php:

state()

Recommend removing this function, both because it is a State object and to expect $state->state() to only return $state seems a little bit counterintuitive. Furthermore, what does one need a StateBearer-class for, other than encapsulating a State-object? I assume, by removing the state()-function one would need to also remove that class, right?

This is due to the StateBearer interface (I might rename this). This would be the interface that I'd envision today's "extensions" implementing, in order to arrive at a sort of packaged initial state for parsing (which contains, among other things, which type of parsing will be done). Because there's an obvious way for State to directly implement this (return itself) I thought it might be nice to be able to pass that directly into Parsedown's constructor too (instead of being forced to create a wrapper class).

setting()

This function name also sounds a little bit inexpressive. If I understand the function correctly, it adds one Configurable to the State, so wouldn't it be much easier to understand if you call State->addConfigurable($myConf)?

Maybe addConfigurable might be a better name here (though it doesn't quite form an additive step). State is essentially a localised singleton store. So while the objects it store's aren't singletons, there is guaranteed to be one (and only one) of each one in the store. In this sense it sort of "sets" rather than "adds", but we can probably do better than setting.

Additionally, wouldn't it be computationally easier to simply array_merge the new configurable to the already existing state array ...?

Computationally easier yes, but (especially with this object) a significant loss to stability. State being immutable allows its mutation to be localised without deep-cloning (and mutations that don't propagate everywhere unpredictably are quite important in a few areas of the code).

e.g. the removal of autolinking Urls from subparsing (and only subparsing) after parsing a link https://github.com/erusev/parsedown/blob/0a6408043f052f989a2be1c6dd75edefadbe18b2/src/Components/Inlines/Link.php#L135-L137

In file ./Parsing/Lines.php:

__construct()

Same as with the Context::__construct() -- is it necessary to both save the trailing line break characters and their count? I see that this will enable Parsedown to fully reconstruct the input Markdown, but as whitespace on a line has no meaning in the context of Markdown, my neurotic self somehow wants this gone...

Unfortunately hanging onto the actual break characters is necessary. Computing the count in the constructor is done for the sake of avoiding the recompilation if the count is called multiple times (though this can probably be optimised to be a cached result after first read in Context (8d09320), in Lines this value is already used in the constructor). When outputting fenced (or indented) code blocks the exact characters used on "empty" (whitespace only) lines ends up being important for CommonMark purposes.

fromTextLines()

The variable $sequentialLines should be renamed to $preceedingEmptyLines (or similar) for clarity

$preceedingEmptyLines makes sense for the body of the loop, but its final value is actually the number of trailing empty lines (when it is used to construct the final Lines object). $sequentialLines is sort of an accumulator of observed empty lines at the current point in the parse, though $sequentialEmptyLines is probably a better name than $sequentialLines (39b8b04).

none()

Maybe rename to createEmpty() for clarification?

We can probably improve on none :) I'm wondering if createEmpty might give the impression of creating an empty line though? The objective here is for zero lines to be present.

isEmpty()

Wouldn't it be computation-wise better to check for the boolean var "containsBlankLines"?

Would this be opposed to the && $this->trailingBlankLines === 0 part of the check? I suppose this could be marginally faster than the comparison to zero? At the very least I could reorder that to be first too, so count is short-circuited in the case that blank lines are present (525b723).

Contexts()

Maybe rename to "getContexts"? (But I'm really not sure if getters/setters are semantically used throughout the code! I just stumbled upon the uppercase first letter.)

(7610eac) Upper first letter is a typo here :) If we decide to use get prefix throughout for getters then I'll add that as part of a bulk refactor.

trailingBlankLines()

Same (with "has") ...?

The method returns the number of blank lines, though it looks like this isn't used publicly anywhere anymore (so can possibly remove this method).

appendingBlankLines()

Strongly recommend to rename it to "append" instead of "appending".

Signature mismatch: It does not return $this, but a new, cloned Lines-object.

The return type here is self, not $this?

Line 136: It assumes that the newly created $NextLines object does in fact contain blank lines, but I would recommend to copy the $NextLines->containsBlankLines-property over.

Because \count($NextLines->Contexts) === 0, the $NextLines must contain at least one blank line because it was built from text (shortest example is that it was the empty string), and so $Lines->containsBlankLines = $Lines->containsBlankLines || $NextLines->containsBlankLines is always equivalent to $Lines->containsBlankLines = true because $NextLines->containsBlankLines === true. It used to be similar to this, but was changed due to the surrounding if statement: https://github.com/erusev/parsedown/commit/a396fccace3dddf06fb21561566700c279162d09 (actually found this one through mutation testing, and was initially a bit confused when I saw the assignment could be mutated without detection by the test suite).

Also, I have the question why the appending-functions do not append the lines to themselves, and instead return a new Lines-object ...? On a practical level, the only time I would like to append the lines to the already existing object; or is there a use-case I'm missing here?

Again this is to do with mutability. To give an example, Parsedown itself uses a Lines object. If Parsedown were to hand this down to a particular block for parsing, and the block appended some lines to it for its own working purposes, Parsedown's copy would also be modified. For this reason immutable objects are used everywhere, so that copies only need to be made when changes are made (and not every time you pass an object to a new context). For this reason mutations (through redefining of a variable) are localised to small function/object scoped contexts, rather than propagating in unexpected ways.

appendingContext()

Strongly recommend to rename it to "append" instead of "appending".

Recommend to simplify the if-statement to set the containsBlankLines with an inline-expression

This is subtle, but not equivalent to the condition in the if on its own. If $Lines already contains blank lines, but the Context being appended doesn't have any preceding blank lines, then an inline statement $Lines->containsBlankLines = ($Context->previousEmptyLines() > 0) would change the true to false, even though no blank lines were removed.

Best we could do with an inline expression would be $Lines->containsBlankLines = ($Lines->containsBlankLines || $Context->previousEmptyLines() > 0), which is quite a long line. Do you think it is still worth inlining at this stage? Potentially it is actually slightly clearer because it states both criteria explicitly?

The resetting of the trailing blank lines seems a little bit counterintuitive to me, as this does not happen when you append TextLines (which, strictly speaking, are the same with a little bit less information), so if I append TextLines, the trailing blank lines are preserved, while when I add a context, they are omitted.

When you append text lines, these might contain multiple Contexts (a Context is a single non-whitespace only line, paired with any whitespace only lines that occurred before it). For this reason, when appending text lines, a new Lines object is used to parse this block of text, which also parses any associated trailing blank lines at the end (this information is copied across when merging with the current Lines).

When you append a Context it, by definition, doesn't have any information about trailing lines. For this reason, any trailing lines that were in place are prepended to the lines in the Context being added, but because the Context can't itself contain information about any lines after itself (and it is now the last Context), there are no blank lines at the end (hence the reset).

In file ./Parsing/Context.php:

__construct()

Is it really necessary to also save the $previousEmptyLinesText? As far as I am concerned, it seems that it can only contain line breaks, which are more efficiently saved using the count variable.

Yes, if some Contexts are within a code block, it is important to preserve the exact text which was present. Following on from your previous suggestion, I've delayed (8d09320) actually computing the count until it is actually used (at which point it is cached).

Maybe exchange the "previous" in the variable and function names with "preceding" (or semantically similar), because the functions and variables will only be accessed in context of a Context object, so $myContext->preceedingEmptyLines makes more sense then $myContext->previousEmptyLines (the latter might refer to all previous blank lines, even those from former contexts). But I agree this is maybe borderline hair splitting on my part!

This is a nice semantic clarification (a6c17f4) :) This aligns closer with the "trailingBlankLines" wording used in Lines too (i.e. it doesn't read "nextEmptyLines").

In file ./Configurables/RecursionLimiter.php

increment()

I've mentioned this before, but is the benefit of only having one line worth it to always re-create objects in memory, or wouldn't it suffice to actually only increment the pointer?

This is actually a great example of where immutability saves us from an unexpected failure :) If I make the following change:

  /** @return self */
  public function increment()
  {
+     $this->currentDepth++;
+     return $this;
-     // return new self($this->maxDepth, $this->currentDepth + 1);
  }

then we end up with 543 test failures :p

This is because we end up sharing a single recursion limiter across different parsing depths, and the recursion limiter is incremented each time we increase the depth.

Imagine we have two blocks, each with their own content that needs parsing. After we parse the blocks (parse level 0), we increment the recursion limiter when we enter parsing of the contents of the first block. When we parse the contents of the second block, we also increment the recursion limiter. Because we're referring to the same recursion limiter each time, we end up incrementing it twice, even through we only went one level deep. If we had 100 blocks, this very quickly diverges from the value it should be. By having an immutable recursion limiter, we only increment the copy that we're passing down to "deeper" parsing layers, and so the value is incremented once (and only once) per parsing layer (as expected).

[...]I didn't take notes on all of the classes, but I've had a thorough look at them, and could not spot anything else than other instances of what I've just mentioned with these core classes!

I hope that this information helps, and I would like to apologise for the wall of text!

Thank you again for taking the time to look through! I've made some commits (noted inline) to address what I can immediately from your feedback. Please feel free to argue with any of my responses :) In some cases I've tried to clarify the reasoning for why something has been done the way it has, rather than making a change straight way. If you still think a change is worth making in these cases, we can definitely look at doing that though! :)

Hey, whoa thanks for the extensive answer! :D Here's an also-long answer, but I tried to boil it down :)

(TLDR at the end)

The elephant in the room here is mutability (and localisation of it)

TADAAAAAA 🎉

In fact, I think that a lot of my complaints have indeed to do with mutation/immutability. I have to confess that I'm extremely cautious of having more than one copy of a certain big object (such as a full parsing tree) in memory, but I think this comes both from my times as a young child when we had like 8 megabytes of RAM, and all literature on programming I read told me "OMG DON'T ADD THIS VAR IF YOU DON'T NEED TO!!" which somehow burned itself into my head. Also, in JavaScript contexts, immutability is close to impossible so one doesn't even bother to try :D (If you want to see how to return minimal object copies, here you go)

This explains a lot of the stuff, and so even the appending-form makes sense. Albeit, I'd argue that "appended" (or even better "withAppended," so that you'd arrive at something like "withAppendedContext" to hurt your keyboard extra, hehehe) would be better (because then you have the semantic clarity of immutability without implying the function is always running), but again: borderline hair splitting :D

Concerning the … *looks inside the original post* … ah, assumption of lines in the Lines.php-class … if I understand this line correctly, wouldn't it make more sense to simply check if $Lines->trailingBlankLines > 0…? I am sure this breaks things, so please don't take my word for it, I am just also insecure concerning assumptions!

Do you think it is still worth inlining at this stage? Potentially it is actually slightly clearer because it states both criteria explicitly?

Well, do you really need $Lines->containsBlankLines = ($Lines->containsBlankLines || $Context->previousEmptyLines() > 0)? … aaaah nononono forget that. You're right, I just realised that. It's late :D

then we end up with 543 test failures :p

… well, it would not consist of simply leaving the RecursionLimiter that way, but to decrease the limit again once you return to level 0 :P You would need to add a decrement each time, hehehe

My only concern was that if you have multiple copied state objects, then you might end up with something like 2048 nested objects, and I don't really know if this really is necessary…? I only think that if you reach more than 256 nested elements in Markdown, you both deserve a prize and should do the walk of shame, right? 🤷🏼‍♂️ But again, this has to do with my tendency to double- and triple-check everything because I can't trust any computer or user …

TLDR

In the end, what caused 99 percent of my confusion was immutability, what I completely did not have in mind — here you can see how much programming paradigms (and even single programming languages with their specific behaviours) can influence your thinking!

I trust you with all of the stuff, because once I mentioned it you had a second look at the (potential) issues, and if it still looked good for you, this means that it is really good! I really appreciate all your hard work, because if I understand it correctly, you've done the lion's share of the work yourself, and I think you should be proud of what you've done here. I know myself how hard it is to do something solely in your free time, so please, if you are happy with what you have done, proceed :) My aim was to pose questions (most of which were not really valid, because of the immutability-fallacy I fell for) just to have a second look, and as I already mentioned: I did not spot any obvious mistakes in the logic, and I think if you think it's ready to go out, then go for it! :)

Hope all of that helps, and thanks again! Have a good night!

P.S.: Concerning the PHPUnit: Yes, I simply ran a composer install, so it should've installed the correct version; but as I never used PHPUnit before (I like to live dangerously, or, in other words: for long I had no idea what testing was good for …), it might very well be an error on my side! I did run all tests, including the CommonMarks, this is why I was mentioning it!

P.P.S.: Yes, toHtml is also good, as long as there is such an alias :)

Hey @aidantwoods — is there anything new on this issue? :)

@erusev Any thoughts on the direction of the changes in this PR? I'd like to bring a branch of Parsedown-Extra in line with this codebase, get this ready for merge soon—this has probably been left alone long enough :)

The general idea of these changes is to compartmentalise functionality, allow the source code the be analysed for type errors (using vimeo/pslam), and to formalise a lot of functionality that is loosely "supported" for extensions currently, but not easy to work with (especially when changing internal behaviour, which currently carries a high risk of breaking extensions).

To avoid breakage, extensions are now encouraged to utilise a defined API for accessing parsed metadata, and do not need to rely on extracting information from the exact HTML structure being generated (which is subject to change at any time—e.g. to fix bugs).

The changes here also bring some additional features not before possible, for example working with multiple extensions which do not need to know of one-another's existence, user composition of individual elements from many extensions, and the ability for these many extensions to define their own user configurable options or preserve parsing state.

No major changes have been made to the core parsing strategy (the general approach to building blocks and inlines), changes are focused on formalising previous concepts and allowing them to be more readily reusable.

@aidantwoods Looks good! Thanks for the thorough explanation!

Just out of interest, how long do you estimate you are with version 2.0. Is it too soon to begin the development of extensions?

@BenjaminHoegh I've let this linger quite a while, haven't I... 😄

By all means feel free to work on an extension based on these changes. Any feedback you have would actually be really helpful (e.g. if you run into some issue with the new codebase that prevents you doing what you'd like, would be really good to know that so I can make changes if necessary).

Just to track it here, if memory serves I think I had the following on my list of things to do before this could be released:

[x] Update Parsedown Extra to work with this codebase
[ ] Write some documentation about working with the new codebase for extension authors (especially with regard to the State object, and making custom "configurables"

@BenjaminHoegh I've converted ParsedownExtra to be compatible with this branch over at https://github.com/erusev/parsedown-extra/pull/172.

Currently, the plan would be for "extensions" in the new world to implement the StateBearer protocol (they don't have to, but it helps with composability).

Using Parsedown with ParsedownExtra would look something like:

$Parsedown = new Parsedown(new ParsedownExtra);
$actualMarkup = $Parsedown->toHtml($markdown);

This would allow multiple extensions to be composed together, so that if there was another extension, say, ParsedownMath, then this could be composed with ParsedownExtra in whichever order the user prefers.

E.g.

$Parsedown = new Parsedown(ParsedownExtra::from(ParsedownMath::from(new State)));

$Parsedown = new Parsedown(ParsedownMath::from(ParsedownExtra::from(new State)));

Provided all extensions implement the StateBearer protocol, then these can be composed indefinitely (in practice, though, this could result in odd behaviours of course).

A State object incorporates Blocks, Inlines and some additional render steps, and any custom configuration options that the user might want to set. This can fully control how a document is parsed and rendered.

When writing the docs for new extensions to follow, I think it is a good idea to encourage extensions to document the state modifications they do so that a user can add these manually if they so wish. This, ultimately, will allow a user fairly granular control to try to make extensions interoperable (especially if extensions are large and contain quite a few behaviours).

For instance, if a user only wanted the "abbreviations" feature from ParsedownExtra and nothing else, then they could manually construct:

$State = new State;

$BlockTypes = $State->get(BlockTypes::class)
    ->addingMarkedLowPrecedence('*', [Abbreviation::class])
;

$RenderStack = $State->get(RenderStack::class)
    ->push(Abbreviations::expandAbbreviations())
;

$State = $State
    ->setting($BlockTypes)
    ->setting($RenderStack)
;

$Parsedown = new Parsedown($State);
$actualMarkup = $Parsedown->toHtml($markdown);

In the case of ParsedownExtra though, I've made each feature its own StateBearer, so one can just write:

use Erusev\ParsedownExtra\Features\Abbreviations;

$State = Abbreviations::from(new State);

$Parsedown = new Parsedown($State);
$actualMarkup = $Parsedown->toHtml($markdown);

This IMO, would be the ideal pattern for extensions to use: which allows very simple granular composability by the user, allowing them to pick just one feature, or the entire package of features with the extension as they wish.

@aidantwoods is there a way to manipulate with TList without having to rewrite it?

There isn't. This is to make sure that mutations are localised and don't end up being in places where they weren't intended.

As a plus to this approach, this means that the Lis that make up the TList can all be passed by reference if they are unmodified without needing to clone. i.e. different TLists can safely reference the same underlying Lis because they aren't going to change unexpectedly.

This allows things like copying a TList to be done with a simple assignment, and allows those copies to be modified (actually overwritten) without risk of modifying another copy.

Depending on what you're trying to do, could it be made easier if there was a constructor or more getters made available for TList?

Depending on what you're trying to do, could it be made easier if there was a constructor or more getters made available for TList?

Just trying to make tasks using the - [] my task

erusev / parsedown

2.0.0 #708

In General

In file ./Parsedown.php:

In file ./State.php:

In file ./Parsing/Lines.php:

In file ./Parsing/Context.php:

In file ./Configurables/RecursionLimiter.php

In General

Name functions and methods according to their side-effects

In file ./Parsedown.php:

In file ./State.php:

In file ./Parsing/Lines.php:

In file ./Parsing/Context.php:

In file ./Configurables/RecursionLimiter.php

TLDR