New, unified and future-proof tag syntax for Gollum

gollum / gollum

A simple, Git-powered wiki with a local frontend and support for many kinds of markup and content.

MIT License

13.8k stars 1.58k forks source link

New, unified and future-proof tag syntax for Gollum #993

Closed SkyCrawl closed 8 years ago

SkyCrawl commented 9 years ago

@dometto @bartkamphorst @sunny Hey guys,

just delved into refactoring/improving the filter chain code and I bumped into another issue which, this time, probably requires your consent/cooperation :). I believe it's time to devise a new "markup" for Gollum's own features.

The problems

Lua's double bracket syntax. See here.
MediaWiki's syntax. Gollum's tag syntax was probably taken from MediaWiki and adjusted not to collide too much. I see how this was a good idea and you really did manage to make the tags' syntax sane but I think there are still 1 or 2 conflicts and I think it would be better to devise our own syntax, especially because the number of features is steadily growing. EDIT: Gollum's tags also don't seem to be rendered for AsciiDoc, by design...
Aggregation of tags. The current implementation of filter_tag.rb is a bit of a mess and I'd like to separate it into many individual filters because with the new approach I'm implementing, everyone using the gollum-lib gem will be able to customize the order of filters (among being able to customize a lot of other things too). This allows very nice per-user flexibility, should something related go unexpectedly wrong. I know this introduces quite a bit of slowdown when rendering pages but overall, it shouldn't exceed the speedup gained by the refactorizations I'm making. Furthermore, this is already a big step in my longterm plans - filter chain performance can be increased quite a bit if we implement a state-machine matching patterns "in parallel" and calling hooks when a pattern is matched. With this facility, we can render all of Gollum's markup with a SINGLE traversal of the source markup as opposed to how things are at the moment - it is traversed like 15-20 times... It shouldn't even be too complicated if we design a nice UNIFIED syntax for Gollum's own features and forbid nested feature tags :). But we can only do this if we also forsake "contextual" tags and tags that depend on expansion of other tags. Currently, there are no such tags, with the exception of TOC which will likely be eliminated in the new version of gollum-lib due to being obsolete.
I'd like to strictly define which tags can be "inline" and which must be defined as "block". For example, page link is a typical "inline" tag whereas table of contents (despite I plan to remove this particular tag because it is not needed anymore) is a typical "block" tag since it has no sense to embed TOC in the middle of a sentence...
Ununified syntax.
- Currently, page link tags start with "text placeholder" whereas image tags start with links.
- WSD, UML, Macros, [[include:]]... all of this COULD look very similar :). Macros can EASILY turn into filters and like I said above, users will be able to completely customize the order of filters in the version of gollum-lib I'm making...
- Several cases (probably corner cases) are not distinguished properly... for example, whether I want to link an image or embed an image, without any further specifications like alt etc.
I'd like to implement more <code> filters: one that embeds GitHub gists, one that embeds files from GitHub. Currently, an attempt for the latter is already implemented but it is broken :).
The solutions

All of the above problems, in my opinion, call for a new, unified and unique syntax for Gollum's own features that would, hopefully, be future-proof. Also, I think macros should be turned into filters because I don't see a real difference :).

That said, I'd like to start a discussion for the following observation: Tags to link pages, files and embed images may not be necessary... many supported markups provide their own syntax and facilities. Also, I think Gollum should primarily focus on "added value" to all markups, rather than "unifying them". This is just a thought... I'm in no way against keeping the tags :).

A new syntax, from the top of my head, probably quite a wrong one:

[Gollum[:include(page-link)]

[Gollum[:wsd(style)
   ... code
]
[Gollum[:uml(args?)
   ... code
]

[Gollum[:gist(USER/GIST)]

[Gollum[:rawgit(URL)]

Feel free to praise/criticize/comment on anything said or suggest your own new syntax :). The only drawback that I can see in the whole idea of a new syntax is backward compatibility and it is certainly a very valid/serious point. On the other hand, the versions of gollum and gollum-lib I'm trying to make span a lot of various fixes/improvements and their backward compatibility is already not absolute...

dometto commented 9 years ago

Hi @SkyCrawl! Good suggestions, thanks for the work on this!

Aggregation of tags. The current implementation of filter_tag.rb is a bit of a mess and I'd like to separate it into many individual filters because with the new approach I'm implementing, everyone using the gollum-lib gem will be able to customize the order of filters (among being able to customize a lot of other things too)

The tag filtering is indeed a mess, as you said -- I noticed this looking at the performance problems you mentioned in #839. So all for cleaning that up! :+1: Notice that the filter chain ordering is already configurable (although I don't think the front-end has a user-friendly way to take advantage of this possibility yet).

Tags to link pages, files and embed images may not be necessary... many supported markups provide their own syntax and facilities. Also, I think Gollum should primarily focus on "added value" to all markups, rather than "unifying them"

I largely agree, but I think there should still be gollum-specific tags for linking to wiki-internal files and pages. Some supported formats don't have any way of defining internal links, after all, and it's a nice feature that users don't have to write HTML in order to link to other parts of the wiki.

One thing we should consider is the status of code blocks. Gollum currently implements it's own filtering of tilde-style and three-backtick style codeblocks in all markups. A case could be made to just leave the definition of codeblocks to the markup rendering gems.

filter chain performance can be increased quite a bit if we implement a state-machine matching patterns "in parallel" and calling hooks when a pattern is matched.

I suppose the idea is to implement some sort of simple DSL that allows users to define tags and associated behavior? That would be really nice, and would indeed obviate the need for macro's as seperate creatures. :+1: Or even just allowing the user to configure a pattern and a Proc that is called when the pattern is matched.

I see no problems with the syntax you're proposing, but maybe somebody else is creative enough to come up with an even better style? :-)

nlowe commented 9 years ago

So, @SkyCrawl, going forward, would Macros need to be turned into filters?

SkyCrawl commented 9 years ago

@techwiz24 I think they should, along with devising the new markup. I'm only waiting for consent to do it :).

SkyCrawl commented 9 years ago

One thing we should consider is the status of code blocks. Gollum currently implements it's own filtering of tilde-style and three-backtick style codeblocks in all markups. A case could be made to just leave the definition of codeblocks to the markup rendering gems.

Yes, I noticed that too. It's probably a good idea to have a Gollum tag for inline code (for the same reasons you pointed out about keeping the link tags) but I see no advantage of "preprocessing" this tag for github-markup to render - I say we render it right away :). Users will probably use the source markup's own syntax for inline code most of the time anyway, e.g. backtick for markdown and triple-backtick for github flavored markdown.

I suppose the idea is to implement some sort of simple DSL that allows users to define tags and associated behavior? That would be really nice, and would indeed obviate the need for macro's as seperate creatures. :+1: Or even just allowing the user to configure a pattern and a Proc that is called when the pattern is matched.

Yup :). Filters are, at the moment, a little more abstract than macros but eventually:

metadata filter should always be executed first because other filters may be using this information,
tag filters and macros should be executed next,
render filter should always be executed pre-last, with TOC following it (or the plaintext filter instead of both),
sanitize filter should always be executed last.

I think this top-level order should be fixed and only the second point should be fully customizable. I plan to transform the other filters into inline code that is customizable only partially. That way, we will truly only have filters that define a "tag" and associated behaviour, effectively "unifying" filters and macros :).

I see no problems with the syntax you're proposing, but maybe somebody else is creative enough to come up with an even better style? :-)

Sadly, I do now. The ] "end-tag" is quite a problem... how do we reliably recognize the end of the tag? I suppose the syntax must be changed a bit yet :). Also, we should probably consider, how to parse the tags nicely, including the possibility of having "arguments" (perhaps named?) mixed with "boolean flags".

dometto commented 9 years ago

tag filters and macros should be executed next, render filter should always be executed pre-last

Would executing tags before rendering not have unwanted results, such as parsing gollum-like tags in code blocks?

nlowe commented 9 years ago

While you're on the topic of reworking macros and other syntax, perhaps you could give some consideration to gollum/gollum-lib#125? I'm not quite sure if your current proposal allows for nested macros or not. Just something to think about!

SkyCrawl commented 9 years ago

@techwiz24 The current version of Gollum doesn't support nested macros and the new versions I'm making will most likely not support them either. But this doesn't mean that Gollum can not support them :). If we implement the state-machine I was talking about earlier for performance reasons, we can either make it simple (in-place matching and replacement of patterns that doesn't support nested macros) or stack-based (that DOES support nested "macros" and even "variables" if we want to!). It's quite a bit of work, however, and a very good filter/macro syntax would probably be essential!

@dometto Well, I haven't tested it but looking at the code, the current version of Gollum renders tags anywhere on the page, even in code blocks :). That's why the original programmers added the possibility to escape Gollum's tags with ', I suppose :).

dometto commented 9 years ago

@SkyCrawl, I think the current version first extracts gollum-style code blocks, and only then applies the render filter (and then traverses the filter chain in opposite order, so the extracted code blocks are put back in after the rest of the markup is rendered).

SkyCrawl commented 9 years ago

Ah, right! Totally forgot about the reverse order... Sorry. In the current version, rendering is actually done first whereas in the versions I'm making, rendering is done last (I merged the extract and process methods because:

it seemed to me that, if done right, "single process" filter chain can do the job as well as the old one, perhaps even a bit more logically (the order I mentioned earlier),
Gollum doesn't use any "contextual" or nested tags/macros (and doesn't support them anyway)... again, with the exception of TOC which is getting obsolete :).

But then again, as far as I can tell, the current version of gollum renders tags anywhere, anytime, unless escaped :). That's probably because rendering is done FIRST and github-markup is a bit of a black box :). If it were me, I'd get rid of the tags & macros first because it's more logical and reliable, simpler and faster.

Would executing tags before rendering not have unwanted results, such as parsing gollum-like tags in code blocks?

Maybe :). But as far as I can tell, Gollum doesn't handle these cases anyway and if it does, only a very small handful of cases like this are handled... It's actually not easy and seems to me that the original programmers knew it and passed the issue to the users - provided an escape mechanism :).

SkyCrawl commented 9 years ago

So, @SkyCrawl, going forward, would Macros need to be turned into filters?

@techwiz24 In case you're afraid macros will "disappear", fear not :). Even if I turn them into filters, you will still be able to define custom filters in the new versions :+1:

dometto commented 9 years ago

@SkyCrawl, I agree that a state-machine implementation would be much better, if we can implement it. So much looking forward to your changes. :) We should see how performance is, if it suffers too much, we can always make the ordering of the filters more rigid. The most important thing for me is to keep the internal link tags.

SkyCrawl commented 9 years ago

The most important thing for me is to keep the internal link tags.

Alright, let's keep them :+1: .

SUMMARY

I'd like to receive permission for the following, ideally also from @bartkamphorst :

"Single process" filter chain

The extract and process methods will be merged and this order of filters applied:

metadata filter should always be executed first because other filters may be using this information,
tag filters and macros should be executed next,
render filter should always be executed pre-last, with TOC following it (or the plaintext filter instead of both),
sanitize filter should always be executed last.

Separation of tag filters

Each tag will now have a dedicated filter, instead of a single one for all tags. This will increase flexibility and maintainability a lot and decrease performance a bit. Loss of performance shouldn't be an issue, however, because the speedup gained by the new filter chain (see above) will certainly be higher. And I have other performance optimizations in store (see the above comments for an example of one of them).

Unification of filters and macros

If categories 1,3 and 4 from the above "single process" filter chain section are transformed into inline code (not being filters anymore), we can safely unify filters and macros. This would not only eliminate another complexity from Gollum but also allow us (FURTHER DOWN THE ROAD) to implement a state machine for parsing source markup that:

would probably deliver big performance boost (especially for large pages full of Gollum's tags and macros),
would allow us to support nested tags/macros and variables if we so wanted. For this big step in Gollum's evolution, however, a GOOD (non-conflicting, future-proof, user-friendly, parsing-friendly, definition-friendly) syntax for tags is essential. If we're to support variables as well, it would be de-facto a very minimalistic extensible functional "programming language" of ours :). As such, it shouldn't conflict with pretty much anything (either supported now or in the future) and that's the real CHALLENGE...

EDIT: the last action goes hand in hand with the new syntax. It would probably be better to wait with this until all the other changes are done but if we consent on this, we should still devise the new syntax first :).

bartkamphorst commented 9 years ago

@SkyCrawl Thanks for inviting me to the conversation; my apologies for my late reply. To be honest, I have some reservations about the proposed plan. In brief, I'm not yet convinced of the necessity and priority of this particular overhaul. Yes, I think a state machine (were you thinking of using aasm?) could be an improvement, but in general, I think the backend implementation is relatively stable and maintainable (in contrast to some of the front-end stuff). Introducing a new syntax will be a breaking change for many, many wikis, and I'm just not seeing how the benefits from the proposed changes will weigh up against that. Besides, I think the most pressing issues to work on at the moment concern front-end stuff (kudos for your contributions so far!), getting the git adapters stable (which at the moment is really hampering functionality) and improving search. So I'm not outright shooting your plans down, but rather questioning the timing of the overhaul (especially since such changes will require a lot of user support afterwards).

dometto commented 9 years ago

In brief, I'm not yet convinced of the necessity and priority of this particular overhaul.

@bartkamphorst, you raise some good points, but I think there is one place where an overhaul can be argued to be necessary, and that's in the tag filter, which is really terribly inefficient (see #839) (also also suffers from some problems of consistency that @SkyCrawl mentioned). To get that right we might need to implement the state machine. But of course the point about timing still stands -- do we need to do this now?

nlowe commented 9 years ago

What if we slate the tag / macro / filter engine overhaul for 4.0 because of the high possibility of breakage with individual wiki's? I know it's extra work, but could we consider writing a 3to4 script a-la python's 2to3 to help people prep for the change? I think this would help people transition to the new syntax whenever it gets released.

SkyCrawl commented 9 years ago

@bartkamphorst, you raise some good points

Indeed, thank you @bartkamphorst . As an answer, I'd like to try and summarize the current situation:

Gollum has been around for many years and many "generations" of developers have taken part in it. Looking at the code, almost none of the developers knew Gollum in its entirety - sometimes, I get the feeling they didn't know it much at all. But what's worse, I think none of them really cared to keep the code unified, clean and error-safe. Don't get me wrong... I don't mean to blame anyone for anything and I know that keeping a project this large in that state is quite an undertaking for one or two people... But Gollum claims it's a wiki with a sweet API :). When I look at the code, I see several burnt spots on the cake (API) and a lot of sour afterflavor (in both the frontend and backend). I haven't looked at the underlying git layer yet but to be honest, I think I can already see several great improvements from the backend only.

Because of the reasons above, the code is alarmingly inefficient/fragmented and I think even the design is sometimes, to put it lightly (forgive me, please), not thought out well...

I think the backend implementation is relatively stable and maintainable

Over the years, it was certainly made more or less stable. At a great cost: maintainability. While maintainability may be subject to opinion (because of different programming habits, code structure, skills, etc.), I really think I'm right when I say Gollum is NOT very well maintainable anymore :). I think I know because I'm trying to change it and it takes a lot of time... way more than necessary I think. In my opinion, the first requirement for maintainability is a well designed, layered and documented structure - especially for projects like Gollum, spanning 3 different gems. Then, the second is for the code to be easily understandable/modifiable (on one place, if possible - not throughout the whole application) to anyone, be it the original programmers or someone completely new (like myself). I think that the current Gollum partially fails at both, and especially at the second.

The very reason I started all these changes is because the frontend's CSS was definitely verging on being unmaintainable for pretty much anyone. As I delved deeper into Gollum, I started seeing the above pattern all over again: code duplication/fragmentation, ununified (and often poor) coding style, local bug patching, lacking comments and documentation, inefficiencies... As someone discovering Gollum for himself, sometimes I really had a lot of trouble figuring out what a piece of code was supposed to do (and how the whole thing works together for that matter). Even now, I still don't get a bit of the code, despite I know what it is supposed to do... The discovery process was actually even worse for me because I was a complete Ruby beginner :). My only salvation was programming experience and patience.

Besides, I think the most pressing issues to work on at the moment concern front-end stuff (kudos for your contributions so far!)

That may be true. However, why do you think I started changing gollum-lib so much? Because I couldn't do many of the frontend improvements otherwise :). Much of the page syntax and functionality is hidden away and hardcoded in the backend.

Introducing a new syntax will be a breaking change for many, many wikis, and I'm just not seeing how the benefits from the proposed changes will weigh up against that.

Yes, I know it would be a big step. The goal of this particular improvement is for Gollum to be even much more logically structured, unified and most of all... MAINTAINABLE. In the end, filters and macros are the same thing as far as the current functionality of Gollum is considered. And I think it would take A LOT to convince me otherwise.

In brief, I'm not yet convinced of the necessity and priority of this particular overhaul.

If this refers to the new syntax and unification of filters/macros, I mentioned in the summary that I would also avoid this for now... I haven't made that particular change yet (I wouldn't dare) and I don't plan to so far. But I still wanted to reach a consensus so that I'd know this would eventually be an item in future TODOs and so that I could (perhaps...) make some preparations :). Gollum really lacks a roadmap... The rest of the changes is by far not so drastic and they still affect the underlying git layer very little...

Don't get me wrong... I don't mean to turn Gollum upside down. I only mean to make it more logical, faster, cleaner, readable, usable, customizable, extensible. Simply put: less code, more of everything else, including comments. The way I see things, doing so is the basis for (almost) any comfortable further development. If I was a developer of the whole Gollum, I wouldn't like for changes to the frontend or git backend to expand over to gollum-lib so often. And definitely not, if it meant hunting down and correcting various occurrences of the affected code, which often has its own circumstances...

Note, however, that I'm not asking anyone to wait for me, until I make all of my changes... They're still going to take a lot of time and we can always merge at some point in the future :). Believe me, however, if I say that once I'm done, Gollum will be a lot more "featurefull" and faster than ever, and more fresh/maintainable than it was since a long time ago :).

SkyCrawl commented 9 years ago

@techwiz24

I know it's extra work, but could we consider writing a 3to4 script a-la python's 2to3 to help people prep for the change?

Nice one, I like it :). Then the only remaining thing to do would be to adjust any launch scripts (Gollum's wiki options are getting a little makeover too) and perhaps page titles. Unless of course these changes make it to master before the new syntax and unified filters/macros do, if we even decide to go that way...

bartkamphorst commented 9 years ago

@SkyCrawl I think we are in agreement about many things. You are right that there is room for significant improvement in the backend in terms of efficiency, abstraction and unification of coding style. So of course I would welcome any improvements and cleanup efforts.

My primary concern however is that a major overhaul (by which I mean many big changes all at once) of the code base will also introduce many new issues that will have to be dealt with (which is related to my earlier point about user support). And this is especially so if the changes are breaking backward compatibility (which as you said yourself is a serious concern).

So with the risk of sounding conservative, my position is this. I would welcome improvements to the code, given that

1) they pass the current test suite (minor changes are sometimes unavoidable, but in general they should be kept intact); 2) new code should be covered by tests; 3) changes come in manageable chunks so that we can reasonably review and test them and familiarize ourselves with the code.

Gollum really lacks a roadmap...

Agreed. We have made some efforts to work with milestones in github, but we haven't fully integrated that into our workflow yet, and I'm in favor of being more explicit about our long-term objectives.

I hope this doesn't discourage you; as I said, I really appreciate the contributions you are making to this project.

bartkamphorst commented 9 years ago

@SkyCrawl RE the two-phase filter chain, the advantage of the current implementation is that it reduces the risk of filters getting in each other's way. To give an example, say I have a filter/macro called

<<Airplane()>>

that produces the following ascii-art:

                         |
                        _|_
                       /(_)\
               -------:==^==:-------
                    [[|  o  |]]
------------__________\_____/__________------------
                   _  /     \  _
                  T T/_______\T T
                  | |         | |
                  """         """

With a single filter chain, you run the risk that a subsequent filter would try to parse the middle piece

[[|  o  |]]

as a tag, which is clearly undesirable.

sunny commented 9 years ago

I agree that a new syntax might have some benefits, but I think that backwards compatibility should be kept for as long as possible.

Let's first make a shiny new motor, and then think about easing out of the old syntax by slowly accepting then new one and then make a one-command update system once the new syntax has proven it is working correctly with everybody's use case and after this has proved to worked everywhere only then stop accepting the old one.

As for everything else, the changes sound awesome. If we need to break the gollum-lib API we should make sure the changes are documented and that we keep a semantic versioning. Planning could certainly help with that.

:+1:

Le ven. 10 avr. 2015 21:38, Bart Kamphorst notifications@github.com a écrit :

@SkyCrawl https://github.com/SkyCrawl RE the two-phase filter chain, the advantage of the current implementation is that it reduces the risk of filters getting in each other's way. To give an example, say I have a filter/macro called

<<Airplane()>>

that produces the following ascii-art:
                     |
                    _|_
                   /(_)\
           -------:==^==:-------
                [[|  o  |]]
------------_____/__------------ / \ T T/_\T T | | | | """ """

With a single filter chain, you run the risk that a subsequent filter would try to parse the middle piece

[[| o |]]

as a tag, which is clearly undesirable.

— Reply to this email directly or view it on GitHub https://github.com/gollum/gollum/issues/993#issuecomment-91540293.

SkyCrawl commented 9 years ago

My primary concern however is that a major overhaul (by which I mean many big changes all at once) of the code base will also introduce many new issues that will have to be dealt with (which is related to my earlier point about user support). And this is especially so if the changes are breaking backward compatibility (which as you said yourself is a serious concern).

Yes, I understand that. I don't think there will be so many issues, however (for users, anyway). I'm careful about removing/refactoring the original code. That's one of the main reasons it takes so long :). I think I've even fixed at least one or two bugs by now...

In any case, I plan on to invest a lot of time in testing Gollum :). That was actually the main point I had in mind when I recently stated I would greatly appreciate your help, once I'm done with the most important changes.

Also, the changes I've made so far are not drastic (to users anyway)... I'm sure to create several issues but if we test Gollum well, I'm sure we can fix the very most of them before we release the new version :). My code tends to have few bugs because I'm trying to reuse it and keep things simple and logical. And if there're many issues after the release, I'll of course do my utmost to fix the mess I may make...

1) they pass the current test suite (minor changes are sometimes unavoidable, but in general they should be kept intact); 2) new code should be covered by tests; 3) changes come in manageable chunks so that we can reasonably review and test them and familiarize ourselves with the code.

Agreed. I'm not exactly sure because I've stayed away from tests until now but I think some tests will be obsolete once I'm done... And if not, much of the test code should be obsolete (they should be simpler). Like I said earlier, I think some of Gollum's design is not very good...

Agreed.

This may be a problem... once I'm done, the code will change quite a bit. In general, it should still be similar to the released version but if you'll wish to truly familiarize yourselves with it, you will likely have a lot to go through :). I have high hopes, however, that my detailed changelog and explanations (that I'm trying to write and update as I go) will be enough to let you go through the changed code quickly. A lot of the code will also be quite a bit simpler to read, I hope.

Agreed. We have made some efforts to work with milestones in github, but we haven't fully integrated that into our workflow yet, and I'm in favor of being more explicit about our long-term objectives.

Seconded. I think a section at the end of README is a great place for it. And once I'm done with my changes, I'll already have a nice (subjective) roadmap ready :).

I hope this doesn't discourage you

No way :). There're not many ways to develop quality applications... and that's what I like to do :). I'm haunted by something else entirely - I've almost run out of time to actively develop Gollum for now. School engagements are gaining up on me again :(...

RE the two-phase filter chain, the advantage of the current implementation is that it reduces the risk of filters getting in each other's way. To give an example...

In the original implementation of 2-phase filter chain, the risk has been more or less eliminated because individual filters often used temporal substitution for unique identifiers. I see the point you tried to make with your example. However, isn't the solution actually simple? The only thing you need to do is escape the Gollum tag :). Gollum will notice it is escaped, parse out the escape sequence and "leave the tag alone". It reminds me of how things work with rendering source markup (or the final HTML) in Gollum - sometimes, markdown won't let me use literal backticks so I have to find a way to escape them. Sometimes, a < character prevents the rest of the HTML page from being displayed. The solution is the same - escape the markup... Surely, there is no "right" answer here. Sometimes, even an escape mechanism may not be enough... but I still think that a two-phase filter chain makes things unnecessarily complex because cases where even escape mechanism is not enough are, in my opinion, EXTREMELY rare (and especially if Gollum has a unified, unique syntax). As for me, I consider the speed and simplicity of single-process filter chain much nicer an asset :).

As a counter-example, please consider this: can your example actually be sometimes desirable (e.g. a macro that expands into content with some of Gollum's tags)? My personal answer would be "yes". In case it would be yours as well, please consider this: would you like for a macro to expand into content with another macro? The two questions are probably remarkably similar because, like I've already stated several times, I think that macros and filters are (in the end) the same thing. Anyway, now please consider this: how could we ever support "recursive" filters/macros with two-phase filter chain? Not sure about you guys but personally, I don't see a way that would work for any combination of tags/macros :). The only "variable" that the two-phase filter chain can operate with is the order of filters/macros. And again, I'm not sure about you, but for me, two-phase filter chain makes the order a bit complex to imagine, not to mention how much slower it is. In my opinion, single process filter chain is a much better option. But it doesn't truly support "recursive" filters/macros either...

And finally, let's assume we have single-process filter chain. Please consider this: what happens when we turn it into the state machine I mentioned earlier? How much faster is it? Does it finally support "recursive" tags/macros?

Summary:

Two-phase filter chain:
- complexity (a bit simplified): 2{#filters}{avg. source length}*{#occurrences}
- doesn't fully support "recursive" tags/macros
Single-process filter chain:
- complexity (a bit simplified): {#filters}{avg. source length}{#occurrences}
- doesn't fully support "recursive" tags/macros
State machine:
- complexity (a bit simplified): {avg. source length} + {#occurrences_combined}
- DOES support "recursive" tags/macros fully

Of course, for the state machine, we would probably need a nice unified syntax but... how cool is that? :)

I agree that a new syntax might have some benefits, but I think that backwards compatibility should be kept for as long as possible. Let's first make a shiny new motor...

Definitely agreed. But I'd still like to receive permission for the single-process filter chain as a nice improvement that (in my opinion) has too many upsides and far too few downsides :).

dometto commented 9 years ago

To give an example, say I have a filter/macro called <<Airplane()>>

We need that filter. :smile:

And if not, much of the test code should be obsolete (they should be simpler). Like I said earlier, I think some of Gollum's design is not very good...

Tests could be simpeler but I'm fairly happy about the functional coverage we get from them now (they manage to catch a lot of potential bugs). So I would say (probably superfluously) keep the test suite as it is where possible, with simplification as a future project.

This may be a problem... once I'm done, the code will change quite a bit. In general, it should still be similar to the released version but if you'll wish to truly familiarize yourselves with it, you will likely have a lot to go through :). I have high hopes, however, that my detailed changelog and explanations (that I'm trying to write and update as I go) will be enough to let you go through the changed code quickly. A lot of the code will also be quite a bit simpler to read, I hope.

All for documentation and easier to read code, @SkyCrawl. :+1: Still, I'd like to encourage you to try and build seperate PRs where possible -- also because you mentioned you might run out of time to work on gollum soon. That's a problem I recognize of course, and in general I think it's best to have manageable chunks, rather than a big changeset that keeps growing, and which we might never get around finishing.

But I'd still like to receive permission for the single-process filter chain as a nice improvement that (in my opinion) has too many upsides and far too few downsides :)

@bartkamphorst, what do you think about that? I see the downside you pointed out, but also the upside (filters can generate content for other filters) that @SkyCrawl mentioned. Do you think the possibility to escape gollum tags is sufficient?

SkyCrawl commented 9 years ago

So I would say (probably superfluously) keep the test suite as it is where possible, with simplification as a future project.

That's what I intend to do, of course :). I won't make big changes to test code without your professional assistance... Perhaps there was a misunderstanding, though - I'm not sure about anything yet but one of my main goals is to improve Gollum's design... That means (I think) that some tests (or some of their code) will be obsolete because Gollum's design will SAFELY handle these cases by itself :). Sometimes that may even be very desirable because it can then display thoughtful error messages or warnings to users.

Still, I'd like to encourage you to try and build seperate PRs where possible ... and in general I think it's best to have manageable chunks, rather than a big changeset that keeps growing

Good idea... but I wish it were that easy :(. I have a feeling what you're suggesting would actually add work for all of us, and most of all, for me :(...

I could of course make several PRs with some of the smallest changes I've already done but otherwise I would have to be limited to small PRs that would each leave Gollum in a non-functional state... I think that is not desirable?

What's worse, however - even I don't see that far into the future and sometimes happen to be proven wrong of implementing a particular feature or design... Unless I absolutely have to, I'd prefer to make PRs only when I'm sure of their quality and in, more or less, a working state. That brings us to the beginning again... big changes, big chunks :(. Almost all of it is linked together somehow and frankly, I believe that you'll be able to truly appreciate it once you see it all come together, more or less ready...

If I can, I'd like to ask for patience and in return, I promise that I'll make the gollum-lib PR as soon as I can (even long before finishing my changes to the frontend) and that I'll make it as painless for you as I'll be able to. When I do, we could arrange a time to look at it collectively, review the changes, point out any missteps or further improvements, as well as my other future plans with Gollum and why I will have done all this in the first place. I really want you to have the big picture in mind, when I make the PR and I want to provide real examples for my arguments :). After we review all of it and, perhaps, agree on some additional tweaks, I'll be happy to try and break up the big PR into smaller chunks that you could process in copious details individually. Does that sound agreeable?

Like @sunny mentioned: let's build a shiny new motor first? It is probably at most a month or two away from being finished, if I had the time to continue at the current pace :). But it's not like this is a commercial project and if I happen to finish it at the end of the year, is it a big deal? I think we can always merge the "branches" in future and your conservative, "small steps" development of Gollum actually works for me too :).

also because you mentioned you might run out of time to work on gollum soon

which we might never get around finishing

Don't worry, I'll finish what I started :). I want to see Gollum shine way more than it does now :+1: . Even if it takes years.

dometto commented 8 years ago

Closing this for now to prevent the issue tracker from getting cluttered. I just noticed multiple duplicates had arisen, which complicates things for everyone trying to squash bugs. I suggest in the future we use PRs or issues in gollum-lib to discuss these matters.

gollum / gollum

New, unified and future-proof tag syntax for Gollum #993

The problems

The solutions

SUMMARY

"Single process" filter chain

Separation of tag filters

Unification of filters and macros