phpDocumentor / guides

Guides library to parse documentation
MIT License

Improve Buffer class #239

Open jaapio opened 1 year ago

jaapio commented 1 year ago

Right now we have a number of locations that use the Buffer class to collect lines from the documents and parse them as blocks. These blocks are always indented. When consuming the buffer, we need to remove the initial indentation of the block so that the lines can be processed as normal lines.

We want the buffer to be responsible for this un-indenting, so that it is done in one central place. The complexity here is that the indentation of a block may differ per situation, and can be anything between 2 and PHP_INT_MAX spaces.
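To make the idea concrete, here is a minimal sketch of a buffer that un-indents on read. It is written in Python purely for illustration; the class, its `push` and `get_lines` methods, and the `unindent` flag are hypothetical names and do not reflect the actual phpDocumentor `Buffer` API. It also assumes the indentation to strip is the smallest indentation among non-blank lines, which is only one possible policy.

```python
from typing import List, Optional


class Buffer:
    """Hypothetical line buffer; NOT the real phpDocumentor Buffer class."""

    def __init__(self, lines: Optional[List[str]] = None) -> None:
        self._lines: List[str] = list(lines or [])

    def push(self, line: str) -> None:
        self._lines.append(line)

    def get_lines(self, unindent: bool = True) -> List[str]:
        """Return the collected lines, un-indented unless raw content is requested."""
        if not unindent:
            return list(self._lines)
        # Assumption: strip the smallest indentation found on non-blank lines.
        widths = [
            len(line) - len(line.lstrip(" "))
            for line in self._lines
            if line.strip()
        ]
        indent = min(widths, default=0)
        # Blank lines are normalized to empty strings.
        return [line[indent:] if line.strip() else "" for line in self._lines]


buf = Buffer(["    foo", "      bar", "", "    baz"])
print(buf.get_lines())                # un-indented view
print(buf.get_lines(unindent=False))  # raw view
```

With this shape, callers no longer need to know how deep the block was indented; relative indentation inside the block (the extra two spaces before `bar`) is preserved.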

Locations where this improvement applies:

This method removes the indentation, based on the first line of a block: \phpDocumentor\Guides\RestructuredText\Parser\Productions\BlockQuoteRule::normalizeLines

Here I do something similar for lists:
packages/guides-restructured-text/src/RestructuredText/Parser/Productions/ListRule.php:107
packages/guides-restructured-text/src/RestructuredText/Parser/Productions/EnumeratedListRule.php:109
Definition lists: packages/guides-restructured-text/src/RestructuredText/Parser/Productions/DefinitionListRule.php:99

See https://github.com/phpDocumentor/guides/pull/225 for the original discussion.

greg0ire commented 1 year ago

So if I understand correctly, you would like getLines() to invert the current situation:

Would that work for you? Oh, and I think the indentation also needs to be detected, right? So that the caller does not have to provide it?

jaapio commented 1 year ago

Detecting the indentation might be hard to wrap in a method, because it depends on where you are in the parser.

But what you see a lot in RST is that indentation marks the beginning of a block. So when the parser detects a line starting with whitespace, it starts collecting a block, which ends when the next line is less indented.
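The collection rule described above can be sketched as follows. This is an illustrative Python version, not the parser's actual code; `indent_width` and `collect_block` are hypothetical helpers, and the sketch assumes that blank lines inside a block belong to it while trailing blank lines do not.

```python
from typing import List, Tuple


def indent_width(line: str) -> int:
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def collect_block(lines: List[str], start: int) -> Tuple[List[str], int]:
    """Collect an indented block beginning at `start`.

    Returns the raw (still indented) block lines and the index of the
    first line after the block.
    """
    block_indent = indent_width(lines[start])
    block: List[str] = []
    i = start
    while i < len(lines):
        line = lines[i]
        # A non-blank line with less indentation ends the block.
        if line.strip() and indent_width(line) < block_indent:
            break
        block.append(line)
        i += 1
    # Assumption: trailing blank lines are not part of the block.
    while block and not block[-1].strip():
        block.pop()
    return block, i


document = [
    "Paragraph",
    "",
    "    quoted one",
    "      nested",
    "",
    "    quoted two",
    "",
    "Back to normal",
]
block, next_index = collect_block(document, 2)
print(block)
print(document[next_index])
```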

The content of each block is passed to a separate parser pipeline. To make it easier to work with, we decided to remove the initial indentation of a block before passing it to the new pipeline. For example, a list item will always start with a dash. If we kept the indentation of the block, the list item detection would have to be aware of that indentation. Right now it doesn't have to know about it.
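A small Python sketch of why this helps: a list-item check anchored at the start of the line only works once the block indentation is gone. The names `strip_block_indent` and `find_list_items` are hypothetical, the sketch only recognizes the dash marker mentioned here, and it assumes the indentation is taken from the first line of the block.

```python
import re
from typing import List

# Anchored at column 0: the detector knows nothing about block indentation.
LIST_ITEM = re.compile(r"^- ")


def strip_block_indent(lines: List[str]) -> List[str]:
    """Remove the first line's indentation from every line of the block."""
    if not lines:
        return []
    indent = len(lines[0]) - len(lines[0].lstrip(" "))
    result = []
    for line in lines:
        if line[:indent].strip() == "":
            result.append(line[indent:])
        else:
            # Less-indented line: strip whatever leading spaces it has.
            result.append(line.lstrip(" "))
    return result


def find_list_items(lines: List[str]) -> List[int]:
    """Indexes of lines that start a list item."""
    return [i for i, line in enumerate(lines) if LIST_ITEM.match(line)]


raw_block = ["   - first", "     continued", "   - second"]
print(find_list_items(raw_block))                      # nothing detected
print(find_list_items(strip_block_indent(raw_block)))  # items detected
```

On the raw block the anchored pattern finds nothing; after un-indenting, both items are found without the detector having to account for the block's depth.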

So yes, removing the indentation would be the default. Only when we want to do something special would we need the raw content.

greg0ire commented 1 year ago

Okay… I have started working on something, please let me know if you think it is the right direction: https://github.com/phpDocumentor/guides/pull/442

I have managed to reduce code in BlockQuoteRule and DirectiveRule thanks to it. For the list rule it is not so easy, and I'm not 100% sure I can leverage what I did in the buffer class for that particular piece of code. I'm a bit confused as to whether you're expecting me to use it for deindenting the list globally, for deindenting each item, or both.