CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...
http://cdelord.fr/pp
GNU General Public License v3.0

Add Support for Weighed Tagged-Regions Include Macro #69

Open tajmone opened 5 years ago

tajmone commented 5 years ago

Ciao Christophe,

Although a similar proposal was already briefly discussed at #24 (and dismissed), I would like to propose the inclusion of partial files again, this time focusing on tagged regions only. As an example, I'm bringing in a tool I've created that leverages tagged regions; hopefully it provides a good use case to motivate adding this feature natively to PP.

I propose that PP add a variant of the !include macro that supports Asciidoctor-style tagged regions.

This is a really cool Asciidoctor feature that lets you delimit regions of code/text in any language/markup source file that supports inline comments (including AsciiDoc files).

Given a somefile.rb file:

# tag::timings[]   
if timings
  timings.record :read
  timings.start :parse
# end::timings[]

Then any AsciiDoc file could selectively include that region at processing time via:

include::somefile.rb[tag=timings]

This is much better than using line-number ranges, because line numbers change when the code is edited, while region tags are fixed and always reliable reference points.

If PP could allow a similar feature it would really boost productivity, as it could be used not only for source code but also with markdown documents.

Asciidoctor natively supports all sorts of comment delimiters, including XML <!-- -->, which can be used in markdown docs.

It also supports some basic wildcarding, which simplifies including multiple regions.
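To make the mechanism concrete, here is a minimal Python sketch of tagged-region extraction. It is not Asciidoctor's implementation and not an existing PP feature: the function name `extract_tagged_regions` is hypothetical, and the sketch assumes each directive sits on its own line inside whatever comment syntax the host file uses. Wildcards are not covered.

```python
import re

# Matches a tag directive such as "# tag::timings[]" or "<!-- end::timings[] -->".
# The comment delimiters around it are ignored, so any host language works.
_DIRECTIVE = re.compile(r"\b(tag|end)::([^\s\[\]]+)\[\]")


def extract_tagged_regions(text, tags):
    """Return the lines of `text` that lie inside any of the requested tags.

    Hypothetical helper: directive lines themselves are never emitted.
    """
    wanted = set(tags)
    active = []  # names of the regions that are currently open
    selected = []
    for line in text.splitlines():
        match = _DIRECTIVE.search(line)
        if match:
            kind, name = match.groups()
            if kind == "tag":
                active.append(name)
            elif name in active:
                active.remove(name)
            continue
        if wanted.intersection(active):
            selected.append(line)
    return "\n".join(selected)


ruby_source = """puts 'before'
# tag::timings[]
if timings
  timings.record :read
# end::timings[]
puts 'after'
"""

print(extract_tagged_regions(ruby_source, ["timings"]))
```

Because only the `tag::name[]` / `end::name[]` text is matched, the same function also works on a markdown file whose directives are wrapped in `<!-- -->` comments.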

Use Case Example: Doxter

I've created a small tool that uses tagged regions to generate documentation from AsciiDoc comments in source code.

Although simple in nature, it's powerful because it introduces a notation for adding weights and subweights to regions and region fragments, which allows contents to be reordered at extraction time: fragments of the same region are sorted by subweight before being merged into a single AsciiDoc tagged region, and regions are sorted by weight. This is a desirable feature when generating documentation from code, because it lets you keep the text next to the code it belongs to while still producing a document in which the text is presented in a meaningful order.

In Doxter notation, this:

-->MyRegion(20.6)
--| = A Title
--|
--| Some AsciiDoc formatted text...
--<

becomes this in the final AsciiDoc extracted document:

// tag::MyRegion[]
= A Title

Some AsciiDoc formatted text...
// end::MyRegion[]

... which can then be included by other AsciiDoc docs. The whole idea is that each source file in a project should produce a standalone document that is readable on its own but is also chunked into regions "behind the scenes", so that parts of it can be selectively included by a main document/book that stitches them together (along with code imported from example files) and adds more detailed textual explanation. This makes it possible to create a whole book by mixing manually maintained text with text/code snippets autogenerated/extracted from the source files (as well as from code examples). None of this would be possible without tagged regions: line ranges would require constantly updating the line numbers in every include:: directive, which is impractical in a big project.

PP Implementation Idea

If PP were to support tagged regions, it would basically empower pandoc with the same functionality found in Asciidoctor, thus making PP + pandoc two powerful tools for documentation generation.

In Issue #24 you argued that:

There already exist many standard tools to deal with file, there is no need to reimplement them in pp.

It's true, but adding this feature natively to PP would make it independent of third-party tools and of the platform they run on.

If Doxter's usage of tagged regions for sorting contents might inspire a novel approach to using (or creating) a PP tag-region delimiting macro, so much the better!

I think that this feature would require PP to offer at least two macros: one to delimit a tagged region, and one to include tagged regions from a file.

Example of the former:

!tag(RegionX)
~~~~~~~~~~~
Some markdown, or code (whatever)...
... etc ...
~~~~~~~~~~~

... producing:

<!-- tag::RegionX -->
Some markdown, or code (whatever)...
... etc ...
<!-- end::RegionX -->

And the latter:

!includetag(somefile.md)(RegionX,AnotherRegion,Etc)

... allowing multiple tags.

Weight/Subweight Sorting?

Maybe optional weight and subweight parameters could also be supported (with Doxter usage in mind), allowing pre-sorting of regions at include time (this would only work within a single !includetag macro), using the form !tag(RegionX)([weight])([subweight]):

!tag(RegionTwo)(2)(2)
~~~~~~~~~~~
Paragraph 2.1.
~~~~~~~~~~~

...

!tag(RegionOne)(1)
~~~~~~~~~~~~~~~~~~~~~~
# Sect 1

Paragraph 1.1.
~~~~~~~~~~~~~~~~~~~~~~

...

!tag(RegionTwo)(2)(1)
~~~~~~~~~~~
# Sect Title 2

~~~~~~~~~~~

... where a !includetag(somefile.md)(RegionOne,RegionTwo) macro with sorting functionality could produce:

<!-- tag::RegionOne -->
# Sect 1

Paragraph 1.1.
<!-- end::RegionOne -->

<!-- tag::RegionTwo -->
# Sect Title 2

Paragraph 2.1.
<!-- end::RegionTwo -->

but maybe this is beyond the scope of tagged regions (it isn't part of the original Asciidoctor implementation but, as Doxter shows, it offers great potential for generating documentation from code).
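The sorting behaviour described above can be sketched in Python. This is only an illustration of the proposal, not PP code: `collect_fragments` and `include_tags` are hypothetical names. The sketch assumes that a missing weight or subweight defaults to 0, that all fragments of a region share the same weight, and that any line of three or more tildes closes a block.

```python
import re

# Hypothetical syntax from the proposal: !tag(Name)([weight])([subweight])
_TAG = re.compile(r"^!tag\((\w+)\)(?:\((\d+)\))?(?:\((\d+)\))?\s*$")
_FENCE = re.compile(r"^~{3,}\s*$")


def collect_fragments(text):
    """Return (name, weight, subweight, body) for every !tag block in `text`."""
    fragments = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        match = _TAG.match(lines[i])
        if match and i + 1 < len(lines) and _FENCE.match(lines[i + 1]):
            name = match.group(1)
            weight = int(match.group(2) or 0)     # assumption: default weight 0
            subweight = int(match.group(3) or 0)  # assumption: default subweight 0
            i += 2  # skip the macro line and the opening fence
            body = []
            while i < len(lines) and not _FENCE.match(lines[i]):
                body.append(lines[i])
                i += 1
            fragments.append((name, weight, subweight, "\n".join(body)))
        i += 1  # skip the closing fence, or any unrelated line
    return fragments


def include_tags(text, names):
    """Merge the requested regions, ordered by weight, then by subweight."""
    chosen = [f for f in collect_fragments(text) if f[0] in names]
    # Sorting on the name between weight and subweight keeps the fragments
    # of one region adjacent even if two regions share a weight.
    chosen.sort(key=lambda f: (f[1], f[0], f[2]))
    out = []
    current = None
    for name, _weight, _subweight, body in chosen:
        if name != current:
            if current is not None:
                out.append(f"<!-- end::{current} -->")
                out.append("")
            out.append(f"<!-- tag::{name} -->")
            current = name
        out.append(body)
    if current is not None:
        out.append(f"<!-- end::{current} -->")
    return "\n".join(out)
```

Calling `include_tags(source, ["RegionOne", "RegionTwo"])` on the three `!tag` blocks shown earlier would place RegionOne first and merge the two RegionTwo fragments in subweight order.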

CDSoft commented 9 months ago

Sorry for the late reply... Please keep in mind that pp is not supported anymore; it's hard to deploy. For new projects I suggest ypp, which is based on a Lua interpreter and is much easier to compile and install; its binaries are also easier to produce (thanks to zig) and deploy (see hey).

The include macro of ypp can extract regions of a file (from line numbers or using Lua patterns).