CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...
http://cdelord.fr/pp
GNU General Public License v3.0

Accessing Pandoc Header / YAML Variables #9

Closed tajmone closed 7 years ago

tajmone commented 7 years ago

Feature request: Add to PP awareness of Pandoc variables defined with the pandoc_title_block and/or yaml_metadata_block extensions.

Rationale: this would allow granular control of conditional PP macros through pandoc variables set on a per-file basis (either in the file's pandoc or YAML header blocks) or globally for a project by including YAML files on a per-folder basis (i.e. common variables are stored in a YAML file in the project's root and in each folder, allowing values to be overridden by concatenation).

This solution integrates better with pandoc than setting variables via command-line parameters, and also allows PP to share pandoc's template variables. PP could even conditionally change some variables before invoking pandoc.

It would be particularly useful in projects relying on automated scripting.

CDSoft commented 7 years ago

Hi, the problem is that it would require a new parser. The current parser is very basic (it only knows about macros). IMO using macros to fill in these blocks would keep pp simpler. You can have a "configuration" file that defines macros. But this "solution" also doesn't work in the case of pandoc title blocks (empty lines may be inserted before the title block), unless you include the file from within the block:

% \include(conf.i)
  \title
% \author

where conf.i defines title, author, ...

A quick workaround would be to be able to load and parse a file and reject its output (just keeping the side effects in the current environment).
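To make the idea concrete, here is a toy illustration in Python of "parse a file, drop its output, keep the side effects". It is not pp's parser or syntax: the two regular expressions, and the function names `process` and `load_silently`, are assumptions made up for this sketch, which handles only `\define(name)(value)` and bare `\name` calls.

```python
import re

# Toy macro syntax, assumed for this sketch only (not pp's real grammar).
DEFINE = re.compile(r"\\define\((\w+)\)\(([^)]*)\)")
CALL = re.compile(r"\\(\w+)")


def process(text, env):
    """Record definitions in env, then expand known macros; return the output."""
    def record(match):
        env[match.group(1)] = match.group(2)
        return ""  # a definition produces no output of its own

    # Simplification: all definitions are collected before any call is expanded.
    without_definitions = DEFINE.sub(record, text)
    # Unknown macros are left untouched.
    return CALL.sub(lambda m: env.get(m.group(1), m.group(0)), without_definitions)


def load_silently(text, env):
    """Process text only for its side effects on env; the output is discarded."""
    process(text, env)
```

With this, a configuration file can be loaded first without emitting stray blank lines, and the title block is then expanded from the shared environment.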

two ways:

tajmone commented 7 years ago

A new parser would be too much work and not worth the price.

Right now I am experimenting a lot with ways of reusing "configuration files" in a PP->pandoc toolchain, and starting to create macro collections that I share with all projects. I'll publish them when they are polished and ready to go, but I'd like to make them cross-platform and still need to add the required conditionals. It would be nice to see a user-contributed library of PP macros for different tasks.

Mainly, I'm trying to overcome some markdown limitations. For example, I use Asciidoctor for tables via a custom PP macro that invokes Asciidoctor on an external file and then passes the raw html on to pandoc (that is, in an html workflow). This gives me nice tables with column and row spanning, and fine control over aspect ratios. I've also used PP to invoke an external syntax highlighter (Highlight), which has language definitions not present in pandoc and offers line numbering and Lua plugin extensibility; plus I can use any other syntax highlighter as well (Pygments, etc.) and mix them freely in a single doc.

Usually the automation workflow that I've found to work best is to have for each a which is pandoc md + pp macros, plus a which will be passed to pandoc for template variables.

Possibly, finding a ready-made tool capable of parsing a YAML file and adding its vars to the environment context would be a good solution: it could be invoked before PP, on , and make all vars available to PP. I'll do some research on it; I'm confident there is a tool of this sort out there (hopefully cross-platform).

A YAML-based solution could dispense with pandoc header blocks (and keep things simpler).

I'm not sure I follow the workaround you proposed. Which kind of configuration format would that be? Do you mean PP macros defining environment vars?

tajmone commented 7 years ago

After some research I've found this command-line, cross-platform and single-binary tool to read environment vars from YAML files:

https://github.com/EngineerBetter/yml2env

Its usage is very basic:

yml2env <YAML file> <command>

... the <command> is mandatory (it won't just load vars in the environment and return to the shell), so <command> would have to be PP. Effectively, this allows a workflow of this type:

and a two-step automation of the form:

yml2env file.yaml pp file.ppmd >file.md
pandoc -f markdown+yaml_metadata_block -o file.html file.md file.yaml
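For readers who want to see what the first step amounts to, here is a minimal Python sketch of the general pattern: read variables from YAML, add them to the environment, then run a command. It is not yml2env's implementation. `parse_flat_yaml` and `run_with_yaml_env` are hypothetical names, the parser handles only flat top-level `key: value` lines, and key names are passed through unchanged (whether the real tool renames them is not covered here).

```python
import os
import subprocess


def parse_flat_yaml(text):
    """Collect top-level `key: value` pairs (toy parser, not full YAML)."""
    variables = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments and the --- / ... document markers.
        if not stripped or stripped.startswith("#") or stripped in ("---", "..."):
            continue
        # Skip indented lines: nested structures are out of scope here.
        if line[0] in " \t":
            continue
        key, separator, value = stripped.partition(":")
        value = value.strip().strip("'\"")
        if separator and value:
            variables[key.strip()] = value
    return variables


def run_with_yaml_env(yaml_text, command, **run_options):
    """Run command with the YAML variables added to a copy of the environment."""
    environment = dict(os.environ)
    environment.update(parse_flat_yaml(yaml_text))
    return subprocess.run(command, env=environment, **run_options)
```

A call such as `run_with_yaml_env(open("file.yaml").read(), ["pp", "file.ppmd"])` would then mirror the first command above, assuming `pp` is on the PATH.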

This tool reads only from YAML headers and ignores anything else, so it also works with pandoc markdown documents using the yaml_metadata_block extension, besides separate YAML files. Therefore, if the YAML vars are kept inside the document, there is no need for the intermediate step:

yml2env file.ppmd pp file.ppmd | pandoc -f markdown+yaml_metadata_block -o file.html

I think it's a good solution, and with a bit of imagination it could be used in more complex ways, for example to handle hierarchical configuration files that override one another according to the folder structure, and so on.
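One possible shape for such hierarchical overriding, sketched in Python under stated assumptions: every folder from the project root down to the document's folder may hold a flat YAML file, and values found deeper in the tree win. The file name `vars.yaml` and the helpers `read_flat_vars` and `merged_vars` are invented for this illustration.

```python
from pathlib import Path


def read_flat_vars(path):
    """Read top-level `key: value` pairs from a file (toy parser, flat YAML only)."""
    variables = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        # Skip blanks, indented lines, comments, list items and --- / ... markers.
        if not line.strip() or line[0] in " \t#-.":
            continue
        key, separator, value = line.partition(":")
        if separator and value.strip():
            variables[key.strip()] = value.strip()
    return variables


def merged_vars(root, leaf, name="vars.yaml"):
    """Merge the config files found from root down to leaf; deeper folders win."""
    root, leaf = Path(root).resolve(), Path(leaf).resolve()
    folders = [root]
    for part in leaf.relative_to(root).parts:
        folders.append(folders[-1] / part)
    merged = {}
    for folder in folders:
        candidate = folder / name
        if candidate.is_file():
            merged.update(read_flat_vars(candidate))
    return merged
```

The merged mapping could then be written to a temporary YAML file and handed to the same two-step toolchain.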

CDSoft commented 7 years ago

As long as there are already existing solutions, I prefer to keep pp simple. pp can parse a markdown file or separate parts of a file without knowing the structure of the final document. It can be any component in a pipe chain, not necessarily the first one, and may not be aware of any header of the final file. I propose to close this issue.