CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...
http://cdelord.fr/pp
GNU General Public License v3.0

Intricate Interactions With External Tools #31

Closed tajmone closed 6 years ago

tajmone commented 7 years ago

( EDITED: Fixed links to point to specific commit in master branch because dev branch was merged-in and deleted )

Today I struggled quite hard to create a macro that takes a block of code, passes it to an external source-highlighting tool, and then emits the final raw HTML back into the doc. Maybe something can be done to make such cases easier...

Here is the final macro (still in a dev branch):

and here is some test code to see how it actually works:

The syntax of the macro is:

!raw{!Highlight(LANG)(OPTIONS)
~~~~~
CODE
~~~~~
}

... taking the following parameters:

NOTE: This macro creates and deletes a temporary file (named "_pp-tempfileX.tmp", where X is a numeric counter) in the macros folder (/pp/macros/) for each macro call in the document, to temporarily store the code to highlight. The X counter is reset at each PP invocation.

This is how I managed to create the macro:

!define(Highlight)(
!add(HLCounter)
!quiet[!lit(!env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)()(\3)]
!quiet[!flushlit]
<pre class="hl"><code class="\1">!exec[highlight.exe -f -S \1 --no-trailing-nl --validate-input !ifdef(2)(\2) !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp]</code></pre>
!ifeq[!os][windows]
[!exec(DEL !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)]
[!exec(rm !env(PP_MACROS_PATH)_pp-tempfile!HLCounter.tmp)]
)

!define(HLCounter)(0)

The problem I encountered was that I couldn't find any direct way to pass the CODE parameter to the Highlight application via STDIN. So I resorted to a temporary file stored in the macros folder (its path is already in an env var because some macros need it to find CSS definitions). It's not the most elegant solution, but it works fine (the temp files are deleted by the macro itself).

As for the counter: there is no macro to reset a literate file, and even with !flushlit, each invocation of this macro appended the new CODE parameter to that of the previous invocation, so the source code piled up. This is why I resorted to a counter, so that each invocation of the macro uses a different filename for the !lit macro.

It would be useful to have some macro that resets/destroys a literate file from PP's memory, to avoid situations like this one. Ideally, I would have liked the macro to use the same file at each invocation, but to forget about it afterwards.

The file deletion works on Windows; I haven't had a chance to test it on Linux yet. I'm using a simple check: if the OS is Windows, use the CMD command DEL, otherwise use the rm command to delete the file. I assumed it should work for both Linux and Mac! Does it?

Was there a simpler way to accomplish this task, that I might have overlooked?

Are there any PP enhancements and new features that could make similar interactions with external apps easier, without having to rely on a temporary file?

It would be really great if there was a way to have pp call an external shell/cmd tool, invoke it with options, and pass it some text so that the other app sees it coming from STDIN. Some piping is required, and I have no idea how easy or difficult this would be in Haskell.

bpj commented 7 years ago

So what does Highlight have which Skylighting lacks? I'm asking seriously! I've written a pandoc filter which turns code into raw LaTeX invoking minted, and optionally mdframed with a lot of metadata configurability. It only waits for its documentation to be written... :-) I'm kind of wondering if what you are trying to do would be easier with a filter.

Anyway as it happens I have been thinking of a macro which writes its last argument to a file and executes it as a script with an external program, capturing its STDOUT. Something like

!script(PROGRAM)[(--ARGUMENTS)]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SCRIPT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

becoming in the shell

PROGRAM --ARGUMENTS SCRIPT >stdout.tmp

Now that you mention it, a macro which writes its last argument to a file and passes it as STDIN to an external program would be equally great to have.

!filter(PROGRAM)[(ARGUMENTS)]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CONTENT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

becoming in the shell

PROGRAM --ARGUMENTS <content.tmp >stdout.tmp
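The shell side of the proposed !filter behaviour could be sketched roughly as follows. This is only an illustration under assumptions: run_filter is a hypothetical helper name (not a pp macro), the content is passed as the first argument rather than taken from a macro body, mktemp is assumed to be available, and the program's output is left on stdout instead of being redirected to stdout.tmp.

```shell
#!/bin/sh
# Hypothetical helper: run_filter CONTENT PROGRAM [ARGUMENTS...]
# Writes CONTENT to a temporary file, feeds that file to PROGRAM on STDIN,
# then removes the file and returns PROGRAM's exit status.
run_filter() {
    content=$1
    shift
    # Private temporary file, playing the role of content.tmp in the proposal.
    content_tmp=$(mktemp) || return 1
    printf '%s\n' "$content" > "$content_tmp"
    # Equivalent of: PROGRAM --ARGUMENTS <content.tmp
    "$@" < "$content_tmp"
    status=$?
    # Clean up regardless of whether PROGRAM succeeded.
    rm -f "$content_tmp"
    return "$status"
}

# Example: tr stands in for the external program.
run_filter 'hello world' tr 'a-z' 'A-Z'
```

Because the temporary file gets a fresh name from mktemp on every call, no counter is needed to keep successive invocations apart.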

BTW since you say 'OS support all' you should replace those Highlight.exe by something OS-agnostic or a macro which inserts the right version for the OS.

tajmone commented 7 years ago

So what does Highlight have which Skylighting lacks? I'm asking seriously!

For this particular project, which is centered on the PureBASIC language, I have lots of problems using Skylighting. Although Skylighting has a PureBASIC language definition, there are issues with it:

And I'll also need line numbering when commenting long chunks of code.

Furthermore, I need some custom language definition to handle commands/functions syntax definitions (i.e. accepting special chars for options, like [ optional | optional ] and other mods to the basic lang definition).

So I'm looking to mix highlighting by Skylighting and Highlight in the same project.


BTW since you say 'OS support all' you should replace those Highlight.exe by something OS-agnostic or a macro which inserts the right version for the OS.

... mea culpa! ... I completely forgot!! changing it to Highlight should do the job.

FIXED, and added thanks to @bpj in the CHANGELOG :


The !script and !filter macros sound great!


As for the !lit related macros: would it make any sense to add a macro that resets/releases a FILENAME in order to start it over? or that wipes its content to empty?

I guess the intended main use of these macros is to build a single version of each code file per PP execution, and that content should usually only be added/appended to the literate FILE or macro. Still, having a macro to reset the contents could always come in handy.

bpj commented 7 years ago

BTW since you say 'OS support all' you should replace those Highlight.exe by something OS-agnostic or a macro which inserts the right version for the OS.

... mea culpa! ... I completely forgot!! changing it to Highlight should do the job.

Unfortunately it is called highlight in all lowercase on Linux (I installed an old version from the Ubuntu repos and played around with it, but haven't found a theme I really like yet), so you will have to do something like !ifeq(!os)(windows)(Highlight.exe)(highlight), assuming it is not Highlight on Mac, in which case it becomes slightly more complicated.
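If the call ends up in a shell script rather than in a pp macro, one way to sidestep the naming question is to probe for whichever candidate name is actually on the PATH. A minimal sketch, where find_tool is a hypothetical helper name and the candidate names are only the ones mentioned in this thread:

```shell
#!/bin/sh
# Hypothetical helper: print the first of the given command names that is
# found on the PATH; return 1 if none of them is available.
find_tool() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Example: prefer the lowercase name, fall back to the capitalised ones.
hl=$(find_tool highlight Highlight Highlight.exe) || echo "no highlighter found" >&2
```

The result can then be used as the command name without any per-OS branching.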

Also Pandoc has built-in line numbering, just set a .numberLines as second class on the code block.

tajmone commented 7 years ago

You're right. But in the actual macro it's in lowercase (it should be the same on all OSs); it was just a typo in this thread, not in the actual macros.

Tonight I should find some time to boot another PC with Ubuntu, pull the project's updates, and try to run through all the macros tests and see how it goes. Unfortunately I can't test anything for macOS.

Also Pandoc has built-in line numbering, just set a .numberLines as second class on the code block.

Ah! I missed that ... Thanks for letting me know, really good to know.

Was this introduced with Skylighting? (I don't recall this option being available when pandoc relied on Kate highlighter library)

bpj commented 7 years ago

It has been around all the time, I think, but it doesn't figure very prominently in the Pandoc manual.

One feature which I miss in both Pandoc highlighters and apparently also in Highlight is an option to have an interval in the line numbers, which is one of my reasons for going through the trouble of using minted.

CDSoft commented 7 years ago

pp is not meant to be a complete shell interpreter, use a real one instead. You can use pipes or here-documents to feed external commands with some data. e.g. :

\sh
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cat <<EOF | superusefulcommand
some text
that goes to the stdin port of superusefulcommand
EOF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It uses sh (i.e. bash on most platforms) and sh is portable (standard on Linux/MacOS, busybox/MSYS/Cygwin/GoW/... on Windows).
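When the text being piped is source code, a quoted here-document delimiter is worth considering, since it stops the shell from expanding anything inside the text. A small sketch under assumptions: to_upper is a hypothetical stand-in for superusefulcommand, and the here-document is attached directly to the command instead of going through cat.

```shell
#!/bin/sh
# Hypothetical stand-in for the external command: reads STDIN, writes STDOUT.
to_upper() {
    tr 'a-z' 'A-Z'
}

# Quoting the delimiter ('EOF') disables variable expansion and command
# substitution inside the text, so "$total" reaches the command unchanged.
to_upper <<'EOF'
price = $total * 2
EOF
```

With an unquoted EOF the shell would replace $total with the value of that variable (usually empty) before the command ever sees the text.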

Literate programming macros generate files that are considered outputs of the script (e.g. that can later be compiled or interpreted). Their content is stored internally; they can be flushed to disk incrementally but never deleted. If you need to generate temporary files you can use sh:

\sh
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cat <<EOF > /tmp/file1.txt
some text
that goes to the stdin port of superusefulcommand
EOF
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

do something with /tmp/file1.txt

\sh(rm /tmp/file1.txt)

tajmone commented 7 years ago

Thanks @CDSoft ! This is a very helpful answer and a good example to build on.

Bash would be a good solution because my project assumes users have Git installed, which provides Bash even on Windows. I'd only need to make sure that my app runs smoothly when called from Git Bash.

tajmone commented 7 years ago

I've carried out some tests, and my binary application (the CMS engine) runs smoothly in Windows' Git Bash, no problems whatsoever.

So I guess that I could drop the cross-platform PP macros approach and instead rely on Bash scripts and commands for external interactions. This would make life much easier for me, both in writing/maintaining the macros and in coding the app itself: I could drop most of the conditional compiler statements that handle Windows and Linux/Mac differently (e.g. path separator characters, shell commands, etc.).

All I really need to do is to add some startup checks in the Windows version of the CMS app to make sure it was invoked from Bash instead of CMD (or PowerShell). This is easy enough (can do it by checking some env vars).

Also, if I'm dealing only with the Bash shell, I could take advantage of some nice features like colors in text output, etc., without having to juggle multiple standards for the various OS versions.

As for the literate programming macros, it makes sense that PP contemplates their usage for a single document file at a time, without any need to delete their internal representation.

On the other hand, I still think that having a built-in macro to invoke an external command/app and pass it both options/parameters and a STDIN stream could be handy in various situations. It could be just a built-in shorthand for the example you provided above, maybe working only in Bash; but @bpj's idea of the !script and !filter macros would be really cool if it could be implemented cross-platform.