CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...
http://cdelord.fr/pp
GNU General Public License v3.0

!env() broken in PP 1.3 for Windows #11

Closed by tajmone 7 years ago

tajmone commented 7 years ago

After the update to v1.3 !env() stopped working in many cases. Often, when it is invoked (eg: !env(Path)) it reports:

pp: Arity error: cmd expects 1 argument
CallStack (from HasCallStack):
  error, called at src\ErrorMessages.hs:40:27 in main:ErrorMessages

(I've tested it with scripts that previously worked with v1.2)

It's not the letter-casing bug (#5) creeping back in: it always produces the error, regardless of casing.

I've also tested it with custom-defined environment variables. It doesn't always raise this error; it depends on the contents of the variable. Could it be a problem with handling strings with spaces or other special characters (like those found in paths)?

Or maybe it's because the %PATH% is too long? (but it used to work before)

tajmone commented 7 years ago

Further testing revealed a strange thing: I only get this error when I set my custom variable to this value:

SET PP_MACROS_PATH=C:\MY_PATH\pp-macros\

... slight variations to this path will not create the error. I don't understand what's wrong with this string. Somehow it tilts PP when it tries to emit it.

And I also got the error with !env(Path) because the same path is part of it.

Any idea why?

tajmone commented 7 years ago

Specifically, it looks like it's the \pp-macros\ part, because if I rename it to \ppmacros\ it doesn't raise the error.

Could it be that PP sees \pp- as the pp command itself?

tajmone commented 7 years ago

UNDERSTOOD THE PROBLEM...

Whenever there is a \pp (not followed by a word character) in the variable (usually because of path), PP thinks it's facing a \pp() macro!
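
To illustrate the collision, here is a minimal Python sketch. It assumes the behaviour described above (a backslash, then "pp", then a non-word character is read as a macro call). The pattern and the function name are hypothetical and only approximate the symptom; this is not pp's actual parser.

```python
import re

# Hypothetical approximation of the symptom: "\pp" not followed by a word
# character is treated as the start of a \pp macro call.
PP_MACRO = re.compile(r"\\pp(?!\w)")


def looks_like_pp_macro(text: str) -> bool:
    """Return True if text contains something a naive scanner reads as \\pp."""
    return PP_MACRO.search(text) is not None


# Raw strings cannot end with a backslash, so the trailing one is appended.
print(looks_like_pp_macro(r"C:\MY_PATH\pp-macros" + "\\"))  # True
print(looks_like_pp_macro(r"C:\MY_PATH\ppmacros" + "\\"))   # False
```

The second path is safe only because "m" is a word character, so "\ppmacros" would be read as a different (undefined) macro name.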

I didn't remember the existence of this macro, so I didn't think of it until I started to work out the problem by exclusion.

I can't think of a solution around this because I'm trying to use the env var as an include path for a pp macro:

!inc(\env[PP_MACROS_PATH]Highlight.pp)

... so using !raw() is not a solution.

This seems likely to create problems with Windows paths, because the backslash path separator is also the pp macro symbol \ ... It is not unlikely that a path segment will have the same name as a custom macro.

Maybe the !inc() and !env() macros should have equivalent raw built in macros, capable of handling paths without interpreting the backslash as a macro, and still work nested into each other.

I'll try to experiment with intermediate macros and !raw() to see if I find a creative solution in the meantime.

CDSoft commented 7 years ago

More generally, I don't know which solution is better:

  1. everything (parameters, results) is preprocessed and we can use \raw to avoid preprocessing some parts (which is sometimes tricky)
  2. parameters and results are preprocessed only when \pp is explicitly used

I think a better solution would be to preprocess parameters only. This way, \inc(\env(...)) would call inc which will evaluate \env. The result of env never gets preprocessed unless we write \inc(\pp(\env(...))).
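
To make the difference between the two strategies concrete, here is a toy expander in Python. It is only a sketch under stated assumptions: `expand`, `ENV` and `ArityError` are hypothetical names, only \env and \pp are modelled, and none of this is pp's real Haskell implementation.

```python
import re

# Hypothetical environment, using the value from the report above.
ENV = {"PP_MACROS_PATH": "C:\\MY_PATH\\pp-macros\\"}

MACRO = re.compile(r"\\(\w+)")


class ArityError(Exception):
    """Raised when a known macro is not followed by a parenthesised argument."""


def expand(text: str, reprocess_results: bool) -> str:
    """Expand \\env(NAME) and \\pp(TEXT); parameters are always preprocessed."""
    out = []
    i = 0
    while i < len(text):
        m = MACRO.match(text, i)
        if m is None or m.group(1) not in ("env", "pp"):
            out.append(text[i])  # ordinary character or unknown macro name
            i += 1
            continue
        name, j = m.group(1), m.end()
        if j >= len(text) or text[j] != "(":
            raise ArityError(f"{name} expects 1 argument")
        # Find the parenthesis that closes the argument.
        depth, k = 0, j
        while k < len(text):
            if text[k] == "(":
                depth += 1
            elif text[k] == ")":
                depth -= 1
                if depth == 0:
                    break
            k += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses")
        arg = expand(text[j + 1:k], reprocess_results)
        if name == "env":
            result = ENV[arg]
            if reprocess_results:  # old behaviour: the result is scanned again
                result = expand(result, reprocess_results)
        else:  # \pp explicitly asks for its argument to be preprocessed again
            result = expand(arg, reprocess_results)
        out.append(result)
        i = k + 1
    return "".join(out)


source = r"\env(PP_MACROS_PATH)Highlight.pp"
print(expand(source, reprocess_results=False))
try:
    expand(source, reprocess_results=True)
except ArityError as err:
    print("Arity error:", err)
```

With `reprocess_results=False` the path comes out untouched; with `True` the "\pp-" inside the variable's value is scanned again and triggers the arity error, which mirrors the failure reported in this issue.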

This may touch most of the macros and I need more time to test it. I will make a new branch for this that will be merged when it works.

Maybe inc should be an exception because, most of the time, its output is meant to be preprocessed (we should have inc and rawinc like in version 1.3).

tajmone commented 7 years ago

It's a very tricky question indeed because of the endless combinatorial possibilities. It's in the nature of macros to be simple yet provide power by the way they are nested and chained. Indeed, any change in behavior might lead to unexpected consequences.

Currently I've solved the problem by adopting a different approach: I used the env var in the batch file calling pp, and kept all macros to be included in the same folder as the pp macro invoked from batch --- so there is no need to add !env() inside !inc() since PP will look by default in the folder of the current macro, as well as in the folder of the calling macro. Usually, finding solutions of this type (ie: restructuring the project folders and files) should be the best way to accommodate macros "limitations" (something which is typical of all macro systems).

Solution (2): currently the presence of a \pp (RegEx: \bpp\b, to be accurate) is interpreted as a \pp macro. What I find hard to understand is when pp considers a macro as explicit or implicit.

As for solution (1), I couldn't solve the problem with !raw because it would prevent !env() from emitting anything at all:

!inc(\raw{\env[PP_MACROS_PATH]}Highlight.pp)

Somehow the problem has to do with the levels of preprocessing (my wording might not be very precise, sorry): if !env() were to emit the variable without further processing it would be fine -- but then again, wouldn't !inc() still process it and expand any macros in the path?

These are edge cases, so the best solution would be to have alternative macros for such cases, instead of affecting the general behavior. Definitely having both !inc and !rawinc would be great.

But the problem at hand here really only applies to file paths under Windows --- in all other cases, the user would have control over macro contents, but working with files imposes the limitation of the folder structure, and often these paths might come from environment variables which are beyond the user's control (eg: the current path of the invoking batch, which might vary with each user, depending on where the project was saved).

Therefore it might be safer to focus on cases and macros dealing with file operations, and somehow have some means of controlling that backslashes are not mistaken for macros.

(A) Maybe there could be alternative versions of !env() and !inc() that do not consider \ as a macro prefix when expanding their results, but only ! (! being far less common than \ in Windows paths).

(B) Or there could be a macro that enables/disables the \ symbol, so that backslashes will not count as macro prefixes. Eg: !backslash-off() / !backslash-on()

(C) Or maybe a builtin macro to define the macro symbol(s), !macrosymbols(SYMBOL1)[(SYMBOL2)], could change the list of valid macro prefix symbols --- even using $ or whatever the user needs at the moment. And a !macrosymbolsrestore() could restore the defaults. This should apply to the current context only, so if the macro symbol was changed within a macro, when the macro exits those changes should no longer be valid (like local variables within a function being destroyed when the function returns).

These are only ideas and guesses, because I don't actually understand how these macros work code-wise (the Haskell monadic model is beyond my reach), but they reflect the way I thought about the problems I faced when working.

Hopefully, if not today then tomorrow, I shall be publishing the first working alpha of my public pp macros library collection, which would then provide some real use cases and a testing ground. I've already created the empty repository, but I need to clean up some code, make sure all third party licenses are in order and then I can push some macros which are ready for use.

CDSoft commented 7 years ago

In fact, only env and inc preprocess their outputs. I will modify env so that it does not preprocess its output and leave inc and rawinc unchanged.

Thus, both examples will work:

By the way, Windows supports both '\' and '/' as path separators. So \inc(c:/pp-xyz) also works.

I would like to keep \ as a macro char because it provides a LaTeX like syntax.

Defining the macro chars dynamically would slow down the execution of pp. Because of the immutability of functional programming, it would require searching a list before parsing any character. Unless I find an efficient way to do it, I will leave this list statically defined.

tajmone commented 7 years ago

Thanks a lot, this looks like a good solution.

By the way, Windows supports both '\' and '/' as path separators. So \inc(c:/pp-xyz) also works.

True, but when retrieving env vars like %PATH% or %~dp0, CMD will output them with the standard \ --- which is the problem I banged against. There are CMD commands to find and replace chars in strings which could be used to substitute the separator, but like most things Windows they are not so friendly.
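
As an alternative to CMD string substitution, the separator could be swapped in a small wrapper before pp is launched. The following Python sketch is only one possible approach; `to_forward_slashes` and `sanitized_env` are hypothetical helper names, and it assumes the variable in question holds a single path.

```python
def to_forward_slashes(path: str) -> str:
    """Replace every backslash separator with a forward slash."""
    return path.replace("\\", "/")


def sanitized_env(env, names):
    """Return a copy of env in which the named variables use '/' separators."""
    clean = dict(env)
    for name in names:
        if name in clean:
            clean[name] = to_forward_slashes(clean[name])
    return clean


# Example with the value from this issue (hypothetical environment mapping).
example = {"PP_MACROS_PATH": "C:\\MY_PATH\\pp-macros\\"}
print(sanitized_env(example, ["PP_MACROS_PATH"])["PP_MACROS_PATH"])
```

The resulting mapping could then be handed to the process that runs pp, for instance through the `env` argument of Python's `subprocess.run`, so that \env never emits a backslash for those variables.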

CDSoft commented 7 years ago

I know that Windows is not programmer friendly ;-) I abandoned Windows in 1997, even if my previous employer and customers forced me to use it...