CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...
http://cdelord.fr/pp
GNU General Public License v3.0

Latex \dot vs Built-in Macro \dot #40

robinrosenstock closed 6 years ago

robinrosenstock commented 6 years ago

In my markdown I use raw inline LaTeX commands, like this: \(\dot V\). I guess this relies on pandoc's markdown extension "tex_math_single_backslash". Now when I run pp I get an error, because pp treats \dot as a built-in macro. This is the error message:

pp: Arity error: dot expects 2 or 3 arguments
CallStack (from HasCallStack):
  error, called at src/ErrorMessages.hs:49:27 in pp-1.12-JZaGJL3q7GE2Y6zH7xiUqv:ErrorMessages

How can I work around this issue? Can the "\" notation be disabled so that only the "!" notation is used?

robinrosenstock commented 6 years ago

For the time being, I have removed the dot functionality; that is, I've removed "Dot" from the GraphvizDiagram variable in Formats.hs (line 54). Is there perhaps an alternative to hardcoding the macros?

CDSoft commented 6 years ago

These characters could be made configurable (with a macro and/or on the command line).

$ pp -macrochars "!"
!macrochars(!)

Another solution may be to use \raw around LaTeX commands.

tajmone commented 6 years ago

I like the idea of being able to customize the macro chars. In some situations it could be very helpful.

Would this setting accept single-char symbols only, or could it also allow double chars as symbols? E.g.:

$ pp -macrochars "!! \\"

... defining !! and \\ as macrochars (the space separates definitions).

As for the !macrochars() macro, would it take effect from the point where it is called onward? I.e., could one restore the macrochars to the defaults later on via another !macrochars(!\), or change them once more to something else?

Isn't there a risk of nested macros behaving erratically after the macrochars are changed partway through the document? How would nested macro definitions behave? Or would the macrochars override only apply to the current context?

The command line option should affect the macrochars globally in the document (i.e., before the document is even parsed); but the inline macro definition is a different story altogether.

CDSoft commented 6 years ago

This would be a single char: any char in the string would be a valid char to start a macro call, so !macrochars(!\) would enable two possible syntaxes, !macro and \macro. More than one char would be too heavy and less than one too ambiguous. Then it's up to the user not to misuse this macro.

The new chars will be used after calling macrochars until the end of all documents or the next call to macrochars. The idea is to call this macro as early as possible (on the command line, in a file imported on the command line (with -import), in a common included file, at the beginning of the file...).
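To make the proposed semantics concrete, here is a minimal Python sketch (not pp's Haskell code; `find_macro_calls` and its naive name pattern are hypothetical). It assumes any single char in the current set may start a call, and that `!macrochars(...)` replaces the set from that point onward until the next call.

```python
import re

# Assumption: a macro name is a plain identifier.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def find_macro_calls(text, macrochars="!\\"):
    """Return the names of candidate macro calls found in text.

    A call to macrochars(...) is not reported; it replaces the set of
    characters that may start a call for the rest of the text.
    """
    calls = []
    i = 0
    while i < len(text):
        if text[i] in macrochars:
            m = NAME.match(text, i + 1)
            if m:
                name = m.group()
                i = m.end()
                close = -1
                if name == "macrochars" and text.startswith("(", i):
                    close = text.find(")", i)
                if close != -1:
                    # Every char of the argument becomes a valid macro char.
                    macrochars = text[i + 1:close]
                    i = close + 1
                else:
                    calls.append(name)
                continue
        i += 1
    return calls

print(find_macro_calls(r"\(\dot V\)"))                  # ['dot']
print(find_macro_calls(r"!macrochars(!) \(\dot V\)"))   # []
```

With the default set, the LaTeX command from the original report is picked up as a call to dot; once the set is reduced to "!", the same text passes through untouched.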

robinrosenstock commented 6 years ago

@CDSoft your solution with !macrochars is good enough. But I can't use !raw, because I have much more LaTeX math, and probably other things as well, that don't work well with \ as a macro character. I like ! much better, and it would be better to only use the exclamation mark (my opinion).

tajmone commented 6 years ago

it would be better to only use the exclamation mark (my opinion).

Are you suggesting dropping support for the \?

I also tend to use more the !, but often I alternate ! and \ in complex macros as a visual reminder of the nesting level (and use different bracketing also), but this is just an aesthetic need — I could survive without it.

I agree that the \ syntax has greater potential for conflicts (even within verbatim and code blocks, where some overlooked, unlucky combination of chars could end up being mistaken by PP for a macro). In some rare edge cases, the \ might even clash with pandoc markdown, where the \ can be used for escaping (all_symbols_escapable extension):

Except inside a code block or inline code, any punctuation or space character preceded by a backslash will be treated literally, even if it would normally indicate formatting. [...]

This rule is easier to remember than standard Markdown's rule, which allows only the following characters to be backslash-escaped:

\`*_{}[]()>#+-.!

(so far, I have never run into a markdown-escape/pp-macro conflict)

But I do think that having more than one syntax choice is good; and since removing the \ altogether would break backward compatibility, it should only be done if the conflicts are common enough to justify removing it as a default syntax; otherwise, offering a way to override it should be preferable.

I admit that I use PP almost exclusively to work with pandoc markdown, and I haven't encountered many problems with the \ usage so far. But this might not be the case with other users (as this issue demonstrates).

Again, I really think that being able to define or override the macrochars via CLI options, or in-text via a macro, is a good idea, and I fully support it. Since users will be able to control the macrochars once this feature is introduced, now would be the right time to consider whether the current !\ chars are good, or whether they cause enough conflicts in some contexts that they should be reconsidered (as @geniusupgrader suggested). If a backward-incompatible change has to be introduced, the sooner the better (especially since PP's user base is starting to grow faster).

robinrosenstock commented 6 years ago

Are you suggesting dropping support for the \?

I think some conflicts between PP and pandoc's markdown do not justify the removal of \. Implementing !macrochars should be enough. And then for me personally, I will only use !.

CDSoft commented 6 years ago

The use of both \ and ! comes from different preprocessors I used and I kept it for backward compatibility but it may be time to simplify this.

We can have a default configuration with a single char and macros to change this behaviour.

Macro calls: !macro
Literate programming macros: @macro

But I would prefer keeping the three kinds of parentheses for parameters: (), [] and {}. They help group parameters in nested macro calls.

CDSoft commented 6 years ago

pp 2.0 implements these macros and uses "!" to call macros. !macrochars(!\) can be used in an imported file to restore the previous behaviour. Hope it works.

bpj commented 6 years ago

But I would prefer keeping the three kinds of parentheses for parameters: (), [] and {}. They help group parameters in nested macro calls.

Actually I have been bitten from time to time by the fact that \macro{} may conflict with embedded LaTeX, so I would like to be able to

I wouldn't mind having a set of macros

!add_macrochars(&?%...)
!rem_macrochars(\...)
!add_delimiters(<>...)
!rem_delimiters[{}()...]

which should each take any number of characters (or character pairs) in their argument, so that you can add or remove more than one character (or pair) in one go.

The characters in the arguments should preferably be allowed to be any characters with Unicode General Category P or S, expecting users to be smart enough to not shoot themselves in the foot with their character choices.
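The category check suggested above can be sketched in a few lines of Python using the standard `unicodedata` module (pp itself is written in Haskell, and the helper names here are hypothetical). General categories starting with P are punctuation and those starting with S are symbols.

```python
import unicodedata

def is_punct_or_symbol(ch):
    """True if ch has Unicode General Category P* (punctuation) or S* (symbol)."""
    return unicodedata.category(ch)[0] in ("P", "S")

def rejected_chars(arg):
    """Return the chars of a macro argument that would not be allowed."""
    return [ch for ch in arg if not is_punct_or_symbol(ch)]

print(rejected_chars("&?%"))   # []
print(rejected_chars("!a<1"))  # ['a', '1']
```

Letters, digits and whitespace are rejected, while chars such as !, \, <, > and « pass the check.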

tajmone commented 6 years ago

so I would like to be able to

part of the sentence was lost!

The idea of a macro to also define delimiters is good, and backward compatibility can always be reintroduced via such macros.

I think that by default PP should have at least two types of delimiters, so even if you drop the curly braces ("{ }") there will still be the square brackets ("[ ]") and parentheses ("( )"). But definitely, alternating delimiters make long single-line nested macros easier to read, understand, edit and debug; and at least two alternative delimiters should be built-in!

I noticed that in your example there are !rem_* and !add_* variants of these macros with parameters. How would they work? Would they remove/add specific delimiters without affecting the remaining ones? I.e., !macrochars (and !macrodelimiters?) would reset the accepted chars/delimiters to the ones in the passed param only, while the !rem_* and !add_* variants would allow removing or adding without affecting the rest?

bpj commented 6 years ago

so I would like to be able to

part of the sentence was lost!

"I would like to be able to redefine the set of delimiters as well."

I noticed that in your example there are !rem_* and !add_* variants of these macros with parameters. How would they work, they would remove/add specific delimiters without affecting the remaining ones? ie: !macrochars (and !macrodelimiters?) would reset the accepted chars/delmiters to the ones in the passed param only, while the !rem_* and !add_* variants will allow removing or adding without affecting the rest?

Exactly.
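The difference between the three proposed behaviours comes down to plain set operations. A minimal Python sketch, with hypothetical function names standing in for !macrochars, !add_macrochars and !rem_macrochars (none of this is pp's actual code):

```python
def set_chars(current, arg):
    """Like the proposed !macrochars: replace the whole set."""
    return set(arg)

def add_chars(current, arg):
    """Like the proposed !add_macrochars: keep the rest, add the given chars."""
    return set(current) | set(arg)

def rem_chars(current, arg):
    """Like the proposed !rem_macrochars: keep the rest, drop the given chars."""
    return set(current) - set(arg)

chars = {"!", "\\"}
chars = add_chars(chars, "&?")   # {'!', '\\', '&', '?'}
chars = rem_chars(chars, "\\")   # {'!', '&', '?'}
chars = set_chars(chars, "!")    # {'!'}
```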

tajmone commented 6 years ago

pp 2.0 implements these macros and uses "!" to call macros. !macrochars(!\) can be used in an imported file to restore the previous behaviour. Hope it works.

As soon as I got notice of the v2.0 release, I decided to update and check all the definitions in my “The Pandoc-Goodies PP-Macros Library” (they were lagging behind, and some stopped working after the v1.11fix).

Updating included extensive testing via the (pre-existing) test suite. I didn't encounter any problems, and it seems to work fine.

The only difference I noticed is that some macros that previously managed to create and then delete temporary files via !exec now seem unable to delete them. I don't know why they don't get deleted anymore, but my guess is that the file is still being used by the previously invoked command/tool. See my Issue #42 ("Add !execwait Macro") in this regard.

tajmone commented 6 years ago

How does !add_delimiters(<>(){}) decide which chars form pairs? Does it assume that each odd char is the opening delimiter, and its following (even) char the matching closing delimiter, creating a pair from every 2 contiguous chars?

Does it mean that !add_delimiters(][) would result in flipped square bracket delimiters? I.e.: !macro]param[

Will the macro accept only an even number of chars as parameter, and fail on finding duplicate chars in the param? E.g.: !add_delimiters([][})

bpj commented 6 years ago

On 2017-10-26 at 00:00, Tristano Ajmone wrote:

How does !add_delimiters(<>(){}) decide which chars form pairs? Does it assume that each odd char is the opening delimiter, and its following (even) char the matching closing delimiter, creating a pair from every 2 contiguous chars?

Exactly.

Does it mean that !add_delimiters(][) would result in flipped square bracket delimiters? I.e.: !macro]param[

Yes. As I said, one should trust users not to shoot themselves in the foot.

Will the macro accept only an even number of chars as parameter, and fail on finding duplicate chars in the param? E.g.: !add_delimiters([][})

That's the idea. The chars in every pair must be different from each other and unique among all defined delimiter/macro chars.
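The pairing and uniqueness rules just described can be sketched as follows. This is an illustrative Python function, not pp's Haskell code; `parse_delimiter_pairs` and its `taken` parameter (the chars already in use as macro chars or delimiters) are hypothetical.

```python
def parse_delimiter_pairs(arg, taken=frozenset()):
    """Split arg into (opener, closer) pairs from every two contiguous chars.

    Raises ValueError if the number of chars is odd, or if any char is
    repeated or already present in taken.
    """
    if len(arg) % 2 != 0:
        raise ValueError("expected an even number of chars, got %d" % len(arg))
    seen = set(taken)
    pairs = []
    for i in range(0, len(arg), 2):
        opener, closer = arg[i], arg[i + 1]
        for ch in (opener, closer):
            if ch in seen:
                raise ValueError("char %r is already in use" % ch)
            seen.add(ch)
        pairs.append((opener, closer))
    return pairs

print(parse_delimiter_pairs("<>(){}"))  # [('<', '>'), ('(', ')'), ('{', '}')]
print(parse_delimiter_pairs("]["))      # [(']', '[')]
```

Under these rules `[][}` is rejected because `[` appears twice, while `][` is accepted and yields the flipped pair, leaving that choice to the user.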

I've written a parser or two in my day which had to deal with arbitrary multichar delimiters. They complicate things. There are situations where they are well motivated -- particularly if you are confined to ASCII and need more than four delimiter pairs, or when you need to allow substrings of the delimiter in the delimited text -- but I don't think this is one of them. There are enough usable characters in Unicode to get by!

tajmone commented 6 years ago

I really like this!

I've written a parser or two in my day which had to deal with arbitrary multichar delimiters. They complicate things.

I've faced similar complications when writing a language definition for a syntax highlighter: I encountered a language whose different string delimiters shared some chars ("..." literal strings, and ~"..." escape strings), and it soon became a nightmare implementing states to track strings and quote escape sequences (\") within literal and escapable strings. So I'm glad you're going to enforce unique delimiter chars.

bpj commented 6 years ago

Well I'm not in a position to enforce anything since I don't know Haskell and thus can't make a PR. I'm merely suggesting.

CDSoft commented 6 years ago

Is it really useful to have add_* and rem_* macros? A single macro to change the whole char set should be enough:

!macroargs( () «» ) <-- notice that spaces are ignored, there must be an even number of non space chars
!foo(x) !foo«y»
!foo[z] <-- won't work here!
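A small Python sketch of how such a !macroargs argument could be interpreted (hypothetical helper names; pp's real parser is written in Haskell and may differ, and the nesting-aware matching in `read_argument` is an assumption). Whitespace is dropped, the remaining chars are paired, and only the listed openers start an argument afterwards.

```python
def parse_macroargs(arg):
    """Map each opener to its closer; whitespace in arg is ignored."""
    chars = [c for c in arg if not c.isspace()]
    if len(chars) % 2 != 0:
        raise ValueError("expected an even number of non-space chars")
    return dict(zip(chars[0::2], chars[1::2]))

def read_argument(text, pos, delims):
    """Return (argument, end position), or None if no argument starts at pos."""
    if pos >= len(text) or text[pos] not in delims:
        return None
    opener, closer = text[pos], delims[text[pos]]
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == opener:
            depth += 1
        elif text[i] == closer:
            depth -= 1
            if depth == 0:
                return text[pos + 1:i], i + 1
    return None  # unbalanced: no matching closer

delims = parse_macroargs(" () «» ")
print(read_argument("!foo(x)", 4, delims))  # ('x', 7)
print(read_argument("!foo«y»", 4, delims))  # ('y', 7)
print(read_argument("!foo[z]", 4, delims))  # None
```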

Currently pp does not support Unicode. That should be a separate issue if it is really needed, because it would require a lot of changes (e.g. changing the String type to Text).

tajmone commented 6 years ago

Is it really useful to have add_* and rem_* macros?

Only in projects that import macro definition modules from different sources, where a macro might need to introduce changes that won't disrupt the general context.

I'm not sure if this is a realistic scenario right now. Maybe one day there will be hundreds of independent macro libraries for users to import. When this happens, these macros would allow keeping macro modules updated with newer PP versions, or allow adjusting macro modules to work with specific PP versions.

But my guess is that, right now, authors are personally managing their macros (no matter how many files). So, if it involves lots of work, it could just be added to the wishlist of future enhancements.

CDSoft commented 6 years ago

I have added !macroargs. I think it would be dangerous to let a macro change the parser anywhere. pp uses a one-pass parser. Nested macro calls defined before changing the parser configuration will fail when executed. These macros are intended to be used at the very beginning (ideally in an imported file on the command line).
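The failure mode can be illustrated with a toy expander in Python. It is a deliberately simplified model, not pp's implementation: it assumes macro bodies are stored as raw text and only parsed when the macro is executed, using whatever macro chars are active at that moment. The `expand` function and the argument-less macros are hypothetical.

```python
import re

def expand(text, macros, macrochars):
    """Expand argument-less macro calls, parsing bodies at call time."""
    pattern = re.compile("[" + re.escape(macrochars) + r"]([A-Za-z_]\w*)")

    def replace(match):
        name = match.group(1)
        if name not in macros:
            return match.group(0)  # unknown name: leave the text as is
        # The stored body is parsed now, with the current macro chars.
        return expand(macros[name], macros, macrochars)

    return pattern.sub(replace, text)

# greet was defined while both chars were active; its body uses the backslash form.
macros = {"greet": r"Hello \name", "name": "World"}

print(expand("!greet", macros, "!\\"))  # Hello World
print(expand("!greet", macros, "!"))    # Hello \name
```

After the macro chars are reduced to "!", the nested call inside the stored body is no longer recognized, which is why the configuration is best fixed before any macro is defined.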

robinrosenstock commented 6 years ago

Well, one week has passed, so it's time to close this: my problem is solved and your implementation seems to be working.