Closed James-Yu closed 1 year ago
Can you add some tests to this?
Here it goes!
Thanks! I am away for the next week, but I will look at this when I get back.
Hi @siefkenj, I am helping @James-Yu with the maintenance of LaTeX-Workshop, and I am also the maintainer of https://github.com/jlelong/vscode-latex-basics, which provides the built-in LaTeX syntax highlighting for VS Code.
I think the following two commands provided by the `minted` package should also be treated like `\lstinline`: `\mint` and `\mintinline`. They accept the syntax `\mintinline[⟨options⟩]{⟨language⟩}⟨delim⟩⟨code⟩⟨delim⟩`. The delimiters can be a single repeated character, just like for `\verb`. They can also be a pair of curly braces `{}`.
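To make the two delimiter styles concrete, here is a minimal sketch in plain JavaScript. `splitVerbatimBody` is a hypothetical helper, not part of minted, LaTeX-Workshop, or unified-latex. It assumes the `[⟨options⟩]{⟨language⟩}` prefix has already been consumed, and it does not try to balance nested braces.

```javascript
// Hypothetical helper: split the text that follows `\mintinline[...]{lang}`
// into its opening delimiter, the verbatim code, and the closing delimiter.
function splitVerbatimBody(rest) {
    const open = rest[0];
    if (open === undefined) {
        return null; // nothing left to read
    }
    // A `{` is closed by `}`; any other character is closed by itself, as with \verb.
    // Simplifying assumption: the first closing delimiter ends the body.
    const close = open === "{" ? "}" : open;
    const end = rest.indexOf(close, 1);
    if (end === -1) {
        return null; // unterminated verbatim body
    }
    return {
        open,
        code: rest.slice(1, end),
        close,
        remainder: rest.slice(end + 1),
    };
}

console.log(splitVerbatimBody("#code$# tail"));
console.log(splitVerbatimBody("{x+1} y"));
```

In both calls the `code` field holds the raw text between the delimiters (`code$` and `x+1`), so characters such as `$` are never interpreted as LaTeX.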
We may also consider supporting the `pythontex` package in another PR; see the discussion at https://github.com/James-Yu/LaTeX-Workshop/issues/2542#issuecomment-786692920
@jlelong Here it goes, with support for `\lstinline`, `\mintinline`, and `\mint` 😄
These macros are pretty sophisticated. I think we need to take a different approach, since the current one doesn't do any parsing of the optional arguments. Looking at the code, it appears that things like comments inside the optional arguments will be parsed as strings, among other incorrect things, like `\lstinline[foo={]}]!...!` not being parsed correctly. Also, the PEG.js grammar currently doesn't return any nodes of type `argument`; that is all left up to the ctan/packages.
Here's my proposal:
- Make a `one_square_bracket_args` rule that is similar to `verbatim_option` but matches `token` instead of `.`. This rule wouldn't produce any group; it would just return an array like `[{type: "string", content: "["}, ..., {type: "string", content: "]"}]`.
- Make a `verbatim_group` rule similar to the existing one, but make it return a group with a single string of content.
- Make a `verbatim_delimited_by_char` rule that returns a flat array of three `{type: "string",... }` objects.

With those parsing rules, we can add the special exceptions for `\lstinline` and friends. Then, add a `ctan` package that parses the arguments as usual. Since the grammar should have already parsed everything that needs to be verbatim as strings, everything should work out :-)
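As a rough illustration of the node shapes the three proposed rules would return, here is a sketch in plain JavaScript. The function names mirror the rule names but are hypothetical stand-ins for grammar actions, and representing a group as `{type: "group", content: [...]}` is an assumption modelled on the `{type, content}` objects used in this thread.

```javascript
// Build a string node in the `{type, content}` shape used in this thread.
const str = (content) => ({ type: "string", content });

// one_square_bracket_args: no group, just a flat array with the brackets as strings.
// `tokens` stands for whatever the `token` rule already produced (comments included).
function oneSquareBracketArgs(tokens) {
    return [str("["), ...tokens, str("]")];
}

// verbatim_group: a group whose content is a single string (assumed group shape).
function verbatimGroup(code) {
    return { type: "group", content: [str(code)] };
}

// verbatim_delimited_by_char: a flat array of three string nodes.
function verbatimDelimitedByChar(delim, code) {
    return [str(delim), str(code), str(delim)];
}

console.log(oneSquareBracketArgs([str("t")]));
console.log(verbatimGroup("my code"));
console.log(verbatimDelimitedByChar("#", "code$"));
```

Because the verbatim text ends up in a single string node in both cases, a later argument parser never gets the chance to reinterpret it as LaTeX.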
Quite challenging to me! Will follow the instructions shortly.
I'm having difficulty with the step "add the special exceptions for `\lstinline` and friends". It does not seem possible to return an expanded array, or to backtrack, in a PEG grammar. My current rule for `\lstinline` is
    verbatim_listings "verbatim_listings"
        = escape
            macro:"lstinline"
            option:square_bracket_argument?
            verbatim:(verbatim_group / verbatim_delimited_by_char) {
                return [createNode("macro", { content: macro }), ...option, ...verbatim]
            }
This generates the following for `\lstinline[t]#code$#`:
    {
        type: 'root',
        content: [
            [
                { type: 'macro', content: 'lstinline' },
                { type: 'string', content: '[' },
                { type: 'string', content: 't' },
                { type: 'string', content: ']' },
                { type: 'string', content: '#' },
                { type: 'string', content: 'code$' },
                { type: 'string', content: '#' }
            ]
        ]
    }
which is obviously wrong, since the content of root is an array of arrays of nodes: the inner array is not flattened.
Do you have a suggestion on how to handle this issue? Or was I taking a wrong route?
This is good progress! Try adding a `flatMap` to the result of `token*` from `root`. (Actually, any place that looks for `token` will need to handle the case of an array now, since `token` is no longer a single token.)
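The `flatMap` suggestion can be sketched in plain JavaScript as follows. `flattenTokens` is a hypothetical name, and `parsed` imitates what `token*` yields once some rules return arrays; in the real grammar the call would sit inside the rule's action.

```javascript
// Flatten the result of `token*` by one level: a rule that returned an array
// has its nodes spliced in, while a rule that returned a single node is kept as is.
function flattenTokens(parsed) {
    return parsed.flatMap((token) => token);
}

const parsed = [
    // What the `\lstinline` rule returns: an array of nodes.
    [
        { type: "macro", content: "lstinline" },
        { type: "string", content: "#" },
        { type: "string", content: "code$" },
        { type: "string", content: "#" },
    ],
    // What an ordinary token rule returns: a single node.
    { type: "string", content: "x" },
];

console.log(flattenTokens(parsed).length); // 5
```

`flatMap` only flattens one level, so the `content` array of a nested group would be left intact.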
This is starting to look pretty good :-). Can you fix the broken tests and also add some tests of weird stuff, like
` \lstinline[foo %bar
]{my code} ` and make sure that the comment parses as a comment and the newline is interpreted as a parskip.
This PR also needs the companion parts in `ctan` to attach arguments correctly.
Figured out how to use existing argument parsers! More tests pending.
In the meantime, having tested it, I don't think either `d${d}${d}` or `m` alone can cover both the `#code#` and `{code}` cases. Therefore, I still use a custom argument parser for these verbatim macros.
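For illustration only, a custom argument parser of this kind could look roughly like the plain JavaScript sketch below. It is not the code from this PR: `attachVerbatimArgs` is a hypothetical name, and the argument shape (`type: "argument"` with `openMark` and `closeMark`) is an assumption. It expects the flat node list produced by the grammar and handles both the `#code#` and the `{code}` form.

```javascript
// True when `node` is a string node with exactly this content.
const isStr = (node, content) =>
    node !== undefined && node.type === "string" && node.content === content;

// Hypothetical attacher: nodes[0] is the macro, the rest are its raw argument nodes.
function attachVerbatimArgs(nodes) {
    const [macro, ...rest] = nodes;
    const args = [];
    let i = 0;
    // Optional argument: everything up to the first top-level `]` string node.
    if (isStr(rest[0], "[")) {
        const close = rest.findIndex((n) => isStr(n, "]"));
        if (close === -1) return null;
        args.push({ type: "argument", openMark: "[", content: rest.slice(1, close), closeMark: "]" });
        i = close + 1;
    }
    const next = rest[i];
    if (next === undefined) return null;
    if (next.type === "group") {
        // `{code}` form: the group already holds a single verbatim string.
        args.push({ type: "argument", openMark: "{", content: next.content, closeMark: "}" });
        i += 1;
    } else {
        // `#code#` form: three flat string nodes (delimiter, code, delimiter).
        const [open, code, end] = rest.slice(i, i + 3);
        if (end === undefined) return null;
        args.push({ type: "argument", openMark: open.content, content: [code], closeMark: end.content });
        i += 3;
    }
    return { macro: { ...macro, args }, remaining: rest.slice(i) };
}

const s = (content) => ({ type: "string", content });
const result = attachVerbatimArgs([
    { type: "macro", content: "lstinline" },
    s("["), s("t"), s("]"), s("#"), s("code$"), s("#"),
]);
console.log(result.macro.args.length); // 2
```

Branching on the node type, rather than on a single argument specification, is what lets one parser accept both delimiter styles.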
Done!
Thanks for this! I will release a new version :-)
So many thanks! I am so glad this contribution got accepted!
This PR resolves #35
This PR adds the `\lstinline` macro from the `listings` package to the PEG.js grammar as a verbatim macro. Following the discussion in https://github.com/siefkenj/unified-latex/issues/35#issuecomment-1585037810, this is the only way to prevent `unified-latex` from parsing the contents of the macro, which should be verbatim.