Open texdraft opened 3 years ago
It would also help with searching in the PDF files. I fear that the macro (?) code would bloat the `\ifacro` parts of `cwebmac.tex` some more. Maybe a separate macro file like `pdfwebtocfront.tex` might be an idea?
(Searching is a much better use case than copying and pasting; I can't believe it slipped my mind.)
I think there are numerous solutions. First of all, `\\` and `\|` can be changed purely at the TeX level, without any modifications to `CWEAVE`. (Changing `\|` is necessary for the case of `\|\_`.) However, it might be easier to have the macros deal with unescaped underscores, so `CWEAVE` might be changed to output them verbatim; `\\` and `\|` would then take care of the escaping. No special treatment would be necessary for sanitizing names: none of the delimiting characters in PDF syntax can appear in a C identifier, so no parsing is required (as far as I know).
For custom identifiers, `CWEAVE` could wrap them in a macro call to something like `\CI` (for “custom identifier”) that would look like this:
\CI{\skipxTeX}{skip_TeX} % or maybe skip\_TeX
In the output you would see `\skipxTeX` typeset, and `skip_TeX` would be the ActualText text. An alternative would be to require users who want this feature to add something to their custom identifier macro definitions that will insert the ActualText into the PDF.
One thing that had me worried was the “granularity” of ActualText, but it turns out that you can apply it to pretty much any span of text, so it could capture an entire identifier.
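As a rough illustration of the `\CI` idea, here is a minimal plain TeX sketch. It assumes pdfTeX (it relies on the `\pdfliteral` primitive), and the helper name `\ActualText` as well as the stand-in definition of `\skipxTeX` are hypothetical, not anything that exists in `cwebmac.tex` today:

```tex
% Sketch only: requires pdfTeX, because \pdfliteral is a pdfTeX primitive.
% \ActualText{<replacement text>}{<typeset material>} wraps the material
% in a marked-content span that carries an /ActualText entry.
\def\ActualText#1#2{%
  \pdfliteral direct{/Span <</ActualText (#1)>> BDC}%
  #2%
  \pdfliteral direct{EMC}%
}

% Hypothetical \CI: #1 is what gets typeset, #2 is the identifier as it
% appears in the C source. The underscore in #2 is never typeset; it is
% only written into the PDF literal, so it needs no escaping here.
\def\CI#1#2{\leavevmode\ActualText{#2}{#1}}

% Stand-in for a user's custom identifier macro (an assumption).
\def\skipxTeX{{\it skip\/\kern.05em}\TeX}

\CI{\skipxTeX}{skip_TeX}
\bye
```

Note that the escaped form `skip\_TeX` would not work as the second argument in this sketch, because `\_` would be expanded inside the literal. That is one argument for having `CWEAVE` output the underscore verbatim.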
Should `CWEAVE`'s behavior be changed, a new control code could be added that allows specifying an identifier's ActualText text, although that would probably be overkill. (I can't imagine it being very useful.)
As is decently well known, plain TeX's “underscore” character is actually a drawn rule rather than a glyph from the font.
It looks nice enough, but when copying and pasting text that has an underscore, it turns into a space. The usual way to fix this is to mess with text encoding, and I figured dealing with that would not be worth the trouble. However, I recently found out about a PDF feature called ActualText that allows specifying alternative text for an element of the page. See for example this question on TeX.SE.
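For the underscore case, the idea might look roughly like the following under pdfTeX. This is only a sketch: the names `\ActualText` and `\plainunderscore` are made up for illustration, and whether a given PDF viewer honors ActualText when copying or searching is up to the viewer.

```tex
% Sketch only: requires pdfTeX, because \pdfliteral is a pdfTeX primitive.
% Wrap typeset material (#2) in a span whose /ActualText is #1.
\def\ActualText#1#2{%
  \pdfliteral direct{/Span <</ActualText (#1)>> BDC}%
  #2%
  \pdfliteral direct{EMC}%
}

% Keep whatever definition of \_ is currently in force, then redefine \_
% so that the same material is typeset but tagged with a real underscore.
\let\plainunderscore=\_
\def\_{\leavevmode\ActualText{_}{\plainunderscore}}

An identifier such as {\it skip\_TeX\/} still looks the same on the page.
\bye
```

The appearance is unchanged because the original `\_` still does the typesetting; only the marked-content operators around it are new.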
Not only could this be used for making underscores copy as underscores, but it could also be used for identifiers formatted as `TeX`, since they could render completely differently from how they appear in the C source code. Whether it would be worth doing for characters like ∧ and ¬ (to make them copy as `&&` and `!`) is up for debate.
Is this capability something that belongs in `CWEB`, perhaps as an option (so that extra TeX would not be output if PDF is not the target)? Would it be desirable? If so, then I will implement it.