Expandable \input (possibly with file lookup and hooks)

PhelypeOleinik commented 3 years ago

Brief outline of the enhancement

LaTeX's \input has always been troublesome inside tables: it is not expandable, so it interferes with TeX's scanning ahead for \omit and \noalign. Some borderline cases that previously worked because of how \input was implemented, like #473 or this tex.sx question, have now also shown up as failures.
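A minimal sketch of the failure mode described above. The file name tablebody.tex is made up for illustration, and the exact error text may differ between kernel versions (it is typically reported as "Misplaced \noalign"):

```latex
\documentclass{article}
% Hypothetical helper file whose first token is \hline (which uses \noalign).
\begin{filecontents}[overwrite]{tablebody.tex}
\hline
a & b \\
\hline
\end{filecontents}
\begin{document}
\begin{tabular}{ll}
  % LaTeX's \input performs a \futurelet (via \@ifnextchar), which is not
  % expandable. TeX therefore starts the cell before the file is opened,
  % and the \hline inside the file arrives too late.
  \input{tablebody}
\end{tabular}
\end{document}
```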

Currently \input does:

1. \@ifnextchar\bgroup to differentiate the \input{<file>} and \input <file> syntaxes;
2. Checks whether <file> exists and, if it doesn't, raises an error asking for another name;
3. If the file exists, sets \CurrentFile;
4. \@addtofilelist{<file>};
5. \UseHook{file/before} and \UseHook{file/before/<file>};
6. \@@input <file>;
7. \UseHook{file/after/<file>} and \UseHook{file/after};
8. Resets \CurrentFile.

Of the steps above, 3, 4, and 8 (setting and resetting \CurrentFile, and \@addtofilelist) cannot be done expandably at all. \@addtofilelist is unlikely to matter in the cases where an expandable \input is useful (most commonly tables), so the only real loss in an expandable context is \CurrentFile.

Step 1 can be done expandably with xparse, now integrated in the kernel. Step 2 can be done with \file_full_name:n (with file lookup in \l_file_search_path_seq and \input@path, as standard \input does), relying on TeX to handle the missing-file error. The hooks are already expandable by design. So the implementation of a functionally equivalent expandable \input wouldn't need much beyond what we already have.
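A rough sketch of the lookup part only (step 2), assuming a kernel with expl3 preloaded. The names \expandableinputsketch, \__sketch_input:nn and rows-sketch.tex are hypothetical and not part of any kernel interface. Hooks, \CurrentFile and the brace/no-brace detection are deliberately left out, and paths containing spaces are not handled:

```latex
\documentclass{article}
% Hypothetical helper file holding one table row.
\begin{filecontents}[overwrite]{rows-sketch.tex}
a & b \\
\end{filecontents}
\ExplSyntaxOn
% Expand \file_full_name:n first (it is usable inside e-type expansion),
% then hand both the result and the original name to the helper.
\cs_new:Npn \expandableinputsketch #1
  { \exp_args:Ne \__sketch_input:nn { \file_full_name:n {#1} } {#1} }
% If the lookup found nothing (#1 empty), fall back to the name as given
% so that TeX itself reports the missing file.
\cs_new:Npn \__sketch_input:nn #1 #2
  {
    \tl_if_empty:nTF {#1}
      { \tex_input:D #2 ~ }
      { \tex_input:D #1 ~ }
  }
\ExplSyntaxOff
\begin{document}
\begin{tabular}{ll}
  \expandableinputsketch{rows-sketch}
\end{tabular}
\end{document}
```

Everything here works by expansion alone, so TeX can open the file while it is still scanning for \omit and \noalign at the start of the cell.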

What is needed is a good interface. Here are a few options:

1. A brand-new command to be used in these places. Clean and easy, but requires document changes.
2. Some clever detection of table cells with \currentgrouptype. Clever, but prone to false positives, and doesn't cover other expandable contexts.
3. A new key-value argument to \input to signal when expandability is required, for example \input[expandable]{<file>}. Clean, clear in intention, and easily extensible if needed, but incompatible with the legacy \input <file> syntax for files whose names start with [ (is this a thing?).
4. A starred version of \input that selects the expandable version. Also clean (maybe not so clear in intention) and not as wordy as option 3, but not easily extensible. Also incompatible with the legacy syntax for files whose names start with * (even weirder than [, so I think it's okay).

Preferences toward one of the options? Other options?

blefloch commented 3 years ago

I notice another option: tell users to use the already currently working

\input filename

syntax since when there are no braces the primitive is used, if I understand correctly. I have no opinion on what's best.

josephwright commented 3 years ago

Step 1 can be done expandably with xparse, now integrated in the kernel

I'm not sure how you mean. Without an assignment, all we can do is grab an argument up-to left brace and see if that's empty. That's explicitly not in ltcmd. The 'expandable optional' stuff relies on grabbing an argument, and can't tell the difference between \foo{a} and \foo a (let alone \foo a\relax).
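A small illustration of that ambiguity, using a throwaway macro \grab (a made-up name): once TeX has grabbed an undelimited argument, the braces are gone, so both calls look identical to the macro.

```latex
\documentclass{article}
% \grab is a hypothetical macro taking one undelimited argument.
\newcommand\grab[1]{[#1]}
\begin{document}
% Both calls typeset [a]: inside \grab, #1 is the single token "a"
% whether or not it was braced in the source.
\grab{a} and \grab a
\end{document}
```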

josephwright commented 3 years ago

I notice another option: tell users to use the already currently working \input filename syntax since when there are no braces the primitive is used, if I understand correctly. I have no opinion on what's best.

Not currently true as \input as a LaTeX2e command is not expandable: one has to use \@@input.

davidcarlisle commented 3 years ago

\@ifnextchar\bgroup to differentiate the \input{<file>} and \input <file> syntaxes;

One option you didn't mention is (on platforms where you have added the primitive \input{..}) to just provide a no-@ version of \@@input and let the engine handle the braces.

PhelypeOleinik commented 3 years ago

I notice another option: tell users to use the already currently working \input filename

The primitive is used, but it's not expandable because of \@ifnextchar, which solves the problems in the linked issues (which is good!) but still allows neither file lookup nor hooks.

PhelypeOleinik commented 3 years ago

one option you didn't mention is (on platforms where you have added the primitive \input{..}) just provide a no @ version of \@@input and let the engine handle the braces.

But LaTeX's \input does a lot more than that, so this is only better than \@@input <file> because it allows spaces.

PhelypeOleinik commented 3 years ago

I'm not sure how you mean. Without an assignment, all we can do is grab an argument up-to left brace and see if that's empty. That's explicitly not in ltcmd. The 'expandable optional' stuff relies on grabbing an argument, and can't tell the difference between \foo{a} and \foo a (let alone \foo a\relax).

Sorry, I mixed stuff up. Yeah, not in ltcmd: it would need a dedicated parser. But I forgot about single-character file names :(

u-fischer commented 3 years ago

While the hook code itself is expandable, this doesn't need to be the case for the code in the hook. For example, the structuredlog package adds non-expandable code to file/before and file/after. Imho it would be better if at least the generic file hooks weren't used in such an expandable input command.
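To make the concern concrete, here is a sketch of the kind of hook code that would break an expandable \input. The counter filesread and the file hooksketch.tex are invented for this example, and this is not what structuredlog actually does:

```latex
\documentclass{article}
% Hypothetical helper file.
\begin{filecontents}[overwrite]{hooksketch.tex}
Some text.
\end{filecontents}
\newcounter{filesread}% made-up counter
% \stepcounter performs an assignment, so this hook code is not expandable
% even though \UseHook itself is.
\AddToHook{file/before}{\stepcounter{filesread}}
\begin{document}
\input{hooksketch}% fine here; at the start of a table cell the assignment
                  % would end TeX's look-ahead before the file is read
\end{document}
```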

Regarding the syntax I have a slight preference for \input[expandable]{<file>}, as it would keep the option open for other variants (e.g. nohooks).

blefloch commented 3 years ago

For the parsing issue with one-character file names, what could be done is to require that people do not follow such a single-character \input{a} directly by some text, namely we could decide that \input{a}bcd now tries loading the file "abcd". I expect that people typically have spaces after \input{a}?

1. Grab one macro argument.
2. If it is an open bracket "[", switch to how xparse expandable commands parse things to grab [...]{...} and then use the (expandable?) keyval parser to cut apart the content of the argument.
3. Now we are allowed non-expandable things. If the argument is multiple tokens (or empty, or a single space) then it was braced, so treat it as the file name.
4. Check with \futurelet whether the following token is a character: if it is, then assume that we had the primitive syntax \input abcd. Otherwise, assume that we had \input{a}, for instance \input{a}\input{b} would work correctly.
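The classification in the first three steps could be sketched roughly as follows. \classifysketch is a hypothetical name; the macro only typesets which branch it would take and reads no file. The \futurelet check of step 4 and the key-value parsing are not attempted here:

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical sketch: grab one argument and report which case applies.
\cs_new:Npn \classifysketch #1
  {
    \str_if_eq:nnTF {#1} { [ }
      { (optional~argument~follows) }
      {
        % One item only: could be \input a... or \input{a}, so the
        % following token would have to be inspected (step 4).
        \tl_if_single:nTF {#1}
          { (single~token,~inspect~what~follows) }
          { (braced~file~name) }
      }
  }
\ExplSyntaxOff
\begin{document}
\classifysketch[ \classifysketch{abc} \classifysketch a \classifysketch{a}
\end{document}
```

The last two calls land in the same branch, which is exactly the single-character ambiguity that step 4 is meant to resolve.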

davidcarlisle commented 3 years ago

@PhelypeOleinik yes but as Ulrike comments it may be best not to do any of the additional things as you can't easily control what is in the hooks. If you just had \let\expandableinput\@@input then with new engines you'd be able to use \input{tablebody} in an expandable way and that may be enough.
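A sketch of that \let approach, assuming an engine recent enough to accept a braced file name after the primitive. The file name tablebody.tex is made up for the example. No file lookup beyond the engine's own and no hooks are involved:

```latex
\documentclass{article}
% Hypothetical helper file whose first token is \hline.
\begin{filecontents}[overwrite]{tablebody.tex}
\hline
a & b \\
\hline
\end{filecontents}
\makeatletter
% \@@input is the kernel's name for the \input primitive.
\let\expandableinput\@@input
\makeatother
\begin{document}
\begin{tabular}{ll}
  % The primitive is expandable, so the file is opened while TeX is still
  % looking for \noalign, and the leading \hline is seen in time.
  \expandableinput{tablebody}
\end{tabular}
\end{document}
```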

@blefloch maybe I just misunderstood the context you are assuming, but I didn't understand "3. Now we are allowed non-expandable things." For the table use, you can't do anything non-expandable before the content of the file is seen?

PhelypeOleinik commented 3 years ago

@davidcarlisle Yeah, maybe generic file/before and file/after could be left out. The advantage of the key-val syntax is that you can do \input[with-hooks] or whatever if you need hooks or otherwise. The downside of \let\expandableinput\@@input is that it doesn't do file lookup where usual \input would, so the expl3 wrapper would be good to have, at least. Then \input[no-search] can be easily done.

What Bruno means with item 3 is that the expandable branch of \input would only be taken if \input[expandable] were used. If there is no [, then just do things the usual way.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity.


latex3 / latex2e

Expandable \input (possibly with file lookup and hooks) #514
