tectonic-typesetting / tectonic

A modernized, complete, self-contained TeX/LaTeX engine, powered by XeTeX and TeXLive.
https://tectonic-typesetting.github.io/

Handle shared resources for multiple files #985

Closed mskblackbelt closed 1 month ago

mskblackbelt commented 1 year ago

To give some context for this matter, allow me to describe my situation. I'm trying to write up a series of laboratory guides for students. Maybe someday these are compiled as part of a book, but for now, they're just a series of files using a shared format (really just a shared header and bib file). Part of this setup also sets the font to use the STIX2 text and math fonts (and I'd like to use Inconsolata or Fira Code as the monospace font).

Some of this matter was discussed in issue #933, and I was able to work around a bit of it by using the shell_escape_cwd = ".." keyword (shared header and bib file now work). However, this litters the root directory with temporary TeX files specific to each lab report. Cleaning isn't that big an issue, but it is an additional issue spawned by this workaround. What I can't get to work is using the fontspec package to specify the STIX2 fonts. This used to work, but now I get errors like

warning: open of input /Users/<username>/Projects/357_experiment_guides/guide_src/shared/STIXTwoMath-Regular.otf:0-UCS32-Add failed
caused by: access to the path `/Users/<username>/Projects/357_experiment_guides/guide_src/shared/STIXTwoMath-Regular.otf:0-UCS32-Add` is forbidden

These fonts are installed by the system, but trying to use the system locations results in the same issue. I placed them in the shared resources folder in my project in an attempt to circumvent this problem, but was unsuccessful.

Any suggestions on a better way to proceed are welcome. I love the compilation speed of Tectonic and would like to incorporate all of this into a reproducible resource for others, but I certainly can't release it in the current state.

mskblackbelt commented 1 year ago

For reference, I'm using Tectonic 0.12.0 on macOS 13.0.1 on an M1 Mac. The StixTwo fonts are installed natively at /System/Library/Fonts/Supplemental, but I've also downloaded the static OTF files from the Stix GitHub repo.

vlasakm commented 1 year ago

If it is just the fonts, using the font files from Tectonic bundles seems easiest - STIX fonts are packaged on CTAN, thus bundled by Tectonic. Though sadly one has to use the right font file names, not font names. See mainly https://github.com/tectonic-typesetting/tectonic/issues/965, though https://github.com/tectonic-typesetting/tectonic/issues/9 is "the original issue".

\documentclass{article}

\usepackage{fontspec}
\usepackage{unicode-math}

\setmainfont{STIXTwoText}[
    Extension      = .otf,
    UprightFont    = *-Regular,
    BoldFont       = *-Bold,
    ItalicFont     = *-Italic,
    BoldItalicFont = *-BoldItalic]
\setmathfont{STIXTwoMath-Regular.otf}

\begin{document}

text

\end{document}

Note that now that the font file names are specified correctly, if file searching works, the copies in your parent directory should be picked up instead of those in Tectonic's bundle, which may not be what you want. (I would personally suggest deleting the local copies and using the fonts from the bundle.)

Of course, file searching should work in general, not just for fonts, so your issue is probably still valid. A minimal (non-)working example would help.

mskblackbelt commented 1 year ago

I have the font options working, though the font is still pulling from my local macOS library, rather than the stix2 package from CTAN. I can't seem to find the magic incantation to get tectonic to grab the CTAN font files rather than the locally installed ones. Note that I can reference the name STIX Two Math on my system, but have to use the full STIXTwoMath-Regular to catch the file directly. Both STIXTwoText and STIX Two Text work just fine (no need to specify extension and individual styles). Neither option uses the tectonic bundle files. Is this naming behavior just because XeLaTeX was originally designed to work with the OS X font system? If so, will my font settings fail to work on a Linux system?

Thank you for the assistance on getting the fonts to work, but I still have the question/issue of how to handle local dependencies. I'd like to be able to reference a single header.tex and project.bib file for every document in the project (Lab 1, Lab 2, Lab 3, etc.), along with some font files or locally modified packages in a single Git repo. Is this possible or plausible, or is it outside the scope of this project and I need to roll my own build system (which is still a bit beyond my capabilities)?

pkgw commented 1 year ago

@mskblackbelt Thanks for your patience and persistence with this topic.

The way that I want to handle this situation in the V2 CLI is to allow workspaces to contain more than one document, and for documents within workspaces to be allowed to reference shared resource files within the workspace tree. Architecting this is a bit tricky, since I think that it is important to maintain the possibility of "detaching" a document from a workspace in a way that makes it possible to know how to build it in a standalone fashion. And there's just some tedious bookkeeping of relative paths and making sure that references don't leak outside of the workspace tree.

Anyway, I want to get that support, but all of my bandwidth is currently taken up with my work on HTML output, so I'm afraid that I am not expecting to implement it any time soon. And although I keep trying to drop hints, I'm not aware of anyone else that has felt like taking that little project on ;-)

As we briefly discussed in your other issue, I think that symlinks might solve your problem pretty effectively.

The "access to the path $X is forbidden" error happens when a document references absolute paths. Have you expressed paths as absolute paths in your source file(s)? It might be that just converting them to relative paths will fix some of those issues.
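
As a rough illustration of the absolute-to-relative conversion suggested above, here is a minimal Python sketch. The project layout and the helper name to_relative are hypothetical, and whether Tectonic then accepts the resulting relative path depends on its own access rules, which this sketch does not model.

```python
import os


def to_relative(resource: str, document_dir: str) -> str:
    """Express `resource` relative to `document_dir` instead of as an absolute path."""
    return os.path.relpath(resource, start=document_dir)


# Hypothetical layout: one lab guide sitting next to a shared/ directory.
# On a POSIX system this prints ../shared/header.tex
print(to_relative("/project/guide_src/shared/header.tex", "/project/guide_src/lab1"))
```

The relative form is what would go into \input or \addbibresource in place of the absolute path.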

Separately, an issue that I've run into when referencing fonts by filename is that, if I recall correctly, the LaTeX font handling code actually uses / as a separator, so the font files need to be accessible without any directory components in their names (e.g., no ../ or assets/) to avoid confusing the system.
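
One way to combine the symlink idea with the "no directory components" constraint is to link each shared file into the document directory under its bare file name. The following Python sketch assumes a shared/ directory next to each lab directory; the function name link_shared_files is made up for illustration, and it has not been checked against how Tectonic itself treats symlinks.

```python
import os
from pathlib import Path


def link_shared_files(shared_dir: Path, doc_dir: Path) -> list[str]:
    """Symlink every regular file in shared_dir into doc_dir under its bare file name."""
    linked = []
    for source in sorted(shared_dir.iterdir()):
        if not source.is_file():
            continue
        target = doc_dir / source.name
        if target.exists() or target.is_symlink():
            continue  # never overwrite something the document already has
        # A relative link keeps working when the whole repository is moved or cloned.
        target.symlink_to(os.path.relpath(source, start=doc_dir))
        linked.append(source.name)
    return linked
```

After running this for a lab directory, a font or header can be referenced by file name alone, for example STIXTwoMath-Regular.otf rather than ../shared/STIXTwoMath-Regular.otf.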

As for the whole scheme of naming fonts by file vs. their symbolic names, this is another area that needs development work. During the bundle creation process, it would be pretty tractable to index all of the OTF/TTF fonts in the bundle and create a table of their symbolic names. We could add a layer to the font-loading code that could use that table and then avoid the requests to the OS font handling in many many use cases. I don't think there should be anything too tricky in the implementation here, someone just needs to sit down and do it.
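
To make the lookup layer described above a little more concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the table contents, the normalization rule, and the function names are invented, and a real index would be generated from the fonts' own name tables during bundle creation rather than written by hand.

```python
def normalize(name: str) -> str:
    """Fold case and drop spaces and hyphens so 'STIX Two Math' and 'stixtwomath' match."""
    return "".join(ch for ch in name.lower() if ch not in " -")


# Hypothetical table that bundle creation could emit: symbolic name -> file in the bundle.
BUNDLE_FONT_INDEX = {
    normalize("STIX Two Math"): "STIXTwoMath-Regular.otf",
    normalize("STIX Two Text"): "STIXTwoText-Regular.otf",
}


def resolve_font(requested: str):
    """Return the bundled file for a symbolic name, or None to fall back to the OS lookup."""
    return BUNDLE_FONT_INDEX.get(normalize(requested))
```

A font-loading layer could call resolve_font first and only ask the operating system when it returns None.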

kpym commented 1 year ago

@pkgw My suggestion is the following

  1. tectonic checks all parent folders for tectonic configuration files and collects them;
  2. rebuild the configuration using the "closest to source has higher priority" rule;
  3. the configuration file can contain a path variable consisting of a set of relative paths where tectonic will look for imports and fonts before checking the default cache.

This way :

  1. This is safe because a project can't drop a configuration in the parent folder.
  2. This is flexible because each project can override the parent parameters, but it can also benefit from the parent config;
  3. This can be a solution to problems like #8.

pkgw commented 1 year ago

@kpym That is basically exactly the idea! Someone just needs to implement it.

rm-dr commented 1 year ago

I have a similar problem, which I've resolved fairly cleanly by implementing an extra_paths value in the [doc] config table.

[doc]
name = "Test"
bundle = "https://data1.fullyjustified.net/tlextras-2022.0r0.tar"

# New config key
extra_paths = [
    "../../resources"
]

[[output]]
name = "main"
type = "pdf"

This config file lives in a sub-directory of a big "workspace", with a file structure something like this:

workspace-repo-root/
├── resources/
│   └── class files, etc      <-- All documents pull files from here
│
├── Subdir1/
│   ├── this document/
│   │   ├── src/
│   │   └── Tectonic.toml     <-- this file is above
│   └── more documents...
│   
└── Subdir2/
    └── and so on...

This scheme doesn't trigger the warning in #8 if an absolute path is passed to extra_paths, but I don't think that's a problem. Most people know that absolute paths to dependencies make hard-to-copy builds.
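
For readers who want a feel for the lookup this implies, here is a Python sketch of one plausible search order. It is not the code from the branch: it assumes that extra_paths entries are interpreted relative to the directory holding Tectonic.toml and that src/ is searched first, and the name find_input is invented.

```python
from pathlib import Path
from typing import Optional


def find_input(name: str, doc_root: Path, extra_paths: list[str]) -> Optional[Path]:
    """Look for `name` in the document's src/ first, then in each extra path in order."""
    search_dirs = [doc_root / "src"] + [doc_root / p for p in extra_paths]
    for directory in search_dirs:
        candidate = directory / name
        if candidate.is_file():
            return candidate.resolve()
    return None  # the caller would fall back to the bundle
```

With the tree above, find_input("header.tex", doc_root, ["../../resources"]) would return the copy under workspace-repo-root/resources/ when src/ has no file of that name.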


I'm not sure how much I like the idea of explicit multi-file workspaces. A directory structure like the one above operates as a "workspace" already, and I don't see a reason to add a top-level Tectonic.toml. What purpose would it serve?

In fact, this is very similar to how cargo works! Take tectonic, for example: every supporting crate has a Cargo.toml containing something like the lines below:

[dependencies]
tectonic_errors = { path = "../errors", version = "0.0.0-dev.0" }

These crates don't need to worry about the Cargo.toml in their parent directory, nor the Cargo.toml in ../errors.

"one toml = one crate" is a pretty solid model.


I think that it is important to maintain the possibility of "detaching" a document from a workspace in a way that makes it possible to know how to build it in a standalone fashion.

I'd argue that there is no clean solution for this. The goals of "shared resources" and "easily detachable" form a bit of a contradiction. Since sub-documents pull files from the root, disconnecting a sub-document from a workspace will necessarily require a lot of file-sorting.

The solution above is, I think, one of the better ways to handle this. Dependencies are stated explicitly in extra_paths, and all you need to do to detach a document is copy resources/ and update the config.

ratmice commented 1 year ago

I'm not sure how I feel about workspaces; they seem easily gamed in that tectonic has no idea how you check out/distribute a document.

For example, a document could be within a workspace whose root is not committed, allowing it to reference undistributed paths. My thoughts are that if workspaces are a thing, there might be a need for some form of a tectonic -X archive command which builds a reproducible tarball, or something like tectonic -X check for verifying that the files reachable from the workspace are within a git commit.

I'm not actually convinced these thoughts are worth the effort (would people use archive over git distribution? unlikely). Check seems okay, but it makes tectonic git-opinionated, and it is a decent amount of effort. So I mostly intend here to just point out that workspaces seem easily gamed to me in a way that could hinder reproducibility.

Edit: Here is a proposal that I could imagine working without suffering from the issue where people distribute only incomplete subdirectories of a workspace. We restrict things to a single top-level Tectonic.toml. Currently, toml files have a single [doc] entry; we could add an alternative [workspace] key that has a docs array. Unlike cargo, those docs would not have their own Tectonic.toml but would be declared inline in the top-level workspace file. I feel that should be sufficient. Some future tectonic archive-esque command could then detach individual docs while tarring up the files within the top-level workspace that are required to rebuild them. Curious what others think. In particular, what I fear is that if we allow workspaces with multiple Tectonic.toml files, people are going to use workarounds like putting a Tectonic.toml in their home directory to bypass the intent.

ratmice commented 1 year ago

@rm-dr I made a fork/patch of your branch at https://github.com/ratmice/tectonic/tree/extrapaths which restricts it to subpaths of src_dir, using the cap_std library. Besides my own opinion that it is the right thing to do, I feel it has a much better chance of inclusion than allowing arbitrary paths. Were you planning on making a PR for your branch? (It doesn't do anything about lookups/symlinks within the extra_paths, but neither does src... that would be a bigger patch.)
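
The containment rule itself can be illustrated with a plain path check in Python. This is a different technique from the capability-based cap_std approach used in the fork: it only compares resolved paths, and the helper name is_within is made up. Note that Path.resolve() follows symlinks, which goes beyond what the patch is described as doing.

```python
from pathlib import Path


def is_within(root: Path, candidate: str) -> bool:
    """True if `candidate`, taken relative to `root`, still resolves inside `root`."""
    resolved_root = root.resolve()
    # Joining an absolute `candidate` discards `root`, so absolute paths are
    # accepted only when they already point inside the root.
    resolved = (resolved_root / candidate).resolve()
    return resolved == resolved_root or resolved_root in resolved.parents
```

An entry such as ../../resources would be rejected by this check, while resources or assets/fonts would pass.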

rm-dr commented 1 year ago

@rm-dr I made a fork/patch of your branch at https://github.com/ratmice/tectonic/tree/extrapaths which restricts it to subpaths of src_dir, using the cap_std library.

Should this be an error or a warning? I'd argue for the latter.

ratmice commented 1 year ago

Should this be an error or a warning?

It probably should. Being tired, I couldn't for the life of me figure out error_chain, and I rationalized that they'll get an error when they try to use it. For example, none of the PATH-like variables on Unix raise an error for invalid paths in them, and TEXINPUTS in particular behaves exactly this way: touch ./foo.tex && TEXINPUTS="doesnt_exist" xelatex foo.tex results in

This is XeTeX, Version 3.141592653-2.6-0.999993 (TeX Live 2021/CVE-2023-32700 patched) (preloaded format=xelatex)
 restricted \write18 enabled.
entering extended mode
! I can't find file `foo.tex'.
<*> foo.tex

So, I'd say there is some history to it behaving the way I did it :shrug:...