Open siefkenj opened 7 months ago
So we have an element like <latex-image xml:id="bar">FOO</latex-image>, we checksum FOO to abc123, then save the result to .cache/latex-image/abc123.svg as well as generated-assets/latex-image/bar.svg. Then on future builds, we simply copy .cache/latex-image/abc123.svg to generated-assets/latex-image/bar.svg (or wherever it should be, in case the filename changes).
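A minimal sketch of that flow, in case it helps the discussion. This is not existing PreTeXt code: `content_hash`, `build_asset`, and `compile_fn` are hypothetical names, SHA-256 stands in for whatever checksum is actually used, and the directory defaults just mirror the paths in the example above.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def content_hash(source: str) -> str:
    """Checksum of the asset source only; the xml:id plays no part."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def build_asset(
    source: str,
    xml_id: str,
    compile_fn: Callable[[str, Path], None],
    cache_dir: Path = Path(".cache/latex-image"),
    out_dir: Path = Path("generated-assets/latex-image"),
) -> Path:
    """Compile only on a cache miss, then copy the cached SVG to its named location."""
    cached = cache_dir / f"{content_hash(source)}.svg"
    if not cached.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        compile_fn(source, cached)  # the expensive step, e.g. running LaTeX
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{xml_id}.svg"
    shutil.copyfile(cached, target)  # cheap: runs on every build
    return target
```

With this shape, renaming the xml:id only changes the copy destination; the compile step is skipped as long as the source text is unchanged.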
On March 19, 2024 10:13:06 AM PDT, Steven Clontz @.***> wrote:
> So we have an element like <latex-image xml:id="bar">FOO</latex-image>, we checksum FOO to abc123, then save the result to .cache/latex-image/abc123.svg as well as generated-assets/latex-image/bar.svg. Then on future builds, we simply copy .cache/latex-image/abc123.svg to generated-assets/latex-image/bar.svg (or wherever it should be, in case the filename changes).
I'm not sure I understand what issue this resolves. Currently, if you have an asset with xml:id="bar" (or if bar is the id of the youngest ancestor of the asset that has an xml:id), then we store the hash of the asset with the xml:id. If the author changes the asset, then the hashes won't match, so we ask for the asset to be regenerated (and put into generated-assets).
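To make the comparison concrete, here is a rough sketch of the current behavior as described above. It is an illustration only, not the actual implementation: `needs_regeneration` and the `stored_hashes` table are hypothetical names, and SHA-256 is an assumed stand-in for the real hash.

```python
import hashlib


def needs_regeneration(stored_hashes: dict[str, str], xml_id: str, source: str) -> bool:
    """True when no hash is recorded under this xml:id, or the recorded hash differs."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if stored_hashes.get(xml_id) == digest:
        return False  # unchanged asset under the same id: nothing to do
    stored_hashes[xml_id] = digest  # remember the new hash for the next build
    return True
```

Because the table is keyed by xml:id, an unchanged asset under a new id looks like a brand-new asset.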
With this proposal, we keep a copy of the generated asset in .cache. If the author changes the asset, the hash will no longer match, so we regenerate the asset (and put it in .cache and generated-assets).
In both cases, if the asset isn't changed, nothing gets regenerated.
Last case: the asset isn't changed, but the xml:id is changed. Under the current behavior, the asset is regenerated. Under the proposal, the asset isn't regenerated, but a new copy is made with the new name. I see there is an advantage here, but the disadvantage is keeping every version of the generated asset in the cache and copying over every asset from the cache to generated-assets.
What am I missing?
Another potential use-case: user has <latex-image xml:id="foo">BAR</latex-image> and later <latex-image xml:id="baz">BAR</latex-image>. Maybe it's an anti-pattern that should have been solved with an xref, but this would avoid building the same image twice.
This would also mean images are cached without assigning an ID to them.
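A small sketch of why keying on content covers both points. The helper name `group_by_source` and the sample data are made up for illustration, and SHA-256 is an assumed choice of checksum.

```python
import hashlib
from collections import defaultdict


def group_by_source(assets: dict[str, str]) -> dict[str, list[str]]:
    """Map each source checksum to every xml:id that uses that source."""
    groups: dict[str, list[str]] = defaultdict(list)
    for xml_id, source in assets.items():
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        groups[digest].append(xml_id)
    return dict(groups)


assets = {"foo": "BAR", "baz": "BAR", "qux": "OTHER"}
groups = group_by_source(assets)
# Three elements, but only two distinct sources need compiling.
print(len(assets), "elements,", len(groups), "builds")
```

The ids only matter when naming the copies; the cache key itself never needs one.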
I'm waiting on https://github.com/TeamBasedInquiryLearning/precalculus/actions/runs/9538778663 and I'm seeing a lot of duplication of assets being generated. This could probably be avoided through cleverer configuration of the action, but I still think having a .generated-cache directory that contains a bunch of ELEMENT/FORMAT/HASH.FMT files that is checked before every build and copied over (barring some kind of --force-regenerate) would be excellent.
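Roughly what I have in mind, as a sketch under assumptions rather than a design: `cached_file`, `fetch_or_build`, and `build_fn` are hypothetical names, the `force_regenerate` argument stands in for whatever a --force-regenerate option would end up being, and SHA-256 is just one possible HASH.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Callable


def cached_file(cache_root: Path, element: str, fmt: str, source: str) -> Path:
    """Build the ELEMENT/FORMAT/HASH.FMT path, e.g. latex-image/svg/<hash>.svg."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return cache_root / element / fmt / f"{digest}.{fmt}"


def fetch_or_build(
    cache_root: Path,
    element: str,
    fmt: str,
    source: str,
    target: Path,
    build_fn: Callable[[str, Path], None],
    force_regenerate: bool = False,
) -> Path:
    """Check the cache before building; always copy the cached file to target."""
    entry = cached_file(cache_root, element, fmt, source)
    if force_regenerate or not entry.exists():
        entry.parent.mkdir(parents=True, exist_ok=True)
        build_fn(source, entry)  # only runs on a miss or when forced
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(entry, target)
    return entry
```

Old entries are never overwritten by a different source, so the directory grows over time; pruning is left out of this sketch.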
Another use case: I change my sageplot from blue to green, then hate it, then change it back to blue. The old blue version is still cached so I get it immediately.
I am coming around to really liking this idea. I think this would be handled by core though, correct? So definitely something we will want to collaborate on.
> I think this would be handled by core though, correct?
💯 - and this is a good week to do it
Caching should be used in tandem with https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows to speed up CI/CD for PreTeXt projects.
Currently there is some support for rebuilding assets only if they've changed, but it seems to rely on document structure. Since assets are extracted and then compiled in isolation, I imagine if you stored <md5sum>.svg files in some .cache folder, you could just detect if the asset contents were the same and copy over the cached version instead of running compile again. This method would not rely on document structure at all.