asciidoctor / asciidoctor-diagram

:left_right_arrow: Asciidoctor diagram extension, with support for AsciiToSVG, BlockDiag (BlockDiag, SeqDiag, ActDiag, NwDiag), Ditaa, Erd, GraphViz, Mermaid, Msc, PlantUML, Shaape, SvgBob, Syntrax, UMLet, Vega, Vega-Lite and WaveDrom.
http://asciidoctor.org
MIT License
445 stars 109 forks source link

Changes in Gnuplot `load`ed files do not show up #436

Closed ManDay closed 1 year ago

ManDay commented 1 year ago

Changes in a Gnuplot loaded file only show up if the .asciidoctor/diagram cache gets deleted. To reproduce, create two files, test.adoc

= TEST

[gnuplot]
....
load "test.gp"
plot '-'
1 2
3 4
....

and test.gp

unset border

compile using asciidoctor -a data-uri -r asciidoctor-diagram -o ./result/test.html test.adoc; the plot will show without border. Comment out the line, # unset border, and run the same asciidoctor command again. The plot will remain without border.

ManDay commented 1 year ago

I would guess that the same applies to changes in plotted datasets. Short of some (I suppose rather sophisticated) detection of linked resources by asciidoctor-diagram, it seems like the only solution is to not cache anything which could link untracked files?

It's not great, but all alternatives (including the one where the user has to explicitly designate a diagram as "no-cache") seem worse.

ManDay commented 1 year ago

I figure I should justify why I file this as a bug, given that you're obviously already aware of it.

My reasoning is that, principally, cache is an implementation detail which is invisible to the user who only reads the documentatino about in- and out-data (i.e. creating diagrams in asciidocs is the only public semantic and how asciidoctor-diagram represents itself).

You could choose to promote the workings of the cache to a top-level priority, describing it and the managing requirements for the user right on the first page along with how asciidoctor-diagram creates diagrams. But (fortunally), you don't, so the normal user isn't expected to know anything about it. Therefore, for the normal user, in a normal usecase (which includes, say, plotting data sets and loading gnuplot headers, in my opinion), the above behaviour is a serious aberration to the point where they are unable to use asciidoctor-diagram for their purpose. Unless they become acquainted with the implementation details of the cache.

I think only the opposite version would remedy the situation, where the cache is a-priori disabled, but the user can choose to become an advanced user with knowledge about how the cache works. They can then activate the cache (globally, by CLI, or per-diagram?) to achieve better performance, as their better understanding allows them to.

I hope this makes sense to you, too. I know that disabling the cache by default sounds bad at first, but given that one can "earn" it back by learning how to manage the cache, this seems like the better option than demanding that the user learn how to manage the cache anyway, because otherwise they risk (unknowingly) getting the wrong results.

pepijnve commented 1 year ago

It's a general issue for any syntax that can reference content in other files. To cache correctly the extension would need to interpet the diagram itself to find all the dependencies and check if any have changed. That requires interpreting the syntax itself, and that's something I really don't want to get in to. I tried that with PlantUML and it proved to be a very bad idea. You don't want to reimplement the tool you're calling out to.

I'm reluctant to disable the cache entirely by default. It was added for good reason. When working on the diagrams themselves it can get in the way, but when authoring the surrounding text and rerendering the final document it prevents the rendering process from getting prohibitively slow when your document contains many diagrams. Damned if you cache, damned if you don't... My assumption was always that if the cache gets in the way, it's trivially solved by an rm -rf .asciidoctor-diagram before rendering. There should be some mention of this in the documentation then.

It does make sense to provide a means of controlling the caching behaviour though. I'll start by adding that in the usual way via document attributes.

ManDay commented 1 year ago

My assumption was always that if the cache gets in the way, it's trivially solved by an rm -rf .asciidoctor-diagram before rendering.

I get what you mean, but I'm worried that in practice a normal user will not notice when it happens. I was still in the process of figuring out some plotting style and edited obvious properties, so it became apparent to me when it stopped working (but it took me a while to understand what's going on!).

What if someone who is confidently using asciidoctor, oblivious to the cache behaviour, updates, say, their data set to add new data they have acquired? They will think everything is okay and publish the faulty result. That's why I think the default should be not-caching. The user can equally easily just call asciidoctor with -a diagram-cache or something like that, if we're taking the case which you described, where the surrounding text changed.

That would make the user responsible for making sure the cache works as intended. The documentation about what has to be minded would be written in the documentation of :diagram-cache:. A stale cache would be the fault of the user because they didn't respect the documentation of the very option they are using. Whereas right now, the user gets erroneous results without being at fault.

pepijnve commented 1 year ago

This gem has existed for almost 10 years now and the cache has been in there from the very beginning. If it was that big of an issue, I would expect to have received many more complaints than just a single one. A sample of one is not sufficient for me to change the default behaviour.

ManDay commented 1 year ago

I didn't hope you would change it over the number of complaints, but because it makes sense.

I'll alias asciidoctor to purge the cache, so I'll be fine, but I'm sorry to hear that you don't take this under consideration.

pepijnve commented 1 year ago

I am taking it into consideration, but everything's a trade-off. For first renders or CI jobs the cache is a non issue. For local incremental renders, not caching makes most situations unusably slow on any non trivial document. Making that case fast by default is a reasonable default to me. My working assumption is that most edits are made to the main document content itself, not the diagrams.

Providing the option to not cache and documenting the default setting seems like a good improvement.

Changing the default behaviour impacts all current users so I'm really not willing to change that in a point release.

pepijnve commented 1 year ago

2.2.13 will let you disable caching (and any old matching cache file also gets cleared as part of this) by setting the document attribute diagram-nocache-option, gnuplot-nocache-option (for gnuplot specifically), or you can add %nocache or opts=nocache at the individual block level.