artisticat1 / obsidian-tikzjax

Render LaTeX and TikZ diagrams in your notes
MIT License

Support Offline Operation #3

Closed: iamrecursion closed this issue 2 years ago

iamrecursion commented 2 years ago

Currently, Obsidian Tikzjax fetches TikZJax from a CDN. This means that if the internet is unavailable for whatever reason it's impossible to render diagrams. Would it be possible to include the script with the plugin?

iamrecursion commented 2 years ago

Oh, please ignore. I missed the section at the bottom of the readme where you mention working on bundling the files. I'm really sorry for the noise!

artisticat1 commented 2 years ago

No problem! It's probably easy to miss, being at the bottom of the readme.

We can keep this issue open to track it / for visibility.

iamrecursion commented 2 years ago

Okay, that'd be brilliant! Might I ask what the current sticking point is? I may be able to help some if I know where you're at.

artisticat1 commented 2 years ago

I'd definitely appreciate any help!

Currently, building TikZJax produces an entire folder of assets (with some files referencing each other). I think the best approach would be to build TikZJax, but instead compile everything into a single .js file. This file could then easily be inlined in the plugin, enabling offline access. However, I'm not familiar with webpack, so I'm not certain on how to do this.

The entry point of TikZJax is tikzjax.js, which sets up a Worker from run-tex.js. I think we might need to inline the run-tex.js code here somehow?

In addition, run-tex.js calls fetch to fetch all the TeX files it needs to compute the TikZ diagram output. These files include core.dump.gz, tex.wasm.gz, and all the TeX package files in the tex-files folder. Using fetch won't work if we want TikZJax to work offline. So I believe we'll need to replace that fetch call, and inline all those files here too.

iamrecursion commented 2 years ago

Currently, building TikZJax produces an entire folder of assets (with some files referencing each other). I think the best approach would be to build TikZJax, but instead compile everything into a single .js file. This file could then easily be inlined in the plugin, enabling offline access. However, I'm not familiar with webpack, so I'm not certain on how to do this.

I'm somewhat unfamiliar with TikZJax. Does it compile purely to JS, or does it also have a WASM blob component? Either way, Webpack can handle bundling it into a single file as far as I know. You can probably do this as a post-build step if you specify the entry point.

The entry point of TikZJax is tikzjax.js, which sets up a Worker from run-tex.js. I think we might need to inline the run-tex.js code here somehow?

Maybe I'm missing something, but why are you saying it'd need to be inlined? Is it due to the fact that there are issues with workers, or just a matter of the bundling?

In addition, run-tex.js calls fetch to fetch all the TeX files it needs to compute the TikZ diagram output. These files include core.dump.gz, tex.wasm.gz, and all the TeX package files in the tex-files folder. Using fetch won't work if we want TikZJax to work offline. So I believe we'll need to replace that fetch call, and inline all those files here too.

I think it'd definitely need to be replaced, but beyond the need to work offline I don't know if the Obsidian plugins system provides any mechanism for downloading data files in addition to the standard main.js, styles.css and manifest.json. Do you know if you can bundle additional files to be downloaded with a plugin, or would it be a matter of base64 encoding the data?

Such a thought actually gives rise to a potentially more simple stopgap solution if working out how to webpack it all proves problematic. Could you potentially base64 encode the entire archive into main.js and then unpack it on the plugin's first load?

artisticat1 commented 2 years ago

I'm somewhat unfamiliar with TikZJax. Does it compile purely to JS, or does it also have a WASM blob component? Either way, Webpack can handle bundling it into a single file as far as I know. You can probably do this as a post-build step if you specify the entry point.

I see, thank you! TikZJax compiles to two .js files, which load a gzipped .wasm file.

Maybe I'm missing something, but why are you saying it'd need to be inlined? Is it due to the fact that there are issues with workers, or just a matter of the bundling?

Ah, I could be wrong here. I was just thinking that loading another file from a URL (${urlRoot}/run-tex.js) would be problematic.

I don't know if the Obsidian plugins system provides any mechanism for downloading data files in addition to the standard main.js, styles.css and manifest.json.

It doesn't, no. The standard approach to bundling extra assets with a plugin appears to be base64 encoding them/inlining them, as you say.

Such a thought actually gives rise to a potentially more simple stopgap solution if working out how to webpack it all proves problematic. Could you potentially base64 encode the entire archive into main.js and then unpack it on the plugin's first load?

That definitely seems possible. (Alternatively, we could have the plugin download the TikZJax files on first use and store them in the plugin folder.)

This still leaves the issue of replacing that fetch call, though, since fetching local files doesn't work. I thought about replacing it with something like TikzjaxPlugin.app.vault.adapter.readBinary(file), which returns the contents of file, where file is a file inside the user's vault. However, TikZJax doesn't have access to TikzjaxPlugin, or the Obsidian App, so I'm not sure that that's possible.

Can we get Obsidian to host/serve the files somehow (so that fetch works)?

iamrecursion commented 2 years ago

I see, thank you! TikZJax compiles to two .js files, which load a gzipped .wasm file.

This sounds like a fairly trivial application of webpack or similar. I think the argument you want is -s SINGLE_FILE=1. I'm not currently in a position to try this myself, unfortunately!

Ah, I could be wrong here. I was just thinking that loading another file from a URL (${urlRoot}/run-tex.js) would be problematic.

Ah, yes this is likely to cause problems, so I agree that it'd need patching. Would it perhaps be possible to maintain a set of patches? Or are you building from your fork already and hence it's even simpler?

It doesn't, no. The standard approach to bundling extra assets with a plugin appears to be base64 encoding them/inlining them, as you say.

I thought not. How annoying. We at least know that such an approach probably works; we'd not be the first to do it here.

That definitely seems possible. (Alternatively, we could have the plugin download the TikZJax files on first use and store them in the plugin folder.)

Downloading them on the first use may well be a good option. The only issue here is that we'd need to have some kind of updater logic to check for updates to those files and grab new ones if there are. I can think of a few ways to make this less burdensome for users, but that's to visit if we decide to go down this route.

This still leaves the issue of replacing that fetch call, though, since fetching local files doesn't work. I thought about replacing it with something like TikzjaxPlugin.app.vault.adapter.readBinary(file), which returns the contents of file, where file is a file inside the user's vault. However, TikZJax doesn't have access to TikzjaxPlugin, or the Obsidian App, so I'm not sure that that's possible.

I think that's unlikely, unfortunately. You might get it to work on desktop, but I think content security policies would potentially make it hard. I don't think it would work on mobile at all.

Which means probably needing to patch it somehow. What's the entry point to the tikzjax code like? Could we add an additional argument to the initialisation that gives access to TikzjaxPlugin and hence App?

artisticat1 commented 2 years ago

This sounds like a fairly trivial application of webpack or similar. I think the argument you want is -s SINGLE_FILE=1. I'm not currently in a position to try this myself, unfortunately!

I'll give it a shot soon!

Ah, yes this is likely to cause problems, so I agree that it'd need patching. Would it perhaps be possible to maintain a set of patches? Or are you building from your fork already and hence it's even simpler?

Yep, I'm building from a fork, so no problems with that.

Downloading them on the first use may well be a good option.

Indeed. An advantage of this approach is that users can easily add any extra LaTeX / TikZ packages they want to TikZJax, simply by copying the package files to the tex_files folder.

The only issue here is that we'd need to have some kind of updater logic to check for updates to those files and grab new ones if there are. I can think of a few ways to make this less burdensome for users, but that's to visit if we decide to go down this route.

Ah, good point. Perhaps we could have a setting to auto-check for updates every so often.

This still leaves the issue of replacing that fetch call, though, since fetching local files doesn't work. I thought about replacing it with something like TikzjaxPlugin.app.vault.adapter.readBinary(file), which returns the contents of file, where file is a file inside the user's vault. However, TikZJax doesn't have access to TikzjaxPlugin, or the Obsidian App, so I'm not sure that that's possible.

I think that's unlikely, unfortunately. You might get it to work on desktop, but I think content security policies would potentially make it hard. I don't think it would work on mobile at all.

Which means probably needing to patch it somehow. What's the entry point to the tikzjax code like? Could we add an additional argument to the initialisation that gives access to TikzjaxPlugin and hence App?

Not sure whether you meant to reply to this or my question beneath it about hosting the files. I'll assume the latter, since it makes more sense considering you ask about App afterwards. Correct me if I'm wrong!

I just discovered that we can actually access App via window.app -- so TikZJax does actually have access to it.

This seems promising, since we could then replace fetch(file) with window.app.vault.adapter.readBinary(file). The only problem with this is that this code is run inside a web worker, and web workers don't have access to window. So it seems like this approach would be a bit more challenging (think I'd have to transfer the file data to the worker via worker.postMessage or something).
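As a rough illustration of the postMessage idea above, here is a minimal sketch of the main-thread side. The function name `attachAssetBridge` and the message shapes (`read-asset`, `asset`) are made up for this example and are not part of TikZJax or Obsidian; `readBinary` is passed in so that the plugin could supply something like `window.app.vault.adapter.readBinary`.

```javascript
// Main-thread side of a hypothetical bridge: the worker asks for a file by
// path, and the main thread (which can see window.app) replies with its bytes.
// All names and message shapes here are assumptions made for illustration.
function attachAssetBridge(worker, readBinary) {
  worker.onmessage = async (event) => {
    const { type, path } = event.data;
    if (type !== "read-asset") return; // ignore unrelated messages

    // readBinary is expected to resolve to an ArrayBuffer.
    const buffer = await readBinary(path);

    // Listing the buffer as transferable hands it to the worker without a copy.
    worker.postMessage({ type: "asset", path, buffer }, [buffer]);
  };
}
```

Inside the plugin this might be wired up as `attachAssetBridge(worker, (p) => window.app.vault.adapter.readBinary(p))`, with the worker sending `{ type: "read-asset", path }` wherever it currently calls fetch. Note that assigning `onmessage` replaces any handler already set on the worker, so a real patch would need to merge this with TikZJax's existing message handling.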


In summary, it looks like there are two approaches we could take:

A. webpack everything into a single .js file, base64 encoding and inlining files where necessary. We can then inline this .js file in the plugin, enabling offline access.

B. Download/unpack and store the TikZJax files in the plugin folder, and have TikZJax run from those files. Patch the relevant function calls so that files aren't being fetched from URLs (might be a bit complex). Enables users to easily add their own LaTeX packages, but we'll need to think about how we distribute updates.

I think option A looks a bit easier and like the "cleaner" solution to me. So I think I'll give it a try soon and see how it goes!

iamrecursion commented 2 years ago

I think (A) is definitely the cleaner solution, but I feel like it's not quite as simple as it seems. Unless I'm missing something, you still need to patch a few calls in TikZJax in order to remove the fetch calls before bundling things, no? Or are you counting that under the "base64 encoding and inlining files where necessary"?

I should have a bit more time in a few days, so if you keep this issue updated with any progress you have I should be able to muck in then!

artisticat1 commented 2 years ago

Or are you counting that under the "base64 encoding and inlining files where necessary"?

That's what I meant, yes.

I've made progress! I managed to bundle everything into a single file and got the plugin working offline. The way I've done the bundling isn't very clean or elegant, though, and could probably be improved. I'll explain my approach here.

There are two places in TikZJax we need to patch:

  1. The fetch call inside loadDecompress.

To patch this, I created a dictionary whose keys are the filenames of TikZJax assets and whose values are the base64-encoded contents of those files, like this: [image]

I then import this dictionary in run-tex.js. I replaced the fetch call so that whenever a file needs to be fetched, we simply look up its base64-encoded value in the dictionary instead, obtaining the file that way.

This works! But it's probably a bit dirty. Having learnt slightly more about webpack now, I think we should probably use one of the loaders provided by webpack (or asset modules?) instead -- does that sound right?

  2. The initialisation of a web worker inside tikzjax.js created from run-tex.js.

Fortunately, webpack comes with a number of ways to bundle web workers.

In the end, I resorted to manually inlining the run-tex.js file output by webpack after a build, and using this method to create a Worker from a string of code. Not sure if you have any better ideas or suggestions here.
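For readers following along, the first patch (replacing fetch with a dictionary lookup) could look roughly like the sketch below. This is not the code from the fork: `ASSETS`, `base64ToBytes` and `loadBundledFile` are hypothetical names, and the single entry is just the base64 encoding of the text "hello" standing in for a real TeX asset.

```javascript
// Hypothetical asset table: file name -> base64-encoded file contents.
// In a real build this table would be generated automatically. The entry
// below is the base64 encoding of the ASCII text "hello", used as a stand-in.
const ASSETS = {
  "example.txt": "aGVsbG8=",
};

// Decode a base64 string into raw bytes.
function base64ToBytes(base64) {
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Stand-in for the patched fetch: look the file up instead of requesting it.
function loadBundledFile(name) {
  const base64 = ASSETS[name];
  if (base64 === undefined) {
    throw new Error(`No bundled asset named ${name}`);
  }
  return base64ToBytes(base64);
}
```

The resulting Uint8Array can then be handed to whatever decompression step previously consumed the fetch response. `atob` is available in web workers, so the lookup can live next to the code it replaces.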

iamrecursion commented 2 years ago

I've made progress! I managed to bundle everything into a single file and got the plugin working offline. The way I've done the bundling isn't very clean or elegant, though, and could probably be improved.

Though it's a bit hacky, the fact that it works is great! As long as the dictionary build is automated it seems like a perfectly fine starting point to me. That said, you're right that we could probably do it more cleanly with webpack. To my understanding, we want the asset/inline method for v5's asset loader, as you've pointed out.

In the end, I resorted to manually inlining the run-tex.js file output by webpack after a build, and using this method to create a Worker from a string of code. Not sure if you have any better ideas or suggestions here.

Unfortunately I don't. I've spent some time digging and I've come to the exact same conclusion as you. If we want to bundle it all into a single file, we end up stuck with loading it from a string or blob. To that end, it may be best to treat the worker as another asset for the asset loader, so that there's a unified method to inline the worker into the file. From there you can wrap the code in a Blob and pass it to URL.createObjectURL, as described in the SO answer you linked to.
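A minimal sketch of that Blob approach, assuming the worker's source is already available as a string (for example, inlined at build time). The function names are illustrative only.

```javascript
// Turn worker source code (a string) into a blob: URL.
function workerUrlFromSource(source) {
  const blob = new Blob([source], { type: "text/javascript" });
  return URL.createObjectURL(blob);
}

// Browser-only: start a Worker from inlined source code.
// (Worker is a browser global; it is not defined in plain Node.js.)
function startInlineWorker(source) {
  return new Worker(workerUrlFromSource(source));
}
```

In the plugin this would be called with the bundled run-tex.js source in place of the `${urlRoot}/run-tex.js` URL. The blob URL stays alive until it is revoked with `URL.revokeObjectURL` or the page is closed.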

iamrecursion commented 2 years ago

Oh, and I forgot to say! It's amazing that you've made so much progress! I don't know if my code-level help is at all useful at this point, given I'm now available to help! Let me know if there's any way I can help beyond continuing to be a sounding board here!

artisticat1 commented 2 years ago

To my understanding, we want the asset/inline method for v5's asset loader, as you've pointed out.

Ah yes, thank you! I suppose we'd replace the values in the dictionary with the file contents not as inlined base64 encoded strings, but as variables holding the base64 encoded contents, each imported via

import tikzjax_file_name from './../tex_files/tikzjax_file_name.gz';

To that end, it may be best to treat the worker as another asset for the asset loader, that way there's a unified method to inline the worker into the file.

Ah, that's what I'm doing! Sorry I neglected to mention it. I ended up creating two separate webpack configs to ensure that the run-tex.js worker is built before tikzjax.js, which depends on run-tex.js being built so it can inline it.
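To make the setup described here concrete, the following is a sketch of how the two builds might be expressed with webpack 5 asset modules. It is an assumption-laden illustration, not the fork's actual configuration: the `src` and `dist` paths and the config names are invented, and it uses a single file exporting two configs (ordered with `dependencies`) rather than two separate config files.

```javascript
// webpack.config.js (sketch; paths and names are assumptions)
const path = require("path");

const dist = path.resolve(__dirname, "dist");

// Build 1: the worker, with gzipped TeX assets inlined.
const workerConfig = {
  name: "worker",
  mode: "production",
  target: "webworker",
  entry: "./src/run-tex.js",
  output: { path: dist, filename: "run-tex.js" },
  module: {
    rules: [
      // Each imported .gz file becomes a base64 data URI string.
      { test: /\.gz$/, type: "asset/inline" },
    ],
  },
};

// Build 2: the entry point, which inlines the already-built worker.
const mainConfig = {
  name: "main",
  dependencies: ["worker"], // compile the worker config first
  mode: "production",
  entry: "./src/tikzjax.js",
  output: { path: dist, filename: "tikzjax.js" },
  module: {
    rules: [
      // Import the built worker file as a plain string of source code.
      { test: /run-tex\.js$/, include: dist, type: "asset/source" },
    ],
  },
};

module.exports = [workerConfig, mainConfig];
```

With `asset/inline`, the imported value is a full data URI (a `data:...;base64,` prefix followed by the payload), so the lookup code would need to strip everything up to the comma before decoding.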

Let me know if there's any way I can help beyond continuing to be a sounding board here!

Will do :)

Another minor issue: the final bundle output by webpack in production mode is 6.6 MB (11.2 MB if we include fonts!), which is a bit hefty. There probably isn't much we can do to reduce this, is there? (The un-bundled files come to 5.1 MB and 8.6 MB including fonts, so that's an extra 2.6 MB coming from the base64 encoding.)

iamrecursion commented 2 years ago

Ah yes, thank you! I suppose we'd replace the values in the dictionary with the file contents not as inlined base64 encoded strings, but as variables holding the base64 encoded contents, each imported via

import tikzjax_file_name from './../tex_files/tikzjax_file_name.gz';

Yes, exactly! It'd certainly make things a bit cleaner, as webpack then handles a bunch of the gubbins for us.

Ah, that's what I'm doing! Sorry I neglected to mention it. I ended up creating two separate webpack configs to ensure that the run-tex.js worker is built before tikzjax.js, which depends on run-tex.js being built so it can inline it.

I guess this is the point at which package.json needs to start including commands for this kind of thing :'D

Another minor issue: the final bundle output by webpack in production mode is 6.6 MB (11.2 MB if we include fonts!), which is a bit hefty. There probably isn't much we can do to reduce this, is there? (The un-bundled files come to 5.1 MB and 8.6 MB including fonts, so that's an extra 2.6 MB coming from the base64 encoding.)

Yeah fundamentally I don't think there's much we can do for this. Base64 encoding is pretty efficient as a textual encoding, so unless we can pack things further (which I don't think we can) it's just kinda "how things are". The whole download things method would be a bit smaller on disk in the end, but much more maintenance-heavy for users, so I think where we are is definitely the optimal solution.
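The size increase is roughly what base64 predicts: every 3 bytes of input become 4 output characters, so fully encoded data grows by about a third. By that estimate 8.6 MB would become at most about 11.5 MB, which is in the same range as the 11.2 MB reported above (presumably not every byte of the bundle is base64). A small sketch of the arithmetic, with a hypothetical helper name:

```javascript
// Length of the base64 encoding (with padding) of byteCount bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Check the formula against a real encoding of 1000 bytes.
const sample = "x".repeat(1000);
const encodedLength = btoa(sample).length; // 1336 characters

// Overhead as a fraction of the original size (about one third).
const overhead = base64Length(1000) / 1000 - 1;
```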

artisticat1 commented 2 years ago

Yeah fundamentally I don't think there's much we can do for this. Base64 encoding is pretty efficient as a textual encoding, so unless we can pack things further (which I don't think we can) it's just kinda "how things are". The whole download things method would be a bit smaller on disk in the end, but much more maintenance-heavy for users, so I think where we are is definitely the optimal solution.

I see!

Good news -- I tidied up the inlining of assets with webpack, and with that, everything's complete! I just released v0.3.0 of the plugin, which supports offline operation.

Thanks for all your help :D

iamrecursion commented 2 years ago

Amazing! Now I can start to poke at #4!