James-Yu / LaTeX-Workshop

Boost LaTeX typesetting efficiency with preview, compile, autocomplete, colorize, and more.
MIT License

[Proof of concept] In-built Latex compilation #119

Closed jabooth closed 5 years ago

jabooth commented 7 years ago

My experiments with shipping a cross-platform minimal LaTeX toolchain have gotten pretty far now, and I have a working prototype:

https://github.com/latexjs/latexjs

The README there hopefully does a reasonable job of explaining the idea of the project, and what it presently can and cannot do. I'll now try and break down what this is and what I think we could do with it wrt LaTeX Workshop:

What is Latexjs?

To summarise, the idea is to make a variant of TeX Live which has the following properties:

  1. The 'binaries' (pdflatex etc) are actually Node.js modules (cross compiled with Emscripten). This has three benefits:

    • You get consistent behaviour across platforms (no MiKTeX vs MacTeX vs TeX Live differences, or even Windows vs Linux vs macOS differences!).
    • It's easy to ship something that reliably works (one binary runs on all three platforms).
    • You have the opportunity to add a useful caching FS (see point 2 below) which would be very challenging to do with native binaries.
  2. Only the exact files required from TeX Live are downloaded for compilation. This is a remarkably small amount in my experience. Even working across different packages, I don't think most users' Latexjs 'installs' would go above 100MB for everything. Mine is currently 42MB.

  3. The whole toolchain is in one directory (~/.latexjs by default). Removal is as simple as deleting that directory. We do not run an installer, touch the registry on Windows, or modify the $PATH on UNIX.

  4. Updates are handled by the toolchain itself, including upgrades between TeX Live versions. All files are checksummed so we can verify users' installations and have easy, reliable upgrades.
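
To make point 4 concrete, here is a minimal sketch of how checksummed files could be verified. The hash algorithm (SHA-256), the manifest shape, and the helper names `sha256OfFile` and `findBadFiles` are assumptions for illustration; the thread does not say how Latexjs actually stores or checks its checksums.

```javascript
const crypto = require('crypto')
const fs = require('fs')
const path = require('path')

// Hypothetical helper: hex SHA-256 digest of a file's contents.
function sha256OfFile(filePath) {
    return crypto.createHash('sha256').update(fs.readFileSync(filePath)).digest('hex')
}

// Hypothetical manifest shape: { 'relative/path': '<hex digest>', ... }.
// Returns the relative paths that are missing or whose digest differs,
// i.e. the files an updater would need to download again.
function findBadFiles(rootDir, manifest) {
    return Object.keys(manifest).filter(relPath => {
        const fullPath = path.join(rootDir, relPath)
        return !fs.existsSync(fullPath) || sha256OfFile(fullPath) !== manifest[relPath]
    })
}
```

An empty result would mean the installation matches the manifest; anything else is a list of files to re-fetch.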

How does this fit in to LaTeX Workshop?

The above properties change the dynamic of installing a LaTeX distribution considerably. 'Installation' is the same across all platforms, and only downloads around 5MB of files to a single directory that can be easily removed. After install we can depend on having a Latex compile toolchain that works exactly the same way across all platforms. If we had confidence such a solution worked reliably and in a performant manner, it could be possible for LaTeX Workshop to include support for downloading and managing Latexjs automatically, so for end users Latex compilation 'just works' with LaTeX Workshop installed and nothing else.

What are the current downsides/limitations?

  1. Latexjs is untested. I've been trying it out on my macOS/Linux/Windows 10 boxes, but until it's been tested by some intrepid beta testers in more scenarios it's hard to know if there are edge cases we haven't considered.

  2. Performance is not as good as native. Presently it's around 2-3x slower than using native solutions. This should improve quite significantly with WASM (WebAssembly), as currently we pay a fairly large startup cost in interpreting the asm.js code. I'm also no expert on Emscripten, and I'm sure we can get someone with more expertise to help us profile and get it faster. I think we can realistically aim for a sub-1.5x performance penalty, which, given the significant upsides of having a turnkey solution, might be enough for many users to consider this instead of having to install, maintain, and update a full LaTeX installation.

  3. Servers are required. I'm currently paying to run three $5/month instances whilst we test all this. I don't think the running costs will be massive (everything is compressed and only the absolute minimal required is presently downloaded), but longer term I would have to figure out a model to keep this running and up to date for the community if we feel it is useful.

So what is this issue about then?

For now, Latexjs can be tested with LaTeX Workshop without any modification to the extension. To test this, users can just follow the installation instructions in the README, and then set their toolchain configuration as follows:

macOS example:

    "latex-workshop.latex.toolchain": [
        {
            "command": "node",
            "args": [
                "/Users/jabooth/.latexjs/apps/latex.js",
                "compile",
                "./thesis.tex"
            ]
        }
    ]

Windows example:

    "latex-workshop.latex.toolchain": [
        {
            "command": "node",
            "args": [
                "C:\\Users\\jabooth\\.latexjs\\apps\\latex.js",
                "compile",
                "thesis.tex"
            ]
        }
    ]

A few notes:

  1. This requires a node interpreter on the $PATH. If we were to integrate with LaTeX Workshop we would use the node interpreter shipped with VS Code instead.
  2. The path to the file has to be relative to the workspace and all included files need to be inside the current workspace.
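
The macOS and Windows examples differ only in how the path to latex.js is written. As a rough sketch, the same entry could be generated on any platform from the user's home directory. The helper name `latexjsToolchainStep` is made up for illustration, and the `~/.latexjs/apps/latex.js` location is taken from the examples above.

```javascript
const os = require('os')
const path = require('path')

// Hypothetical helper: builds one toolchain step in the shape shown above.
// `texFile` must be relative to the workspace (see note 2).
// `homeDir` defaults to the current user's home directory.
function latexjsToolchainStep(texFile, homeDir = os.homedir()) {
    return {
        command: 'node',
        args: [path.join(homeDir, '.latexjs', 'apps', 'latex.js'), 'compile', texFile],
    }
}

// Print a settings.json fragment for the current machine.
const fragment = { 'latex-workshop.latex.toolchain': [latexjsToolchainStep('thesis.tex')] }
console.log(JSON.stringify(fragment, null, 4))
```

`JSON.stringify` takes care of escaping the backslashes on Windows, which is the easiest part to get wrong when editing the setting by hand.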

I'm basically raising this issue to see if others think that this approach is interesting and something we should pursue. I would love it if others who were willing started to test out Latexjs and report issues for anything that doesn't work (please report on latexjs/latexjs and not here!). Beware this is a beta, so backup first and ideally test in a VM! I've been using it extensively the last few weeks without issue though.

If we find that people like the model and find that in testing it's working well, we could consider moving forwards with 'productising' it into something that just works.

Really interested to hear people's thoughts!

jabooth commented 7 years ago

OK @James-Yu that's updated now! Sorry I went off into the wilderness to work on this for a few weeks, really interested to hear what you think!

tmblazek commented 7 years ago

Reporting in as tester. I certainly know secondary PCs where "latex-on-demand installation" would be beautiful.

tmblazek commented 7 years ago

Ran 2 IEEE template conference papers and a beamer presentation through it. So far nothing out of the ordinary, and the installation takes up 49M.

jabooth commented 7 years ago

That's great to hear @tmblazek, thanks so much for the feedback. Few Q's if you don't mind:

  1. What platform are you on?
  2. How does the performance 'feel' - is it good enough to be workable as a first version in your opinion? Obviously we want to get it faster, just don't know whether to roll something out sooner and get people testing and interested, or hold back till I can get the performance improved (you only get one chance to make a first impression, don't want that to be 'well this is too slow')

tmblazek commented 7 years ago

  1. macOS sierra. But I also have a Linux install, may try to set that up too.
  2. Hm, for my applications (5-10 pages, nothing too heavy) on a quite powerful MacBook Pro, it's a couple of seconds either way. Might be 3-4 vs 10 seconds, but nothing drastic. Nothing that feels like I couldn't switch.

Edit: And point 2. was for two runs. if that isn't needed it's about half, so negligible.

James-Yu commented 7 years ago

Nicely done! Let me think how we can integrate that in a non-intrusive way...

jabooth commented 7 years ago

Thanks @James-Yu!

Yes I've been thinking about this, and this would be my current suggestion:

  1. We make a PR which adds a single new configuration option for now:
    "latex-workshop.latexjs.enable": true/false

    which of course would default to false, and be clearly documented as a beta feature.

For simplicity, for now I would say that enabling latexjs is a binary choice. If it's on, the usual latex toolchain setting is ignored, and instead LaTeX Workshop will call the latex.js compile command whenever compilation is needed. Every time this happens, if latexjs is not installed, the latex.js file is downloaded to a temp file and executed using the running node interpreter to set up the toolkit.
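
The on/off behaviour described here could be sketched roughly as follows. Only the setting names come from this thread; the function `planCompile`, its return shape, and the bare `'latex.js'` path are placeholders for illustration, not anything that exists in LaTeX Workshop.

```javascript
// Hypothetical helper: decides what to run for one build.
// `settings` mirrors the user's settings.json; `latexjsInstalled` says whether
// the Latexjs directory already exists. The returned shape is illustrative only.
function planCompile(settings, latexjsInstalled, texFile) {
    if (settings['latex-workshop.latexjs.enable'] !== true) {
        // Flag off (the default): Latexjs is ignored and the usual toolchain runs.
        return {
            runner: 'toolchain',
            bootstrap: false,
            steps: settings['latex-workshop.latex.toolchain'] || [],
        }
    }
    // Flag on: the user's toolchain setting is ignored. If Latexjs is missing,
    // the caller would first download latex.js and run it to set things up.
    return {
        runner: 'latexjs',
        bootstrap: !latexjsInstalled,
        steps: [{ command: process.execPath, args: ['latex.js', 'compile', texFile] }],
    }
}
```

Keeping the decision in one pure function like this would make it easy to add a fallback mode later without touching the rest of the build code.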

If we did this, and did some testing with a few beta testers without major issues, we could roll it out in an update to the marketplace, but we could only mention its existence in the changelog (and be very clear it's a beta!). Some of our more technical, passionate users would perhaps see it in the changelog and give it a go, and be able to offer us feedback. I don't think we should mention it in the README at all until it's undergone more testing; we only want more technical people to give it a go until we are sure it works well.

The nice thing with this approach in my opinion is that:

  1. It's a fairly minimal change to the extension. By having it be on/off for now, it keeps it simple. In the future if all goes well we might want to have latexjs be a fallback, so users can always compile for sure, but if they have a native latex installation working we prefer that. I would suggest it's better to start simple with the binary enable/disable though and work up to this (we can always add more options under the "latex-workshop.latexjs.*" namespace!)
  2. Only technical users will find it and try it out to start with
  3. If that one configuration flag is disabled, LaTeX Workshop completely ignores Latexjs and behaves as normal.

I would also suggest we look at: https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools

as an example of an extension that downloads a good amount (50MB or so I think) of support files on launch.

James-Yu commented 7 years ago

I was trying to implement the said feature but had a problem. It seems that the nodejs bundled in vscode is not accessible from the extension. Any idea? @jabooth

jabooth commented 7 years ago

I had success using 'process.execPath' at least on macOS:

https://nodejs.org/api/process.html#process_process_execpath

Is this what you tried @James-Yu?

James-Yu commented 7 years ago

It points to the executable binary of vscode on my Win. I need to find out if it can be used as a nodejs executable.

edit: code latex.js opens the file. Yet to find a way to access node.

jabooth commented 7 years ago

How about 'child_process.fork'?

https://nodejs.org/api/child_process.html#child_process_child_process_fork_modulepath_args_options

James-Yu commented 7 years ago

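It doesn't seem to work well:

    const cp = require('child_process')
    const path = require('path')

    // `root` is not defined in this snippet; it comes from the surrounding extension code.
    const currentProcess = cp.fork(path.resolve(path.dirname(root), 'latex.js'), ['install'])

    // Log any error raised while spawning the child process.
    currentProcess.on('error', err => {
        console.log(err.message)
    })

    // Log the exit code and terminating signal once the child exits.
    currentProcess.on('exit', (code, signal) => {
        console.log(code, signal)
    })

This outputs 1 null, and no stdout or stderr is given.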
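
One way to see why the child exits with code 1 would be to capture its output explicitly. The sketch below swaps `fork` for `child_process.spawnSync` so that stdout and stderr come back as strings, and sets Electron's `ELECTRON_RUN_AS_NODE` environment variable so that an Electron binary at `process.execPath` runs the script as plain Node. The helper name `runWithOwnRuntime` is made up, and this has not been tried inside the extension host, so treat it as a debugging aid rather than a fix.

```javascript
const { spawnSync } = require('child_process')

// Hypothetical helper: runs a script with the current process's own runtime binary
// and returns its exit status together with everything it printed.
// ELECTRON_RUN_AS_NODE asks an Electron binary to behave like plain Node;
// a regular Node binary does not use the variable.
function runWithOwnRuntime(scriptPath, args = []) {
    const result = spawnSync(process.execPath, [scriptPath, ...args], {
        env: { ...process.env, ELECTRON_RUN_AS_NODE: '1' },
        encoding: 'utf8', // return stdout/stderr as strings instead of Buffers
    })
    return {
        code: result.status,
        signal: result.signal,
        stdout: result.stdout,
        stderr: result.stderr,
    }
}
```

For example, logging `runWithOwnRuntime(pathToLatexJs, ['install']).stderr` (with `pathToLatexJs` pointing at the downloaded file) would show whatever error message the script printed before exiting.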

tamuratak commented 5 years ago

I'm closing this issue. Feel free to reopen it if there is any progress.