aslushnikov / latex-online

Online latex compiler. You give it a link, it gives you PDF
http://latexonline.cc
MIT License
497 stars · 89 forks

Send list of files through POST request #32

Open MonsieurV opened 7 years ago

MonsieurV commented 7 years ago

Hi again,

I have a use case that does not seem totally covered by the current API, but I'm pretty sure the service could handle it:

Given a JSON payload describing a list of files (a URL and a path for each) and a main .tex file, compile the PDF.

Basically this is #12 and #6.

Why not use laton?

Well, it is fine to use from a terminal for compiling docs, but it is not so great when interfacing directly with other programs. In that case, an HTTP interface is always better.

Ok, then, why not use the tar interface?

Having our program create a tar for each compilation is not very convenient and makes the process heavy. Again, this is fine when done under the hood by the laton script while working in the terminal.

From an app that gathers many files to be included (and handles hosting them on public URLs itself), it is more straightforward to just construct a JSON payload describing these files and POST it to the service endpoint. For text files, we could even pass the content directly in the payload.

POST API

Actually I found the clsi-sharelatex API well done on this part.

Using a POST method with a JSON payload allows us to easily pass a list of files (which would be a bit clumsy with GET query string parameters) and should not be that hard to interpret server-side:

POST /compile

{
  "command": "xelatex",
  "target": "main.tex",
  "resources": [
    {
      "path": "main.tex",
      "content": "\\documentclass{article}\n\\begin{document}\nHello World\n\\end{document}"
    },
    {
      "path": "image.png",
      "url": "www.example.com/image.png"
    }
  ]
}

It could then return a taskId as it does with any GET call.

I think this is what the /data method does, except it takes a tar instead of a JSON document with instructions on how to fetch the files.
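To make the proposal concrete, here is a minimal client-side sketch. Everything here is part of the proposal, not an existing Latex-Online API: the /compile route, the payload fields, and the `build_compile_payload` helper are all hypothetical.

```python
import json

def build_compile_payload(target, resources, command="xelatex"):
    """Build the JSON payload for the proposed (hypothetical) POST /compile route.

    Each resource is a dict with a "path", plus either inline "content"
    (for text files) or a "url" the server would fetch.
    """
    for resource in resources:
        if "path" not in resource or not ("content" in resource or "url" in resource):
            raise ValueError("each resource needs a path and either content or url")
    return json.dumps({"command": command, "target": target, "resources": resources})

payload = build_compile_payload(
    "main.tex",
    [
        {"path": "main.tex",
         "content": "\\documentclass{article}\n\\begin{document}\nHello World\n\\end{document}"},
        {"path": "image.png", "url": "www.example.com/image.png"},
    ],
)
# The payload would then be POSTed to the service with any HTTP client,
# e.g. urllib.request with a "Content-Type: application/json" header,
# and the response would carry the taskId mentioned above.
```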

The main question is: do you intend the service to be used mainly by humans? Or is integration into other apps also a use case you want to support?

aslushnikov commented 7 years ago

Hey @MonsieurV,

I'm being slow to respond - sorry about that, I'm on vacation. And thank you for the detailed description! It's a pleasure to read and easy to understand.

Indeed, today's API is supposed to be human-consumed. Having proper JSON api for other applications would be nice - I wanted it for a while, but could not justify its existence since no one was interested. Could you share your scenario so that we can account for it later?

Implementation-wise, the API itself would be easy to add. However, we have no way to track the server load from API usage on latexonline.cc, which I don't feel comfortable about: we might need to detect and restrict malicious clients of the API to keep the service generally available.

So one way to proceed would be to implement the API but forbid its usage on latexonline.cc until we come up with a sane way to identify API clients. This solution should satisfy your needs if you manage your own deployment of LatexOnline. What do you think?

MonsieurV commented 7 years ago

Speaking of being slow, I won't make any remark. :)

My scenario is:

I have a utility application to edit and produce the quotes and invoices for my consulting business. It uses YAML files as sources, which the application combines with LaTeX templates. Finally I need to produce the PDF files for the edited quote or invoice. That is where an online LaTeX compiler like Latex-Online comes into play: it allows everyone to use the application, without requiring a LaTeX distribution to be installed (which is always burdensome).
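The template-filling step of that pipeline can be sketched as follows. This is purely illustrative: the template, the field names, and the data are invented, and the YAML parsing step is stood in for by a plain dict.

```python
from string import Template

# A toy LaTeX invoice template. ${...} placeholders are filled from the
# (already parsed) YAML data; string.Template leaves LaTeX backslashes alone.
INVOICE_TEMPLATE = Template(r"""\documentclass{article}
\begin{document}
Invoice ${number} for ${client}: total ${total} EUR.
\end{document}
""")

# In the real application this dict would come from a YAML source file.
invoice_data = {"number": "2017-042", "client": "ACME", "total": "1200"}

main_tex = INVOICE_TEMPLATE.substitute(invoice_data)
# main_tex would then be sent to an online LaTeX compiler as main.tex.
```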

Returning to the implementation, it surely implies usage control. Actually, if we think about malicious clients, the API already exposes the /data method, which could be abused. If there is no usage control, it's the same deal for the /compile method.

Bottom line is: if a method is public, there is no reason for it not to be abused. The fact that it lacks a very friendly JSON API only makes it a potential victim of sabotage (rather than of proper heavy usage, which you may want to happen at some stage anyway).

To implement fair usage on this sort of API while preserving anonymous access, you may need to implement a quota per IP. I'm not even sure that is really effective (or fair, since many people share the same public IP). Maybe token-based authentication, with the website as a way to control and limit token emission: e.g. you could show a captcha when an IP has asked for more than N anonymous tokens in 24h. A token could be used multiple times, up to an arbitrary limit you define (based on the number of compilations, the compilation time, or the file weight).
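A minimal sketch of the per-token quota idea, under stated assumptions: the class name, the limits, the 24h sliding window, and the in-memory storage are all invented for illustration; a real deployment would need persistence and token issuance (the captcha part) on top.

```python
import time

class TokenQuota:
    """Track per-token compilation counts over a sliding time window.

    Purely illustrative: limits, window size, and storage are assumptions,
    not part of Latex-Online.
    """

    def __init__(self, max_compilations=100, window_seconds=24 * 3600):
        self.max_compilations = max_compilations
        self.window_seconds = window_seconds
        self._usage = {}  # token -> list of compilation timestamps

    def allow(self, token, now=None):
        """Return True and record a compilation if the token is under quota."""
        now = time.time() if now is None else now
        # Drop timestamps that have aged out of the window.
        stamps = [t for t in self._usage.get(token, [])
                  if now - t < self.window_seconds]
        if len(stamps) >= self.max_compilations:
            self._usage[token] = stamps
            return False
        stamps.append(now)
        self._usage[token] = stamps
        return True
```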

Now for my usage, I had already drafted an API of this kind when I discovered your service. It is now working for my needs. (But it is very rudimentary, naive and not at all optimized. It does not implement any usage control.)

aslushnikov commented 7 years ago

Yeah, the scenario is clear. Thanks for sharing!

Actually, if we think about malicious clients, the API already exposes the /data method, which could be abused. If there is no usage control, it's the same deal for the /compile method.

That's fair! Though I have some peace of mind: there's goaccess running to roughly estimate the /data endpoint usage, and there's Google Analytics to (mostly) keep an eye on the /compile endpoint.

Maybe token-based authentication, with the website as a way to control and limit token emission: e.g. you could show a captcha when an IP has asked for more than N anonymous tokens in 24h. A token could be used multiple times, up to an arbitrary limit you define (based on the number of compilations, the compilation time, or the file weight).

Right, I think we'll end up with some kind of tokens for the API access.

Now for my usage, I had already drafted an API of this kind when I discovered your service. It is now working for my needs. (But it is very rudimentary, naive and not at all optimized. It does not implement any usage control.)

Nice one!

yegor256 commented 5 years ago

@aslushnikov what is the status of this ticket? Would be great to have an ability to compile a package of documents, not just one.

aslushnikov commented 5 years ago

@yegor256 unfortunately no ETA for this

MonsieurV commented 5 years ago

@yegor256 if you'd like to try an API with multiple documents, you can look at Latex-On-HTTP.

There is an open alpha of the API you can use at https://latex.ytotech.com. I use it myself as an alternative to latex-online when I need something more REST-API-oriented than CLI-oriented (latex-online works really great for the latter!).

Note that the API may change and there may not be as many packages as you need.

It may also be an alternative for #51 and #42.