adam-p / markdown-here

Google Chrome, Firefox, and Thunderbird extension that lets you write email in Markdown and render it before sending.
http://markdown-here.com
MIT License
59.64k stars, 11.26k forks

Switch TeX renderer to CodeCogs #261

Open adam-p opened 9 years ago

adam-p commented 9 years ago

The Google Image Charts API has been deprecated and could disappear at any time. This will wreck the rendering of MDH math images. See discussion of alternatives in issue #144.

In the short term (and probably long term) we are going to switch to using CodeCogs.

cben commented 9 years ago

[picking up from #144]

In exchange for that effort, I'll offer you this data point and possible solution to our woes: If you render to GIF instead of PNG (so, gif.latex), and you use \inline, the letters look... perfect.

I'm now skilled at finding a problematic example by typing into the editor with any setting :-) [though I have yet to find any with default dpi]

P vs Q

Will be back after I post the bug and some ideas about baselines on CodeCogs forums.

cben commented 9 years ago

BTW, I thought the Charts API would be gone around now, but their deprecation notice has changed to:

Important: While the dynamic and interactive Google Charts are actively maintained, we officially deprecated the static Google Image Charts way back in 2012. This gives us the right to turn it off without notice, although we have no plans to do so.

cben commented 9 years ago

Reported at http://www.codecogs.com/pages/forums/pagegen.php?id=2870

A new gem is simultaneous under- and over- sized gifs: http://latex.codecogs.com/gif.latex?%5Cdpi%7B200%7D%20z z http://latex.codecogs.com/gif.latex?%5Cdpi%7B200%7D%20a a (this is the normal size) http://latex.codecogs.com/gif.latex?%5Cdpi%7B200%7D%20N N :-)

cben commented 9 years ago

CodeCogs responded to that report in May 2015. Back then I saw it became much better but I still caught a couple cases. I now tried again and couldn't find any problems. Have you tried / seen any problems since then?

OTOH, google's deprecation notice has changed to this:

While the dynamic and interactive Google Charts are actively maintained, we officially deprecated the static Google Image Charts way back in 2012. This gives us the right to turn it off without notice, although we have no plans to do so.

so there is little pressure.

cben commented 5 years ago

FWIW there is a new notice saying:

Warning: This API is deprecated in 2012 and was turned off on March 18, 2019. Please use the actively maintained Google Charts API instead.

However the linked thread later says on March 20:

The API was turned back on temporarily and may be shut off again without notice.

:laughing: and https://chart.googleapis.com/chart?cht=tx&chl=\displaystyle\sum_{n=1}^\infty%20\frac{1}{n} works for me as of 2019-03-16.

Possibly they realized people need more warning, but they're absolutely not committing to any specific time it'll stay up, and clearly do intend to sunset it eventually...

cben commented 5 years ago

Other options:

@parpalak, thanks for the high-quality project!
Do you think it'd be reasonable to default Markdown Here browser extension to use tex.s2cms.com? Or do you recommend CodeCogs?

The extension's main use is probably for composing emails, but it's also usable on multiple blog / CMS sites. It's important for the math to be readable years after the documents were created / sent; if you fear you'll have to turn it off / block it, it's better for MDH to choose CodeCogs now. It's hard to estimate the extra load you'd get from this but basically, it'll keep increasing with time, and will be spread across many web sites. @adam-p can you give any order of magnitude on MDH usage (and specifically math usage)?

P.S. the math image URLs generated by Markdown Here are configurable, and individuals will be able to point it to tex.s2cms.com anyway. The question is about defaults.

cben commented 5 years ago
uetchy commented 5 years ago

@cben How many requests should I expect from your app each day? I've been hosting API on "pay-as-you-go" platform so that number is so important to me😁

parpalak commented 5 years ago

@cben thanks for your question. However, if it's hard to estimate, it's hard for me to answer :)

I think the additional load would not be significant since Markdown (and TeX) is a little bit geeky. I'm still not going to close the service, as the FAQ says.

cben commented 3 months ago

Well I tried MDH again and found the Google API is finally and completely dead.
At least let's switch default to something like

<img src="https://latex.codecogs.com/png.image?\dpi{120}\inline&space;{urlmathcode}" alt="{mathcode}">
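For illustration, here is a minimal sketch of how a template like the one above could be filled in. The thread does not show how MDH actually substitutes the placeholders, so `fillMathTemplate` and `escapeHtmlAttr` are hypothetical helpers, and the assumption is that `{urlmathcode}` takes the URL-encoded TeX while `{mathcode}` takes an HTML-escaped copy for the `alt` text.

```javascript
// Hypothetical helper (not MDH's code): escape text for use inside a
// double-quoted HTML attribute.
function escapeHtmlAttr(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: fill an MDH-style math template.
// Assumption: {urlmathcode} = URL-encoded TeX, {mathcode} = escaped TeX.
// Function replacers are used so "$" in the TeX is not treated specially.
function fillMathTemplate(template, tex) {
  return template
    .replace(/\{urlmathcode\}/g, () => encodeURIComponent(tex))
    .replace(/\{mathcode\}/g, () => escapeHtmlAttr(tex));
}

const template =
  '<img src="https://latex.codecogs.com/png.image?\\dpi{120}\\inline&space;{urlmathcode}" alt="{mathcode}">';

console.log(fillMathTemplate(template, 'a < b'));
```

Since the template is user-configurable, keeping the substitution this dumb means switching renderers is only a matter of changing the default string.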

I looked again into baseline alignment and found a dirty-but-pragmatic kludge: pad the images vertically inside TeX so we can simply center them all. After much research (https://tex.stackexchange.com/a/722922/7262, https://www.overleaf.com/read/tmhcwxwdntnh#f8a862), the way that works (so that the renderer does not auto-crop the padding) is adding paired delimiters and coloring them white:

<img class="math" src="https://latex.codecogs.com/png.image?\inline&space;\dpi{120}\bg{white}\color{white}\left|\color{black}{urlmathcode}\color{white}\right|" alt="{mathcode}" align="middle">

or

<img class="math" alt="{mathcode}" src="https://i.upmath.me/png/%5Cinline%5Ccolor%7Bwhite%7D%5Cleft%7C%5Ccolor%7Bblack%7D{urlmathcode}%5Ccolor%7Bwhite%7D%5Cright%7C" align="middle" style="vertical-align: middle;">

vertical-align support even in email seems good.
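As a rough sketch of the padding kludge, the wrapping can be expressed as a small function. The helper names are made up for illustration; the TeX wrapper and the i.upmath.me URL prefix are taken from the templates above.

```javascript
// Hypothetical helper: surround the TeX with white \left| ... \right|
// delimiters so the rendered image gets symmetric vertical padding that
// the renderer does not crop away.
function padWithWhiteDelimiters(tex) {
  return '\\color{white}\\left|\\color{black}' + tex + '\\color{white}\\right|';
}

// Hypothetical helper: build the i.upmath.me PNG URL used in the
// template above for a given TeX snippet.
function upmathPngUrl(tex) {
  return 'https://i.upmath.me/png/' +
    encodeURIComponent('\\inline' + padWithWhiteDelimiters(tex));
}

console.log(upmathPngUrl('\\frac{1}{n}'));
```

The image is then centered with `align="middle"` / `vertical-align: middle`, as in the templates above.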

Looking like this: gmail screenshot

As you see, this approach inherently tends to waste vertical space unnecessarily. The resolution also doesn't look good on modern high-DPI screens. But still an improvement?


To improve on resolution, we'd want to resize each image in JS, and if you're getting into that it's maybe worth getting the actual baseline from the renderer.
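A minimal sketch of the resizing idea: request the image at a multiple of the intended resolution (say, double the `\dpi` value) and display it at the correspondingly smaller CSS size. This assumes the renderer's pixel dimensions scale linearly with the requested dpi, which has not been checked here; both function names are hypothetical.

```javascript
// Hypothetical helper: given the pixel size of an image that was rendered
// at `scale` times the intended resolution, return the size to display at.
function cssSizeForScaledImage(naturalWidth, naturalHeight, scale) {
  if (!(scale > 0)) {
    throw new RangeError('scale must be a positive number');
  }
  return {
    width: Math.round(naturalWidth / scale),
    height: Math.round(naturalHeight / scale),
  };
}

// Browser-only usage sketch (defined but not called here): shrink a loaded
// <img> element to its intended display size.
function shrinkToCssSize(img, scale) {
  const size = cssSizeForScaledImage(img.naturalWidth, img.naturalHeight, scale);
  img.width = size.width;
  img.height = size.height;
}

console.log(cssSizeForScaledImage(240, 60, 2));
```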

adam-p commented 1 week ago

Well, this is terribly overdue.


The GmailTeX extension uses the WordPress TeX renderer. It's blurry by default, but I haven't evaluated it beyond a simple glance. It seems to be part of their paid service, and I have strong doubts that us using it by default would be what they want/intend.

<img src="https://s0.wp.com/latex.php?zoom=3&bg=ffffff&fg=000000&s=0&latex={urlmathcode}" alt="{mathcode}">

@uetchy's renderer produces crisp output.

<img src="https://math.vercel.app/?bgcolor=auto&from={urlmathcode}.png" alt="{mathcode}">

Five years ago they asked:

How many requests should I expect from your app each day? I've been hosting API on "pay-as-you-go" platform so that number is so important to me

The answer is: We don't know. There are about 100,000 Markdown Here users. I would guess fewer than 5% use the TeX math feature. And how many do those people send each day? I don't know. I suspect that you'd be rendering at most 5,000 of our images each day, and probably much lower. But MDH doesn't collect any usage/telemetry, so we don't really know.
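Spelled out, the estimate above is just the following arithmetic. The figure of one image per math user per day is an assumption implied by the 5,000 upper bound, not a measured number.

```javascript
// Back-of-the-envelope upper bound, using the guesses from the comment above.
const users = 100000;               // "about 100,000 Markdown Here users"
const mathUserPercent = 5;          // "fewer than 5%" taken as an upper bound
const imagesPerMathUserPerDay = 1;  // assumption implied by the 5,000 figure

const mathUsers = users * mathUserPercent / 100;
const imagesPerDay = mathUsers * imagesPerMathUserPerDay;

console.log(mathUsers, imagesPerDay); // 5000 5000
```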

Unless uetchy is totally on-board, we're obviously not going to mess with their server bill.


This 2022 post suggests that email support for SVG is just as bad as it was 5 years ago.


Looking briefly at MathJax's CommonHTML suggests that they use custom HTML elements. That certainly won't work with email.


I briefly looked around with KaTeX again. As @cben said years ago, we'll have to figure out how to inline the styles. I feel like that might not be so hard programmatically, but a) cben encountered difficulty with that last time, and b) it's going to make for significant rendered HTML bloat (even more than MDH already suffers).


So, CodeCogs still seems like the best bet, right now.

@cben I'm probably not going to try to figure out the baseline stuff right now. I don't remember it well, and this really needs to get fixed. (And, frankly, I need to do some quick-and-easy work on MDH to get back into remembering how to do it.)

parpalak commented 1 week ago

Hello!

What would an ideal API for integration look like for you, and what should it output? If a small amount of work is required, I can update my service, https://i.upmath.me/, to provide such an API.

Also, I think you could convert SVG to PNG directly in your extension using the canvas element, while also scaling for high-density screens. Here’s a minimal example generated by ChatGPT: https://codepen.io/parpalak/pen/vYopPxB
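For reference, a rough sketch of that canvas approach (this is not the code from the linked CodePen, and the function names are made up). It assumes the SVG text and its intended display size are already known, and it only runs in a browser context with a DOM.

```javascript
// Pure helper: canvas pixel size for a given CSS size and pixel density.
function canvasPixelSize(cssWidth, cssHeight, scale) {
  return {
    width: Math.ceil(cssWidth * scale),
    height: Math.ceil(cssHeight * scale),
  };
}

// Browser-only sketch (defined but not called here): rasterize SVG text
// into a PNG data: URL at `scale` times the CSS size.
async function svgToPngDataUrl(svgText, cssWidth, cssHeight, scale) {
  const blob = new Blob([svgText], { type: 'image/svg+xml' });
  const url = URL.createObjectURL(blob);
  try {
    const img = new Image();
    await new Promise((resolve, reject) => {
      img.onload = resolve;
      img.onerror = () => reject(new Error('could not load SVG'));
      img.src = url;
    });
    const size = canvasPixelSize(cssWidth, cssHeight, scale);
    const canvas = document.createElement('canvas');
    canvas.width = size.width;
    canvas.height = size.height;
    canvas.getContext('2d').drawImage(img, 0, 0, size.width, size.height);
    return canvas.toDataURL('image/png');
  } finally {
    URL.revokeObjectURL(url); // release the temporary object URL
  }
}

console.log(canvasPixelSize(100, 20, 2));
```

In a page, `scale` would typically come from `window.devicePixelRatio`, though for email the recipient's screen is unknown, so a fixed value such as 2 may be the more practical choice.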

cben commented 1 week ago

The trouble with generating images in the extension (which is possible without servers at all!) is what to do with it then.

In principle MDH targets many sites; on most, an inline data: URL would probably work. But email (esp. gmail) is both a major use case for MDH, and a worst-case of support :-( Including images in the mail would obviously be way better than referring to external servers! Alas, data URLs in emails are still very badly supported; there what you need is an actual email attachment (of particular cid flavour that renders inline), but @adam-p said he never figured out how to make gmail.com JS do that. Simply pasting an image into their contenteditable does NOT add an attachment: it looks good on editing but when you send, the image is not there.

So, alas, MDH only knows how to point src= at some external server :-(

adam-p commented 4 days ago

@cben is correct about client-side rendering, but when he said "Simply pasting an image", it prompted me to look at how Gmail handles pasting. I've often pasted images (e.g., screenshot fragments) into Gmail and they've been handled properly.

It looks like Gmail a) uploads the image, and b) creates a blob URL that it puts into an img, like <img src="blob:https://mail.google.com/<guid>" data-surl="cid:<random>">. That URL points at a PNG. (The URL in src isn't actually remote; it's generated by URL.createObjectURL and actually fetches from local memory.) So I tried creating a blob URL in JS and sticking it in a compose body. The image looks okay when composing but is absent when sending. What's probably happening is that the upload is providing the content ID ("cid"), which is the "real" image, and the blob URL is just for local display. The one I'm creating doesn't have a proper cid that refers to a remote object, so it just disappears.

Maybe we could hack in the events or upload requests that are required to store the image, but that seems like folly.

What we really want (for some definition of "want") is to do a "real" paste -- as if the user hit ctrl+v. This is something I investigated years ago in the context of pasting the whole rendered content into Google Docs.

Generally speaking, the clipboard API is for writing to and reading from the clipboard -- and then creating an element with the contents -- but what we need is to simulate a real user-initiated paste, so we get the upload, cid, etc. And... it seems that we can do that with document.execCommand("paste"). This requires more permissions, is marked as deprecated, and is generally discouraged, but it nominally works. In background script code, I had it put PNG data into the clipboard and then paste it into a Gmail message in such a way that the image survived sending. So that's cool.
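A minimal sketch of that experiment, under several assumptions: the PNG starts out as a base64 data: URL, the compose field has focus, and the extension has whatever clipboard permissions the browser demands. The function names are hypothetical and this is not the code that was actually tested.

```javascript
// Decode a base64 data: URL into raw bytes.
function dataUrlToBytes(dataUrl) {
  const comma = dataUrl.indexOf(',');
  if (comma === -1 || !dataUrl.slice(0, comma).endsWith(';base64')) {
    throw new Error('expected a base64 data: URL');
  }
  const binary = atob(dataUrl.slice(comma + 1));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Browser-only sketch (defined but not called here): put the PNG on the
// clipboard, then trigger a paste at the current cursor position.
// document.execCommand is deprecated; it returns false if the command
// was not carried out.
async function pastePng(pngBytes) {
  const blob = new Blob([pngBytes], { type: 'image/png' });
  await navigator.clipboard.write([new ClipboardItem({ 'image/png': blob })]);
  return document.execCommand('paste');
}

// The 8-byte PNG signature, just to exercise the decoder.
console.log(dataUrlToBytes('data:image/png;base64,iVBORw0KGgo=').length);
```

Note that this overwrites whatever the user had on the clipboard, which a real implementation would probably want to save and restore.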

This made me think of a maybe-feasible-but-unpleasant idea: During rendering, either skip the math instances or add placeholders for them; insert the rendered HTML into the DOM. Then do a second pass, positioning the cursor at the math locations and pasting the images. Not impossible, but sure adds some complexity.

Then I noticed that navigator.clipboard.write takes an array of ClipboardItems and wondered if we could paste the images and HTML all at once. But Chrome errors when you pass more than one item in the array, and googling suggests that's a limitation.

But then... I tried pasting an image that's in <img src="data:..."> along with other HTML and it worked -- the image got a cid, etc. And that means...

Okay, short refresher: Right now when we render Markdown containing TeX, we turn the MD into HTML and the TeX into a URL to an image generated by CodeCogs (or broken Google Charts, ahem), which itself is put into the HTML. We also create an invisible div with an attribute that stores the original MD (for de-rendering), which we append to the rest of the HTML. We then put all that HTML inside a wrapper div. Then we delete the raw MD contents from the DOM, and then insert our rendered HTML into the DOM under the contenteditable element. Also of note is that we inline all of the styles on all of the elements in the HTML, because email is really bad about style sheets.

But what if, instead of that, we generated the TeX images locally, stuck them inline in the HTML, created the de-render and wrapper divs, loaded that into the clipboard, and then pasted it at the correct spot (after deleting the original)?
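Roughly, the proposed flow could look like the sketch below. Everything named here is a placeholder: the class names, the `data-md-original` attribute, and the `display:none` styling are illustrative and do not reflect how MDH actually marks up its wrapper or de-render divs, and the paste step is browser-only and untested.

```javascript
// Escape text for use inside a double-quoted HTML attribute.
function escapeAttr(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical: an <img> whose src is a locally generated PNG data: URL.
function inlineMathImg(pngDataUrl, tex) {
  return '<img src="' + escapeAttr(pngDataUrl) + '" alt="' + escapeAttr(tex) + '">';
}

// Hypothetical: wrap the rendered HTML and append an invisible div that
// stores the original Markdown for de-rendering.
function buildPasteHtml(renderedHtml, originalMarkdown) {
  const rawHolder =
    '<div class="mdh-raw" data-md-original="' +
    encodeURIComponent(originalMarkdown) +
    '" style="display:none"></div>';
  return '<div class="mdh-wrapper">' + renderedHtml + rawHolder + '</div>';
}

// Browser-only sketch (defined but not called here): load the HTML into
// the clipboard and paste it at the cursor, after the original Markdown
// has been deleted from the compose field.
async function pasteHtml(html) {
  const blob = new Blob([html], { type: 'text/html' });
  await navigator.clipboard.write([new ClipboardItem({ 'text/html': blob })]);
  return document.execCommand('paste');
}

console.log(buildPasteHtml('<p>hi</p>', '# hi'));
```

One thing to verify when testing is whether the site's paste handling preserves the inlined styles and the de-render div, since pasted HTML is often sanitized.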

Thoughts:

  1. This might work.
  2. This is a really big change. There will need to be a lot of testing on different sites to prove to ourselves that we haven't broken every site except Gmail (kind of thing). It would need to be released as a beta, or -- probably better -- behind a new flag in the options allowing people to opt in to try it.
  3. This is a lot of work for a feature that benefits only a tiny fraction of users. (Sometimes I wish I had telemetry, so I knew what "tiny fraction" really meant.)

I haven't yet done the obvious testing of doing this exact thing. But I will.

Any thoughts on the idea?