joeyespo / grip

Preview GitHub README.md files locally before committing them.
MIT License
6.42k stars 422 forks

Add Math support #369

Open Antonio-R1 opened 1 year ago

Antonio-R1 commented 1 year ago

This pull request adds support for rendering mathematical expressions with MathJax, as requested in #362. You can activate it with the --render-math option.

I tested the examples from https://github.blog/2022-05-19-math-support-in-markdown/ and https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/writing-mathematical-expressions; the result is shown in the screenshot below: [screenshot]

The MathJax script is loaded from a CDN, using the URL defined in the grip/constants.py file. If you want, I can also add a command-line option for changing the URL.

Antonio-R1 commented 1 year ago

I added a new --math-jax-url option for loading the MathJax library from a different URL. The value of the math_jax_url parameter can now also be configured in the ~/.grip/settings.py file. Alternative URLs are described in https://docs.mathjax.org/en/latest/web/start.html.
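For illustration, an override in ~/.grip/settings.py might look like the sketch below. The setting name MATH_JAX_URL is an assumption based on grip's convention of upper-case settings, so please check the settings module on the pull request branch for the actual name. The URL is a jsDelivr address for MathJax 3 and is only an example.

```python
# ~/.grip/settings.py
# NOTE: "MATH_JAX_URL" is an assumed setting name, not confirmed by this thread.
# It is meant to mirror the math_jax_url parameter / --math-jax-url option.
MATH_JAX_URL = "https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"
```

On the command line, the same URL would be passed with the --math-jax-url option together with --render-math.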

I have also modified the MathJax configuration, since inserting hyperlinks in mathematical expressions seems to be blocked on GitHub. I also tried to simulate how $$ is handled when the code block syntax is used.

However, there are still some differences between the documents rendered on GitHub and the output of the --render-math option. For example, the expressions a $1 b $2 and a 1$ b 2$ are rendered as mathematical expressions when grip is used with the --render-math option, but they should be rendered as text. On GitHub, the mathematical expressions also seem to be further sanitized to avoid any potential XSS.
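To make the two problem cases concrete, here is a minimal sketch of a stricter inline delimiter check. It uses the rule Pandoc documents for its tex_math_dollars extension (the opening $ must be followed by a non-space character, the closing $ must be preceded by a non-space character and must not be followed by a digit). This is an assumption for illustration only; it is not GitHub's documented algorithm and is not part of this pull request. The name find_inline_math is hypothetical.

```python
import re

# Pandoc-style rule (assumed here, not taken from GitHub):
#   - opening "$" is followed by a non-space character
#   - closing "$" is preceded by a non-space character
#   - closing "$" is not followed by a digit (or another "$")
INLINE_MATH_RE = re.compile(
    r"(?<!\$)\$(?=\S)((?:[^$\\]|\\.)+?)(?<=\S)\$(?![\d$])"
)


def find_inline_math(text):
    """Return the contents of all spans that the rule above treats as inline math."""
    return [m.group(1) for m in INLINE_MATH_RE.finditer(text)]


if __name__ == "__main__":
    for sample in ("a $1 b $2", "a 1$ b 2$", "where $x^2 + y^2$ holds"):
        print(repr(sample), "->", find_inline_math(sample))
```

With this rule, both expressions from the comment above are left as text, while an ordinary expression such as $x^2 + y^2$ is still detected.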

@joeyespo In the current implementation of this pull request, the MathJax library is loaded from an external URL. However, GitHub uses a custom math-renderer HTML element, which grip also returns when the --user-content option is used. The following script shows that adding two script tags is enough to render the mathematical expressions inside the math-renderer elements of grip's --user-content output, which avoids the problems described above.

Example: grip --user-content example.md --export - | python3 ./script.sh


from selenium import webdriver
from selenium.common.exceptions import JavascriptException
from selenium.webdriver.chrome.options import Options
import re
import requests
import sys
import time
import urllib.parse

"""
Reads the output of grip used with the "--user-content" option and injects "script" elements
for rendering the mathematical expressions inside the "math-renderer" custom element.
"""

options = Options()
#options.add_argument('--headless')
#options.add_argument('--disable-gpu')

driver = webdriver.Chrome(options=options)
html = urllib.parse.quote(sys.stdin.read())

# https://stackoverflow.com/questions/695151/data-protocol-url-size-limitations/41755526
driver.get("data:text/html;charset=utf-8,%s" %html)

SCRIPT_MATH_RENDERER="""

let scriptElementSrcArray = [%s, %s];

for (let scriptElementSrc of scriptElementSrcArray) {
   let scriptElement = document.createElement("script");
   scriptElement.setAttribute("src", scriptElementSrc);
   document.head.appendChild(scriptElement);
}

"""

r = requests.get('https://github.com/joeyespo/grip')
if not 200 <= r.status_code < 300:
   print("warning: status code %d" % r.status_code)

content = r.text

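# Raw strings avoid invalid-escape warnings for "\.". The trailing "|$" makes
# findall() return at least one (possibly empty) match, so [0] below cannot fail.
WP_RUNTIME_SCRIPT_RE = r'"https://github\.githubassets\.com/assets/wp-runtime-[0-9a-zA-Z]*\.js"|$'
ELEMENT_REGISTRY_SCRIPT_RE = r'"https://github\.githubassets\.com/assets/element-registry-[0-9a-zA-Z]*\.js"|$'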

wp_runtime_script = re.findall(WP_RUNTIME_SCRIPT_RE, content)[0]
element_registry_script = re.findall(ELEMENT_REGISTRY_SCRIPT_RE, content)[0]

try:
   driver.execute_script(SCRIPT_MATH_RENDERER %(wp_runtime_script, element_registry_script))
except JavascriptException as e:
   print(e)

STRING_DISCONNECTED = 'Unable to evaluate script: disconnected: not connected to DevTools'
while (len(driver.get_log('driver'))==0 or
       driver.get_log('driver')[-1]['message']==STRING_DISCONNECTED):
   time.sleep(1)

driver.quit()
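The script above relies on a small regular-expression trick: because each pattern ends in |$, re.findall always returns at least one element, so indexing with [0] never raises IndexError. The sketch below isolates that behaviour on sample strings. The helper name first_asset and the sample HTML are made up for illustration.

```python
import re

# Same shape as the pattern in the script above; the "|$" alternative
# matches the empty string at the end of the input as a fallback.
ASSET_RE = r'"https://github\.githubassets\.com/assets/wp-runtime-[0-9a-zA-Z]*\.js"|$'


def first_asset(html):
    """Return the first quoted asset URL in html, or '' if there is none."""
    return re.findall(ASSET_RE, html)[0]


if __name__ == "__main__":
    sample = '<script src="https://github.githubassets.com/assets/wp-runtime-abc123.js"></script>'
    print(first_asset(sample))           # quoted URL, usable as a JS string literal
    print(repr(first_asset("<html>")))   # empty string when nothing matches
```

Note that the returned value keeps its surrounding double quotes, which is why it can be substituted directly into the JavaScript array in the script. An empty result means the asset name changed on GitHub's side, so a caller may want to check for it before injecting the script tags.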

Since the --render-math option adds a dependency on an external URL, and grip has an --export option that inlines styles by default, I wrote the Python script below to inline the mathematical expressions as SVG images, with MathJax loaded from an external URL. Because it does not use the math-renderer HTML elements, the script has the same problems as the --render-math option described above. The --render-math option does not need to be passed to grip when using this script.

Example: grip example.md --export - | python3 script.py > example.html

I think the export function in api.py could be extended to inline the mathematical expressions as SVG images, when grip is executed with the --export option and without the --no-inline option.


from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import sys
import urllib.parse

"""
Reads the output of grip and substitutes the mathematical expressions with SVG images
"""

options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')

driver = webdriver.Chrome(options=options)
html = urllib.parse.quote(sys.stdin.read())

# https://stackoverflow.com/questions/695151/data-protocol-url-size-limitations/41755526
driver.get("data:text/html;charset=utf-8,%s" %html)

SCRIPT_MATH_JAX_SVG="""
window.doneCallback = arguments[arguments.length-1];
window.scriptElements = [];

var scriptElement = document.createElement("script");
var mathJaxConfig = `
var MathJax = {
  loader: {load: ['ui/safe']},
  tex: {
    inlineMath: [['$', '$']],
    displayMath: [['$$', '$$']],
    packages: {
      '[-]': ['html', 'newcommand', 'require']
    }
  },
  options: {
    safeOptions: {
      allow: {
        URLs: 'none',
        classes: 'none',
        cssIDs: 'none',
        styles: 'none'
      }
    }
  },
  startup: {
    pageReady: () => {
      return MathJax.startup.defaultPageReady().then(() => {
        for (let scriptElement of window.scriptElements) {
          document.head.removeChild(scriptElement);
        }
        let scriptElements = document.getElementsByTagName("script");
        for (let i=scriptElements.length-1; i>=0; i--) {
           let e = scriptElements[i];
           let srcAttribute = e.getAttribute("src");
           if (srcAttribute) {
              e.parentElement.removeChild(e);
           }
        }
        window.doneCallback();
      });
    }
  }
}`;
try {
  scriptElement.appendChild(document.createTextNode(mathJaxConfig));
} catch (e) {
  scriptElement.text = mathJaxConfig;
}

document.head.appendChild(scriptElement);
window.scriptElements.push(scriptElement);

scriptElement = document.createElement("script");
scriptElement.setAttribute("src", "https://cdnjs.cloudflare.com/ajax/libs/mathjax/3.2.2/es5/tex-svg.min.js");
document.head.appendChild(scriptElement);
window.scriptElements.push(scriptElement);

function typesetMath() {
  var preElements = document.getElementsByTagName('pre');
  for (let p of preElements) {
    if (p.lang==="math") {
      p.style.backgroundColor = "white";
      p.classList.add("mathjax_process");
      let codeElements = p.getElementsByTagName('code');
      if (codeElements.length===1) {
        let c = codeElements[0];
        c.classList.add("mathjax_process");
        let index = c.innerHTML.indexOf("$$");
        if (index>=0) {
          c.innerHTML = c.innerHTML.substring(0, index);
        }
        c.innerHTML = "$$"+c.innerHTML+"$$";
      }
    }
  }
}

typesetMath();
"""

driver.execute_async_script(SCRIPT_MATH_JAX_SVG)
print(driver.page_source)
driver.quit()
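Both scripts above pass the page to Chrome as a data: URL, and the Stack Overflow link in their comments concerns size limits for such URLs. One possible alternative, which is not what this pull request does, is to write the HTML to a temporary file and load it through a file:// URL. The helper name html_to_file_url is hypothetical.

```python
import tempfile
from pathlib import Path


def html_to_file_url(html):
    """Write html to a temporary .html file and return its file:// URL.

    The file is not deleted automatically; the caller should remove it
    once the browser has finished with it.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as f:
        f.write(html)
    return Path(f.name).resolve().as_uri()


if __name__ == "__main__":
    print(html_to_file_url("<p>$x^2$</p>"))
```

In the scripts above, the returned URL would be passed to driver.get() in place of the data: URL. A page loaded from a file:// URL has a different origin than one loaded from a data: URL, so it would be worth checking that the injected scripts still load as expected.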

Edit: The math-renderer custom HTML element is now supported when the --render-math and --user-content options are used together. The JavaScript files used for rendering the mathematical expressions are cached, as is already done for the CSS files.

When the --user-content option is not used, grip calls the GitHub Markdown API in raw mode, which according to the documentation does not support GitHub Flavored Markdown.

Therefore, if the --render-math option is used without the --user-content option, we do not get the math-renderer HTML elements from the API and the MathJax library is loaded from the math_jax_url parameter as a fallback. Furthermore, if the --export option is used in the current version of this pull request, the MathJax library is also loaded from the math_jax_url parameter instead of using the math-renderer HTML element.
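The fallback rules in this edit can be summarized as a small decision function. This is only a sketch of the behaviour described in the comment: the function name and the returned labels are hypothetical and do not appear in grip.

```python
def math_rendering_mode(render_math, user_content, export):
    """Summarize which math rendering path the comment above describes.

    Returns one of three illustrative labels:
      "none"          - --render-math was not given
      "math-renderer" - GitHub's custom element, rendered with cached scripts
      "mathjax-url"   - MathJax loaded from the math_jax_url parameter
    """
    if not render_math:
        return "none"
    # The math-renderer element is only present in --user-content output,
    # and the export path currently falls back to the MathJax URL.
    if user_content and not export:
        return "math-renderer"
    return "mathjax-url"


if __name__ == "__main__":
    print(math_rendering_mode(True, True, False))   # math-renderer
    print(math_rendering_mode(True, False, False))  # mathjax-url
    print(math_rendering_mode(True, True, True))    # mathjax-url
```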

thesofakillers commented 1 year ago

@joeyespo any thoughts on merging and releasing this PR?

kaimast commented 1 year ago

This works fine for me. Merge, please? :)

itxasos23 commented 1 year ago

This works fine for me as well. I'm using the branch to take college notes and it works like a charm.

Having this merged in main would be amazing. Can we merge?

Curve commented 1 year ago

This works really well!
Would be nice if it were merged soon ^^

cqc-alec commented 1 year ago

I'd also like to see this merged! However it appears that one of the tests is failing (test_app in tests/test_api.py). Looks like this should be easy to fix.