outline / rich-markdown-editor

The open source React and Prosemirror based markdown editor that powers Outline. Want to try it out? Create an account:
https://www.getoutline.com
BSD 3-Clause "New" or "Revised" License

Support dynamic loading of languages for code blocks #152

Open nicklayb opened 4 years ago

nicklayb commented 4 years ago

I was a bit sad when I realized I could not write Elixir or Elm code snippets in Outline.

I didn't see any issues about adding languages to the lib, and I would like to open a PR to add those if you're good with it.

I know that the ultimate solution would be to let the user specify the supported languages but, TBH, I don't know how this could work and be efficient.

Thank you!

Edit: Makefile would be really appreciated too

tommoor commented 4 years ago

Hey Nicolas, thanks for the request – I think in order to support more languages we need to figure out how to dynamically load the language files into the editor. Once that's in place, we can support many more languages without increasing the initial download size for everyone.

I'll add these to the list of languages to add once that piece is complete!

tCm3nc commented 4 years ago

Is there a starting point you can point to? I spent some time trying to import some documentation from another wiki instance but it would not parse cleanly.

I'd be interested in investigating if there can be a solution to dynamically loading languages.

tommoor commented 4 years ago

> I spent some time trying to import some documentation from another wiki instance but it would not parse cleanly.

I'm not sure what you mean – this sounds like a separate issue? You'll need to provide the markdown that "wouldn't parse cleanly" if anything is to be fixed.

tCm3nc commented 4 years ago

Hi Tom - no, it is the same issue: the markdown I was trying to import (via cut and paste) had code blocks with languages defined, e.g. ```sh foo ```

Because Outline supports only a subset of languages, which doesn't include sh, when I tried to save my draft I got an empty page.

If you can point me in the direction of the syntax highlighting code for Outline / the markdown editor, I could try to figure out whether dynamically loading a language can be supported. Thanks 👍

tommoor commented 4 years ago

> when I tried to save my draft I got an empty page.

That's interesting – sounds like a bug regardless. Code is here:
https://github.com/outline/rich-markdown-editor/blob/master/src/nodes/CodeFence.ts
https://github.com/outline/rich-markdown-editor/blob/master/src/plugins/Prism.ts

tCm3nc commented 4 years ago

Hi Tom - Thanks for the direction, I've been reading around the topic for a bit. I came across the react-refractor project, which describes dynamically loading a language definition. The process uses webpack's code-splitting abilities to import modules dynamically; the documentation on dynamic imports in particular might be more applicable here.

Would this approach be the correct one for this project? Would it complicate things if there were several foo-language.js files scattered about?

For an offline / self-hosted scenario it would gain little as you could save a lot of time and effort by loading all supported languages for refractor. However, I can see the benefit for an online hosted scenario with many different clients.

BrianHung commented 4 years ago

Here's an example of how I managed to get dynamic importing of languages working for TipTap (I used CodeMirror for syntax highlighting though, not Prism). What happens is that we first try to apply syntax highlighting to all codeblocks, keep track of which languages haven't been imported yet, import those languages with promises, and then dispatch a transaction to re-apply syntax highlighting when the promises resolve.
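
A minimal sketch of that flow, not the TipTap code referred to above. All names (`LanguageLoader`, `Grammar`, `Importer`, `whenIdle`) are hypothetical, and the `importer` argument stands in for a bundler-aware dynamic import so the snippet stays self-contained:

```typescript
// Hypothetical stand-ins: a real grammar would come from refractor or Prism.
type Grammar = { name: string };
type Importer = (language: string) => Promise<Grammar>;

class LanguageLoader {
  private loaded = new Map<string, Grammar>();
  private pending = new Map<string, Promise<void>>();
  private failed = new Set<string>();

  constructor(
    private importer: Importer,
    // Called once a grammar arrives, e.g. to dispatch a transaction
    // that re-applies syntax highlighting.
    private onLoaded: (language: string) => void
  ) {}

  // Returns the grammar if it is ready. Otherwise starts at most one import
  // per language and returns undefined, so the caller can render the block
  // unhighlighted for now.
  get(language: string): Grammar | undefined {
    const grammar = this.loaded.get(language);
    if (grammar) return grammar;

    if (!this.pending.has(language) && !this.failed.has(language)) {
      const request = this.importer(language).then(
        loadedGrammar => {
          this.loaded.set(language, loadedGrammar);
          this.pending.delete(language);
          this.onLoaded(language);
        },
        () => {
          // Unknown language or failed download: remember it, don't retry.
          this.pending.delete(language);
          this.failed.add(language);
        }
      );
      this.pending.set(language, request);
    }
    return undefined;
  }

  // Resolves when every import that is currently in flight has settled.
  whenIdle(): Promise<void> {
    return Promise.all(Array.from(this.pending.values())).then(() => undefined);
  }
}
```

In an editor, `importer` would presumably wrap something like the dynamic `import()` of a language file, and `onLoaded` would dispatch the re-highlighting transaction; both are assumptions here rather than part of any existing API.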

tommoor commented 4 years ago

> Would this approach be the correct one for this project?

Yeah, I think, as in Brian's example, we need to use dynamic imports and rely on webpack to do the code splitting.

> For an offline / self-hosted scenario it would gain little as you could save a lot of time and effort by loading all supported languages for refractor.

I don't think it matters if it's self-hosted, the performance gain when loading the editor is the same 😄

manuschillerdev commented 3 years ago

@tommoor this should be fairly simple to achieve :)

  1. remove static imports and refractor.register calls
  2. dynamically import the syntax on language changes

Besides that: I would love to have the option to get rid of the dropdown, and just type three backticks and the language key.

    ```typescript

Would that be possible to match? This way we would support every syntax that Prism supports :)

Diff + Demo:

diff --git a/src/nodes/CodeFence.ts b/src/nodes/CodeFence.ts
index 5574edc..41be4bc 100644
--- a/src/nodes/CodeFence.ts
+++ b/src/nodes/CodeFence.ts
@@ -1,20 +1,4 @@
 import refractor from "refractor/core";
-import bash from "refractor/lang/bash";
-import css from "refractor/lang/css";
-import clike from "refractor/lang/clike";
-import csharp from "refractor/lang/csharp";
-import go from "refractor/lang/go";
-import java from "refractor/lang/java";
-import javascript from "refractor/lang/javascript";
-import json from "refractor/lang/json";
-import markup from "refractor/lang/markup";
-import php from "refractor/lang/php";
-import python from "refractor/lang/python";
-import powershell from "refractor/lang/powershell";
-import ruby from "refractor/lang/ruby";
-import sql from "refractor/lang/sql";
-import typescript from "refractor/lang/typescript";
-import yaml from "refractor/lang/yaml";
 import { setBlockType } from "prosemirror-commands";
 import { textblockTypeInputRule } from "prosemirror-inputrules";
 import copy from "copy-to-clipboard";
@@ -23,25 +7,6 @@ import isInCode from "../queries/isInCode";
 import Node from "./Node";
 import { ToastType } from "../types";

-[
-  bash,
-  css,
-  clike,
-  csharp,
-  go,
-  java,
-  javascript,
-  json,
-  markup,
-  php,
-  python,
-  powershell,
-  ruby,
-  sql,
-  typescript,
-  yaml,
-].forEach(refractor.register);
-
 export default class CodeFence extends Node {
   get languageOptions() {
     return Object.entries(LANGUAGES);
@@ -149,13 +114,22 @@ export default class CodeFence extends Node {
     }
   };

-  handleLanguageChange = event => {
+  handleLanguageChange = async event => {
     const { view } = this.editor;
     const { tr } = view.state;
     const element = event.target;
     const { top, left } = element.getBoundingClientRect();
     const result = view.posAtCoords({ top, left });

+    if (!refractor.registered(element.value)) {
+      try {
+        const syntax = await import(`refractor/lang/${element.value}`);
+        refractor.register(syntax.default);
+      } catch (e) {
+        console.error(`Error while trying to import ${element.value}`);
+      }
+    }
+
     if (result) {
       const transaction = tr.setNodeMarkup(result.inside, undefined, {
         language: element.value,

https://user-images.githubusercontent.com/56154253/115959386-5047b780-a50c-11eb-9d7a-8cbec77ebb61.mov

BrianHung commented 3 years ago

I see two caveats here about working with asynchronous code in ProseMirror:

  1. The this.editor access needs to move after the async call. This avoids a mismatched transaction error, because when dispatching a transaction you want to work with the current view.state. Even if that error isn't hit, there is a worse case: if the import takes a long time and the user moves the codeblock element down (or up) by inserting (or deleting) a paragraph above it, then result.inside will no longer correspond to the codeblock once the async call finishes.

    handleLanguageChange = async event => {
      const element = event.target;

      // Load and register the grammar before touching any editor state
      if (!refractor.registered(element.value)) {
        try {
          const syntax = await import(`refractor/lang/${element.value}`);
          refractor.register(syntax.default);
        } catch (e) {
          console.error(`Error while trying to import ${element.value}`);
        }
      }

      // Read the view and resolve the position only after the await,
      // so both reflect the current document
      const { view } = this.editor;
      const { tr } = view.state;
      const { top, left } = element.getBoundingClientRect();
      const result = view.posAtCoords({ top, left });

      if (result) {
        const transaction = tr.setNodeMarkup(result.inside, undefined, {
          language: element.value,
        });
        view.dispatch(transaction);
      }
    };

  2. If you want the triple backtick inputrule for codeblocks, the handleLanguageChange approach is a bit awkward because you can't asynchronously create a node via toDOM. You could do something hacky: create the codeblock with the javascript default, then use chained promises to call setNodeMarkup, because you can start promises in toDOM (but not await them). A cleaner abstraction though would be to use the plugin system via the existing Prism plugin, and shove all async logic into there. Using the apply method, you can detect when new codeblocks are added and when codeblock languages change.
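
A rough sketch of the detection step such a plugin would need, under loud assumptions: plain objects stand in for ProseMirror nodes (a real plugin would walk the document, for example with `doc.descendants`), the node type name `code_fence` is assumed, and `DocNode`, `codeBlockLanguages` and `languagesToLoad` are hypothetical names.

```typescript
// Plain-object stand-in for a ProseMirror node (assumption for illustration).
interface DocNode {
  type: string;
  attrs?: { language?: string };
  content?: DocNode[];
}

// Collect the language of every code block in the document.
function codeBlockLanguages(
  node: DocNode,
  found: Set<string> = new Set<string>()
): Set<string> {
  if (node.type === "code_fence" && node.attrs && node.attrs.language) {
    found.add(node.attrs.language);
  }
  (node.content || []).forEach(child => codeBlockLanguages(child, found));
  return found;
}

// Languages that appear in the new document, did not appear in the old one,
// and are not registered yet: the ones to import asynchronously.
function languagesToLoad(
  prev: DocNode,
  next: DocNode,
  isRegistered: (language: string) => boolean
): string[] {
  const before = codeBlockLanguages(prev);
  const result: string[] = [];
  codeBlockLanguages(next).forEach(language => {
    if (!before.has(language) && !isRegistered(language)) {
      result.push(language);
    }
  });
  return result;
}
```

The plugin's apply step could call something like `languagesToLoad(oldDoc, newDoc, refractor.registered)`, start the imports, and dispatch a re-highlighting transaction once they resolve, which keeps the async work out of toDOM and handleLanguageChange.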

tommoor commented 3 years ago

I don't see any way that including a variable in the require statement can work, the bundlers need to know what code can be required so that chunks can be created.

Using the autoloader plugin might be the way to go, with that setup you'd have an additional editor prop to tell the frontend where to load language files from: https://prismjs.com/plugins/autoloader

However, backwards compatibility would be tricky without it, and it wouldn't be realistic to make that sort of prop a requirement 🤔

manuschillerdev commented 3 years ago

> I don't see any way that including a variable in the require statement can work, the bundlers need to know what code can be required so that chunks can be created.

Since it's a dynamic import statement and not a require, variables in the path do work:

I don't know much about solving the other implications that @BrianHung mentioned, since I don't have much experience with ProseMirror yet.

tommoor commented 3 years ago

That's great. @BrianHung's concerns with your spike are spot on but pretty easily resolved.