PrismJS / prism

Lightweight, robust, elegant syntax highlighting.
https://prismjs.com
MIT License

Support auto detecting language #1313

Closed paladox closed 5 years ago

paladox commented 6 years ago

Hi, could support for auto detecting the language be added please?

Also, would it be possible to support this kind of call, where the language is passed by name?

var html = Prism.highlight(code, language);

(language as in a MIME type, php, etc.)

This is like the highlight.js usage. As they are now unmaintained, Prism is a better alternative, but it needs support for Prism.highlight(code, language); so that it auto-loads the language it needs.

uhafner commented 6 years ago

Is there at least a mapping of file extensions to class names available? I would like to use it in an application where the actual visualized file is not known in advance.

Golmote commented 6 years ago

@uhafner Take a look at the File Highlight plugin.

uhafner commented 6 years ago

Thanks, that did the trick!
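
For readers who want a standalone lookup, here is a minimal sketch of the kind of extension-to-class mapping asked about above. The table is a small hand-picked subset for illustration, and `languageClassForFile` is a hypothetical helper, not part of Prism.

```javascript
// Hypothetical subset of an "extension -> Prism language id" table.
const EXTENSION_TO_LANGUAGE = {
  js: "javascript",
  py: "python",
  rb: "ruby",
  sh: "bash",
  h: "c",
  html: "markup",
  md: "markdown",
};

// Returns a class name such as "language-python", or null when the
// file name has no usable or known extension.
function languageClassForFile(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0 || dot === filename.length - 1) return null;
  const ext = filename.slice(dot + 1).toLowerCase();
  // hasOwnProperty keeps inherited keys such as "constructor" from matching.
  if (!Object.prototype.hasOwnProperty.call(EXTENSION_TO_LANGUAGE, ext)) {
    return null;
  }
  return "language-" + EXTENSION_TO_LANGUAGE[ext];
}
```

The returned string can be set as the class of a code element so that Prism picks the matching grammar; a null result leaves the choice to the caller.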

HarryCaveMan commented 6 years ago

Is there somewhere to view the dependency graph of the language packs? A simple way to get auto-detection support for all languages (without importing them all) is to always import the languages that are high in the dependency tree (e.g. clike, c, java, html, javascript), then load the dependent language packs dynamically from the input (e.g. import('prismjs/components/prism-'+inputLang)). However, aside from reading/testing them all or seeing the dependency graph, it is hard to determine which language packs to load initially. This may not be a good solution if there are lots of nested dependencies, because you'd still end up importing a lot of stuff for every instance. I will attempt to fork and do some testing in a few weeks if no one else has time.

Here is an example react-markdown plugin that (kind of) adds support automatically (works for at least 20 or so langs) and works with real-time previews. It defaults to JavaScript if no support is found or if deps are missing:

import React from "react";
import Prism from "prismjs/components/prism-core";
//other languages depend on these
import "prismjs/components/prism-clike";
import "prismjs/components/prism-c";
import "prismjs/components/prism-java";
//html is provided by the markup component
import "prismjs/components/prism-markup";
//include javascript as default fallback
import "prismjs/components/prism-javascript";

let CodeBlock = {
  Block (props){
    let html;
    let cls;
    //console.log(props.value)
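    try{
      //try to load prism component for language
      //import() is asynchronous, so the component is only usable on a later render
      import("prismjs/components/prism-"+props.language)
        .catch(er => console.log(er.message));
      //throws if the grammar is not loaded yet, which triggers the fallback below
      html = Prism.highlight(props.value ||"...", Prism.languages[props.language], props.language);
      cls = `language-${props.language}`;
    }
    catch(er){
      //if the grammar is not available (yet), fall back to javascript
      console.log(er.message+": \""+props.language+"\"");
      html = Prism.highlight(props.value||"...", Prism.languages["js"], "js");
      cls = "language-js";
    }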
    return (
      <pre className={cls}>
        <code
          dangerouslySetInnerHTML={{__html: html}}
          className={cls}
        />
      </pre>
    );
  },
  InLine(props) {
    //inline code is not highlighted; render it as a child so React escapes it
    return (
      <code className="language-js">
        {props.value}
      </code>
    );
  }
};

export default CodeBlock;
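
The dependency question above can be illustrated with a short sketch that resolves a load order from a "requires" table. The table below is a small subset written out by hand for illustration (the authoritative data is Prism's own component metadata), and `loadOrder` is a hypothetical helper.

```javascript
// Illustrative subset of "language -> languages it requires".
// This table is an assumption; check it against Prism's component metadata.
const REQUIRES = {
  clike: [],
  markup: [],
  javascript: ["clike"],
  c: ["clike"],
  cpp: ["c"],
  java: ["clike"],
  typescript: ["javascript"],
  jsx: ["markup", "javascript"],
};

// Returns the languages to load for one requested language, dependencies
// first. Throws on unknown ids or circular requirements.
function loadOrder(lang, requires = REQUIRES) {
  const order = [];
  const state = new Map(); // id -> "visiting" | "done"

  function visit(id) {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") {
      throw new Error("Circular dependency at " + id);
    }
    if (!Object.prototype.hasOwnProperty.call(requires, id)) {
      throw new Error("Unknown language: " + id);
    }
    state.set(id, "visiting");
    for (const dep of requires[id]) visit(dep);
    state.set(id, "done");
    order.push(id);
  }

  visit(lang);
  return order;
}
```

With a table like this, only the components in the returned list need to be imported for a given input language, in that order, instead of preloading every base language.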

Golmote commented 6 years ago

I don't get how this would solve the issue of auto-detection. AFAIK highlight.js does it by testing the code (or part of it?) with every language. Each test returns a relevance score, based on specific characteristics of the language (mainly keywords, but also special syntaxes). Sometimes a test can return early with a relevance of 0 if it detects an invalid syntax.

This is a smart approach, but it still requires testing every language, which is something I'm not happy about. Yet, Prism could probably support this by adding these concepts (relevance and invalid syntaxes).
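
To make the relevance idea concrete, here is a toy sketch of the scheme described above: every language is tested, keyword hits add to the score, and an "illegal" pattern short-circuits to 0. The profiles, `relevance`, and `detectLanguage` are all hypothetical and far smaller than anything a real highlighter would use; this is not highlight.js's actual implementation.

```javascript
// Hypothetical, deliberately tiny language profiles.
const PROFILES = {
  python: {
    keywords: ["def", "import", "elif", "lambda", "None", "self"],
    // Crude heuristic: a line ending in ";" or "{" is taken to rule Python
    // out, even though a dict literal can legitimately end a line with "{".
    illegal: /;\s*$|\{\s*$/m,
  },
  javascript: {
    keywords: ["function", "const", "let", "var", "return", "null"],
    // A Python-style "def name(...):" line is taken to rule JavaScript out.
    illegal: /^\s*def\s+\w+\s*\(.*\)\s*:/m,
  },
};

// Score one language: 0 on illegal syntax, otherwise the keyword hit count.
function relevance(code, profile) {
  if (profile.illegal.test(code)) return 0; // early exit
  const words = code.match(/[A-Za-z_]\w*/g) || [];
  const keywords = new Set(profile.keywords);
  let score = 0;
  for (const word of words) {
    if (keywords.has(word)) score += 1;
  }
  return score;
}

// Test every profile and keep the best score; language stays null when
// nothing scores above 0.
function detectLanguage(code, profiles = PROFILES) {
  let best = { language: null, relevance: 0 };
  for (const [language, profile] of Object.entries(profiles)) {
    const score = relevance(code, profile);
    if (score > best.relevance) best = { language, relevance: score };
  }
  return best;
}
```

The loop over every profile is exactly the cost raised above: detection time grows with the number of languages, and each language definition needs its own keywords and illegal patterns.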

HarryCaveMan commented 6 years ago

Training some sort of statistical classifier on a set of samples could be a viable option. Something like this, which can train a model from samples and then just load the trained model to detect.
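
As a rough illustration of the classifier idea, the sketch below trains a tiny naive Bayes model on token counts and then classifies new snippets. It is a swapped-in textbook technique with hypothetical helper names (`tokenize`, `train`, `classify`), not the tool referred to above, and a usable model would need many more samples per language.

```javascript
// Split source text into identifier-like words and single punctuation marks.
function tokenize(code) {
  return code.match(/[A-Za-z_]\w*|[^\s\w]/g) || [];
}

// Train on { language: [sample, ...] } and return per-language token counts.
function train(samplesByLanguage) {
  const vocabulary = new Set();
  const languages = new Map(); // language -> { counts: Map, total: number }
  for (const [language, samples] of Object.entries(samplesByLanguage)) {
    const counts = new Map();
    let total = 0;
    for (const sample of samples) {
      for (const token of tokenize(sample)) {
        counts.set(token, (counts.get(token) || 0) + 1);
        vocabulary.add(token);
        total += 1;
      }
    }
    languages.set(language, { counts, total });
  }
  return { vocabularySize: vocabulary.size, languages };
}

// Pick the language with the highest log-likelihood under add-one smoothing.
// Class priors are treated as uniform to keep the sketch short.
function classify(model, code) {
  const tokens = tokenize(code);
  let best = null;
  let bestScore = -Infinity;
  for (const [language, { counts, total }] of model.languages) {
    let score = 0;
    for (const token of tokens) {
      const count = counts.get(token) || 0;
      score += Math.log((count + 1) / (total + model.vocabularySize));
    }
    if (score > bestScore) {
      bestScore = score;
      best = language;
    }
  }
  return best;
}
```

Training happens once, offline; at detection time only the counts are needed, so the cost per snippet is one pass over its tokens for each language in the model.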

izelnakri commented 6 years ago

Any updates on this? This would really enhance the developer experience if the code block doesn't have a language class.

mAAdhaTTah commented 6 years ago

Honestly, I think this is unlikely to be supported / implemented by the core team, but we'd accept a PR for a plugin.

asbjornu commented 5 years ago

highlight.js supports language auto-detection. Perhaps some inspiration can be drawn from there?

RunDevelopment commented 5 years ago

As @Golmote said before: Highlight.js' approach, while quite inefficient, might be usable.

But to support illegal tokens and relevance, we would probably have to adjust every language definition.

mAAdhaTTah commented 5 years ago

Not gonna lie, 4 downvotes from people who want others to do work for things they want is frustrating. As stated, if this is desired, we will happily accept a PR, but we will not be implementing this.

andrewjmead commented 2 years ago

If you're interested, you can use the auto-detection from highlight.js but then use the syntax highlighting from Prism. This loads highlight.js and runs its highlighter just to generate the correct language-xxxx class, which Prism can then pick up on.

<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/11.5.1/highlight.min.js"></script>
<script>
    document.addEventListener("DOMContentLoaded", function () {
        hljs.highlightAll();
    })
</script>

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/themes/prism.min.css" integrity="sha512-tN7Ec6zAFaVSG3TpNAKtk4DOHNpSwKHxxrsiw4GHKESGPs5njn/0sMCUMl2svV4wo4BK/rCP7juYz+zx+l6oeQ==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/line-numbers/prism-line-numbers.min.css" integrity="sha512-cbQXwDFK7lj2Fqfkuxbo5iD1dSbLlJGXGpfTDqbggqjHJeyzx88I3rfwjS38WJag/ihH7lzuGlGHpDBymLirZQ==" crossorigin="anonymous" referrerpolicy="no-referrer" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/components/prism-core.min.js" integrity="sha512-9khQRAUBYEJDCDVP2yw3LRUQvjJ0Pjx0EShmaQjcHa6AXiOv6qHQu9lCAIR8O+/D8FtaCoJ2c0Tf9Xo7hYH01Q==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/autoloader/prism-autoloader.min.js" integrity="sha512-fTl/qcO1VgvKtOMApX2PdZzkziyr2stM65GYPLGuYMnuMm1z2JLJG6XVU7C/mR+E7xBUqCivykuhlzfqxXBXbg==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/prism/1.28.0/plugins/line-numbers/prism-line-numbers.min.js" integrity="sha512-BttltKXFyWnGZQcRWj6osIg7lbizJchuAMotOkdLxHxwt/Hyo+cl47bZU0QADg+Qt5DJwni3SbYGXeGMB5cBcw==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>

Wrote about it here: https://mead.io/2022/06/09/wordpress-post-syntax-highlighting-with-highlightjs-and-prism/
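
The same idea can be sketched in script form: ask highlight.js only for the detected language, then hand the code to Prism. `detectThenHighlight` and the `HLJS_TO_PRISM` table are hypothetical; the sketch assumes `hljs.highlightAuto(code)` returns an object with a `language` field and that `Prism.highlight(text, grammar, language)` is available, with both libraries passed in as arguments rather than assumed to be globals.

```javascript
// Illustrative only: highlight.js and Prism do not always share language ids.
// Verify any entry against both libraries before relying on it.
const HLJS_TO_PRISM = { dos: "batch" };

// `hljs` and `Prism` are parameters so the sketch does not assume how they
// are loaded. `fallback` is the Prism language used when detection fails.
function detectThenHighlight(hljs, Prism, code, fallback = "javascript") {
  const detected = hljs.highlightAuto(code).language; // may be undefined
  const mapped = HLJS_TO_PRISM[detected] || detected;
  // Fall back when nothing was detected or the Prism grammar is not loaded.
  const language = mapped && Prism.languages[mapped] ? mapped : fallback;
  const grammar = Prism.languages[language];
  if (!grammar) throw new Error("Prism grammar not loaded: " + language);
  return {
    language,
    className: "language-" + language,
    html: Prism.highlight(code, grammar, language),
  };
}
```

This keeps highlight.js markup out of the page, but detection still runs the code against many highlight.js grammars, so it does not remove the runtime cost of loading and running both libraries.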

abulka commented 1 year ago

A tip to those using the code above: ensure the two script tags involving the highlight.js library precede the prism link and script tags, just as the example shows, otherwise prism doesn't run at the correct time.

Whilst the technique works, I'm not that impressed with the highlight.js library's detection of languages.

I'm building a file preview tool, and I wish there were some way of passing the file extension, which would be a massive hint as to which language to switch to.

Nantris commented 2 months ago

@andrewjmead that sure seems expensive on multiple levels. I don't understand why Prism can't implement a language guesser like highlight.js does.

masylum commented 3 weeks ago

I'm using https://github.com/teknologi-umum/flourite