highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

enh(gcode): rewrote language for modern gcode support #4040

Open barthy-koeln opened 2 months ago

barthy-koeln commented 2 months ago

Complete rework of the gcode language to allow for extended use cases. Not every scope rule will apply to every implementation of gcode, but many applications have added a lot on top of the original spec.

This language implementation aims to be more extensive but still flexible.

My research drew on the following documentation:

And countless code examples extracted from GitHub's search:

Changes

Question about code

I have used this pattern to re-use complex regexes:

const NUMBER = /[+-]?((\.\d+)|(\d+)(\.\d*)?)/;
const match = new RegExp(`(?<![A-Z])[FHIJKPQRS]\\s*${NUMBER.source}`)

I would like to use this for certain duplicated parts (e.g. (?<![A-Z])) as well. Is this acceptable? The performance impact is surely negligible since this only happens during initialisation. It greatly increases readability and IDE support in my opinion.
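A minimal sketch of how the duplicated parts could be shared, assuming a small helper (called `concatRegex` here, a hypothetical name) that joins strings and RegExp objects by their `.source`. The `NOT_AFTER_LETTER` constant is also just an illustrative name. highlight.js appears to expose a similar `regex.concat` utility to language definitions, which may be preferable to a hand-rolled helper.

```javascript
// Shared fragments, written once as RegExp literals so the IDE can check them.
const NUMBER = /[+-]?((\.\d+)|(\d+)(\.\d*)?)/;
const NOT_AFTER_LETTER = /(?<![A-Z])/; // illustrative name, not from the PR

// Hypothetical helper: joins strings and RegExp objects into one RegExp.
// Flags of the individual parts are ignored; only their sources are combined.
function concatRegex(...parts) {
  const source = parts
    .map((part) => (part instanceof RegExp ? part.source : part))
    .join('');
  return new RegExp(source);
}

// Same pattern as above, but built from reusable pieces.
const PARAM = concatRegex(NOT_AFTER_LETTER, /[FHIJKPQRS]\s*/, NUMBER);

console.log(PARAM.test('G1 X10 F300')); // true: "F300" follows a space
console.log(PARAM.test('IF300'));       // false: "F" directly follows a letter
```

Since everything is assembled once when the language is defined, the extra work stays out of the highlighting hot path.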

Screenshots

| Name | Before | After |
| --- | --- | --- |
| Default | default_before | default_after |
| Extended | extended_before | extended_after |

Checklist

barthy-koeln commented 2 months ago

@joshgoebel thank you for your patience.

My latest commit adds disableAutodetect: true to the language, along with the aforementioned on:begin filter. I added the filter as an additional variant alongside the full matches that start with \b.

This means that readable gcode with sane spacing will rarely, if ever, run into the callback filter.
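The two-variant idea can be sketched roughly as follows. This is not the code from the commit: the scope name, the letter set and the helper name `notAfterLetter` are assumptions for illustration. It relies on the `on:begin` callback receiving the regex match plus a response object with `ignoreMatch()`, and it assumes the callback may be attached to an individual variant.

```javascript
const NUMBER = /[+-]?((\.\d+)|(\d+)(\.\d*)?)/;

// Hypothetical callback: emulates the lookbehind (?<![A-Z]) by inspecting
// the character right before the match and rejecting it if it is a letter.
function notAfterLetter(match, response) {
  const previous = match.index > 0 ? match.input[match.index - 1] : '';
  if (/[A-Z]/.test(previous)) {
    response.ignoreMatch();
  }
}

// Illustrative mode definition (scope name and letters are assumptions).
const PARAMETER = {
  scope: 'params',
  variants: [
    // Well-spaced gcode: the word boundary is enough, no callback runs.
    { begin: new RegExp(`\\b[FHIJKPQRS]\\s*${NUMBER.source}`) },
    // Spaceless gcode such as "G1X10F300": no \b, so filter in the callback.
    {
      begin: new RegExp(`[FHIJKPQRS]\\s*${NUMBER.source}`),
      'on:begin': notAfterLetter
    }
  ]
};
```

In "G1 X10 F300" the first variant matches "F300" directly. In "G1X10F300" there is no word boundary before "F", so only the second variant matches and the callback lets it through because the preceding character is a digit.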


I've tested both implementations with the existing markup test, as well as roughly 100 lines of gcode without spaces. The results are always within 5% of each other, and always in favor of the v12 language with lookbehinds.

Running Benchmark & Results

#### Shell commands to download v12 and install [benny](https://github.com/caderek/benny)

```shell
wget https://raw.githubusercontent.com/barthy-koeln/highlight.js/2c55db96e8a0523fdbdd6d4069c7007f75d5288b/src/languages/gcode.js -O src/languages/gcode-v12.js
npm run build gcode gcode-v12
cd build
npm install benny
# create benchmark script here e.g. bench.mjs
node bench.mjs
```

#### Benchmark script

```js
import b from 'benny'
import fs from 'fs'
import hljs from 'highlight.js'

const code = fs.readFileSync(import.meta.dirname + '/../test/markup/gcode/extended.txt').toString('utf-8')

b.suite(
  'gcode bench',

  b.add('v11', () => {
    hljs.highlight(code, { language: 'gcode' })
  }),

  b.add('v12', () => {
    hljs.highlight(code, { language: 'gcode-v12' })
  }),

  b.cycle(),
  b.complete(),
  b.save({ file: 'gcode', version: '1.0.0' }),
  b.save({ file: 'gcode', format: 'chart.html' })
)
```

#### Results

![gcode-bench](https://github.com/highlightjs/highlight.js/assets/10552683/ffa219d3-eded-4d15-97f9-322af2684f1e)

barthy-koeln commented 1 month ago

Hey @joshgoebel, this is just a gentle ping to ask whether I can or should do anything else here. I need this fix for a freelance project, but I can use my fork for the initial release.

Edit: I seriously understand everyone's time is precious and this is free open-source work, so I'll accept any answer including "not now, will revisit later" :D