highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

Add operator detecting for c/cpp #3964

Closed TOMWT-qwq closed 1 month ago

TOMWT-qwq commented 5 months ago

Changes

joshgoebel commented 5 months ago

When you say symbol do you mean operator? If so we'd do this by first putting all the operators (including multi-character ones like !=) into an array then using our own regex.either function to build the actual regex.
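A minimal sketch of that approach, assuming stand-in helpers: `escapeRegex` and `either` below are hypothetical local functions written for illustration, not the library's internal helpers, and the operator list is an illustrative subset rather than the one used in this PR. Sorting longest-first ensures that `!=` is tried before `!`.

```javascript
// Hypothetical stand-ins for the library's regex helpers; the real ones may differ.
function escapeRegex(text) {
  // Escape characters that have special meaning in a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function either(...alternatives) {
  // Join alternatives into one non-capturing alternation group.
  return "(?:" + alternatives.join("|") + ")";
}

// Illustrative subset of C/C++ operators (an assumption, not the PR's list).
const OPERATORS = [
  "!=", "==", "<=", ">=", "->", "<<", ">>", "&&", "||",
  "+", "-", "*", "/", "=", "<", ">", "!"
];

// Longest operators first, so multi-character ones win over their prefixes.
const sortedOperators = [...OPERATORS].sort((a, b) => b.length - a.length);
const OPERATOR_RE = new RegExp(either(...sortedOperators.map(escapeRegex)), "g");

// Return every operator found in a snippet of code, in order of appearance.
function findOperators(code) {
  return code.match(OPERATOR_RE) || [];
}
```

For example, `findOperators("a != b")` yields a single `!=` match rather than `!` followed by `=`.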

github-actions[bot] commented 5 months ago

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

7 files changed

Total change +573 B

| file | base | pr | diff |
| --- | --- | --- | --- |
| es/core.min.js | 8.17 KB | 8.17 KB | +2 B |
| es/highlight.min.js | 8.17 KB | 8.17 KB | +2 B |
| es/languages/arduino.min.js | 4.61 KB | 4.75 KB | +144 B |
| es/languages/cpp.min.js | 2.59 KB | 2.73 KB | +140 B |
| highlight.min.js | 8.2 KB | 8.21 KB | +2 B |
| languages/arduino.min.js | 4.61 KB | 4.76 KB | +144 B |
| languages/cpp.min.js | 2.6 KB | 2.73 KB | +139 B |

joshgoebel commented 5 months ago

I cleaned up the PR a bit to better match our coding standards, but I think the reason this hasn't been done before is the issue with <: sometimes it's an operator, sometimes it's the start of a template, so that case would need to be handled properly. (Or maybe this will prove to be simpler in C++, since we only use it inside FUNCTION_TYPE_RE.) Not sure.

Right now there are a ton of failing markup tests. You'll need to go through them and see which are operator related (which will simply need to be updated) and which are template related.

There is a line (you can uncomment) in the test/markup test code that will just "replace" the test result files with whatever we generate - it can be helpful for updating result files but if you do this please proof all the changes by hand to guarantee they are as expected.

TOMWT-qwq commented 5 months ago

Nothing is matched according to the tests. Is regex.escape(x) working properly? I can't find its definition in the package I downloaded here.

joshgoebel commented 5 months ago

So sorry, I forgot to push the main file that does the linking. If you pull and build, it should work now.

github-actions[bot] commented 5 months ago

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

7 files changed

Total change +731 B

| file | base | pr | diff |
| --- | --- | --- | --- |
| es/core.min.js | 8.17 KB | 8.22 KB | +47 B |
| es/highlight.min.js | 8.17 KB | 8.22 KB | +47 B |
| es/languages/arduino.min.js | 4.61 KB | 4.76 KB | +151 B |
| es/languages/cpp.min.js | 2.59 KB | 2.73 KB | +146 B |
| highlight.min.js | 8.2 KB | 8.25 KB | +45 B |
| languages/arduino.min.js | 4.61 KB | 4.76 KB | +150 B |
| languages/cpp.min.js | 2.6 KB | 2.74 KB | +145 B |

TOMWT-qwq commented 5 months ago

ok I'll deal with those tests

tomwt-awa commented 5 months ago

Those tests are thorough, and I've found some bugs :)

Some of the failed tests are related to Arduino.

github-actions[bot] commented 5 months ago

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

7 files changed

Total change +748 B

| file | base | pr | diff |
| --- | --- | --- | --- |
| es/core.min.js | 8.17 KB | 8.22 KB | +47 B |
| es/highlight.min.js | 8.17 KB | 8.22 KB | +47 B |
| es/languages/arduino.min.js | 4.61 KB | 4.76 KB | +154 B |
| es/languages/cpp.min.js | 2.59 KB | 2.74 KB | +150 B |
| highlight.min.js | 8.2 KB | 8.25 KB | +47 B |
| languages/arduino.min.js | 4.61 KB | 4.77 KB | +154 B |
| languages/cpp.min.js | 2.6 KB | 2.74 KB | +149 B |

tomwt-awa commented 5 months ago

I set its relevance to 0 to keep auto-detection working.
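As a rough illustration of that change: `scope`, `match`, and `relevance` are standard highlight.js mode attributes, and a relevance of 0 means matches of the mode add nothing to the language's auto-detection score. The mode name and the simplified pattern below are assumptions for the sketch, not the code from this PR.

```javascript
// Hypothetical operator mode; the pattern is a simplified stand-in.
const OPERATOR_MODE = {
  scope: "operator",
  // Multi-character operators are listed before the single-character class.
  match: /!=|==|->|[-+*\/=<>!]/,
  // Operators appear in almost every language, so they should not
  // influence which language auto-detection picks.
  relevance: 0
};
```

Without this, code full of common operators could inflate the C/C++ score for snippets written in other languages.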

Finally, I've dealt with most of them, except the template <> cases.

TOMWT-qwq commented 5 months ago

Is -> an operator?

tomwt-awa commented 5 months ago

I don't think it's a good idea to deal with these two cases in this PR. They are not related to operators.

Anyway, let's finish this stage of my work. I'll deal with them in two separate PRs.

joshgoebel commented 5 months ago

@allejo Any thoughts on whether it might be OK to just accept that <> as part of templates are going to be highlighted as operators (because it's too hard to distinguish) - for the increased fidelity of getting operators highlighted in general? I've long resisted this, but maybe in cases where it wouldn't break highlighting in general we should just embrace it?

> I'll deal with them in another two PRs.

Let's see what allejo says. My concern is that the template stuff may be very hard or even impossible to fix, so to accept this PR (separately) we'd have to be willing to accept that risk. So we need to decide whether we're open to that type of brokenness long-term, or whether we even want to still consider it broken.

github-actions[bot] commented 5 months ago

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

9 files changed

Total change +1.03 KB

| file | base | pr | diff |
| --- | --- | --- | --- |
| es/core.min.js | 8.17 KB | 8.22 KB | +43 B |
| es/highlight.min.js | 8.17 KB | 8.22 KB | +43 B |
| es/languages/arduino.min.js | 4.61 KB | 4.76 KB | +158 B |
| es/languages/c.min.js | 1.86 KB | 2 KB | +140 B |
| es/languages/cpp.min.js | 2.59 KB | 2.74 KB | +154 B |
| highlight.min.js | 8.21 KB | 8.25 KB | +43 B |
| languages/arduino.min.js | 4.61 KB | 4.77 KB | +157 B |
| languages/c.min.js | 1.87 KB | 2.01 KB | +139 B |
| languages/cpp.min.js | 2.6 KB | 2.75 KB | +153 B |

github-actions[bot] commented 5 months ago

Build Size Report

Changes to minified artifacts in /build, after gzip compression.

9 files changed

Total change +1.04 KB

| file | base | pr | diff |
| --- | --- | --- | --- |
| es/core.min.js | 8.17 KB | 8.22 KB | +43 B |
| es/highlight.min.js | 8.17 KB | 8.22 KB | +43 B |
| es/languages/arduino.min.js | 4.61 KB | 4.77 KB | +160 B |
| es/languages/c.min.js | 1.86 KB | 2.01 KB | +144 B |
| es/languages/cpp.min.js | 2.59 KB | 2.75 KB | +156 B |
| highlight.min.js | 8.21 KB | 8.25 KB | +41 B |
| languages/arduino.min.js | 4.61 KB | 4.77 KB | +159 B |
| languages/c.min.js | 1.87 KB | 2.01 KB | +143 B |
| languages/cpp.min.js | 2.6 KB | 2.75 KB | +154 B |

tomwt-awa commented 5 months ago

By the way, what are we going to do next?

Any suggestion?

TOMWT-qwq commented 5 months ago

You know, a powerful part of hljs is contains: ["self"]. A regex, however, is not that powerful. I've tried to do something with it, but it doesn't work as expected. I don't think we can recognize everything as precisely as a compiler does right now.
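
To make the contrast concrete, here is a rough sketch. `begin`, `end`, and `contains: ["self"]` are standard highlight.js mode attributes, but the `TEMPLATE_ARGS` mode and the `findClosingAngle` helper are hypothetical and only illustrate the idea: nesting needs a depth counter (or recursion), which a single regular expression cannot express, and even with nesting handled, a comparison can still be mistaken for a template.

```javascript
// Hypothetical recursive mode: it may contain itself, so nested <...> pairs work.
const TEMPLATE_ARGS = { begin: /</, end: />/, contains: ["self"] };

// Illustration only: track nesting depth to find the ">" that closes
// the "<" at openIndex. Returns -1 when no closing bracket exists.
function findClosingAngle(code, openIndex) {
  let depth = 0;
  for (let i = openIndex; i < code.length; i++) {
    if (code[i] === "<") {
      depth++;
    } else if (code[i] === ">") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

// Nested template arguments are paired correctly.
findClosingAngle("vector<map<int, int>> v", 6); // 20

// A lone comparison has no closing bracket.
findClosingAngle("a < b", 2); // -1

// Two comparisons are wrongly paired as if they were a template.
findClosingAngle("a < b && c > d", 2); // 11
```

The last call shows why bracket matching alone is not enough: telling a template from a pair of comparisons requires knowing whether the preceding name is a template, which is the kind of information a compiler has and a highlighter does not.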