[Feature] Support more languages for code block syntax highlighting

nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

https://nomic.ai/gpt4all

MIT License


[Feature] Support more languages for code block syntax highlighting #2191

Open trevorstr opened 7 months ago

trevorstr commented 7 months ago

Bug Report

I am running GPT4All on macOS Sonoma 14.4.1. I asked the Mistral Instruct model to generate a sample PowerShell function. Each of the code blocks is prefixed with "perl" for some odd reason.

Here's another example of the bug:

Steps to Reproduce

Install GPT4All
Download and select the Mistral Instruct model
Prompt with "Show me an example of a PowerShell function"

Expected Behavior

Code blocks are printed out, without the "perl" text as a prefix.

Your Environment

GPT4All version: v2.7.3
Operating System: macOS Sonoma 14.4.1
Chat model used (if applicable): Mistral Instruct

cosmic-snow commented 7 months ago

The 'perl' might just be part of the Markdown. GitHub supports that, too, so let me demonstrate:

````
```perl
print "Hello World\n";
```
````

is displayed as:

```perl
print "Hello World\n";
```

Note the colours, that's basically all it does.

So maybe the output of the model is triple backticks followed by 'perl' and the box doesn't know how to format that. The problem here is of course that it should be 'posh' or 'powershell' or something. So basically, I think the model is misbehaving.

Can you start fresh and regenerate the response a few times? Does it always happen?

Edit: Looks like neither PowerShell nor Perl has a syntax highlighter yet. See here:

https://github.com/nomic-ai/gpt4all/blob/9c23d44ad32ee145300ca7a8f9711ace760523cb/gpt4all-chat/responsetext.cpp#L11-L24
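To make the idea above concrete, here is a minimal sketch of how a renderer could read the language tag after the opening backticks and decide whether it has a highlighter for it. It is written in Python for brevity and is not GPT4All's actual C++ code: the `SUPPORTED` set, the `ALIASES` table, and the `extract_code_blocks` function are all assumptions made up for illustration.

```python
import re

# Hypothetical set of languages the chat view can highlight; the real list
# lives in responsetext.cpp and may differ.
SUPPORTED = {"python", "cpp", "bash", "java", "go", "json"}

# Hypothetical aliases a model might emit for the same language.
ALIASES = {"py": "python", "c++": "cpp", "sh": "bash",
           "posh": "powershell", "ps1": "powershell"}

TICKS = "`" * 3  # the Markdown fence marker
FENCE = re.compile(TICKS + r"([^\n`]*)\n(.*?)" + TICKS, re.DOTALL)


def extract_code_blocks(markdown):
    """Return (language, code, highlightable) for each fenced block."""
    blocks = []
    for match in FENCE.finditer(markdown):
        tag = match.group(1).strip().lower()
        language = ALIASES.get(tag, tag)  # normalise known aliases
        code = match.group(2)
        blocks.append((language, code, language in SUPPORTED))
    return blocks
```

With something like this, a renderer could use the third field to fall back to a plain monospace box when no highlighter exists, instead of printing the tag as if it were part of the code.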

trevorstr commented 7 months ago

Yeah I'm familiar with the syntax highlighting capability in GitHub, and similar platforms. I kinda figured the issue might be related to that. 😆

I re-opened the app and tried a couple more examples.

At least this time, the language is correct, but it's still not rendering it properly.

The same problem exists with Rust code blocks, too.

cosmic-snow commented 7 months ago

Right. So the problem you brought up first (the mismatch) was just the model giving bad responses.

As you can see in the edit I made, and if you follow the link to responsetext.cpp, there aren't any syntax highlighters for these three languages yet.

Maybe you want to turn this issue into a feature request for PowerShell, Rust and possibly Perl?
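For anyone picking up such a feature request: one common way to structure a highlighter is an ordered list of regex rules per language. The sketch below illustrates that approach in Python for a handful of PowerShell tokens. The rule names, patterns, and the `highlight_line` function are hypothetical and not taken from GPT4All, and a real rule set would need to be far more complete (for example, this one would treat a `#` inside a string as the start of a comment).

```python
import re

# Hypothetical token rules for PowerShell; patterns and class names are
# illustrative only.
POWERSHELL_RULES = [
    ("comment", re.compile(r"#.*")),
    ("string", re.compile(r'"[^"]*"|\'[^\']*\'')),
    ("variable", re.compile(r"\$\w+")),
    ("keyword", re.compile(r"\b(?:function|param|if|else|foreach|return)\b",
                           re.IGNORECASE)),
]


def highlight_line(line, rules):
    """Return sorted (start, end, token_class) spans; earlier rules win on overlap."""
    spans = []
    for token_class, pattern in rules:
        for m in pattern.finditer(line):
            # Keep a match only if it does not overlap an already claimed span.
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), token_class))
    return sorted(spans)
```

A renderer would then apply a colour per token class to each span and leave unmatched text in the default style.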