TelegramMessenger / libprisma

Code highlight tokenizer written in C++
MIT License

[question] Why is the BSL language not supported? #3

Open nixel2007 opened 10 months ago

nixel2007 commented 10 months ago

Hello! I've noticed that the BSL language is excluded. Could you explain why? Does it use some feature that is not supported in libprisma? We could try to adjust the grammar to make it work with your lib.

Thanks in advance!

exclued commented 10 months ago

It seems to have something to do with unsupported UTF-16 characters in the language grammar.

From generate.js:

    function sanitize(pattern) {
        // Unsupported:
        // UTF-16 ranges

Compare this with BSL definition in https://github.com/PrismJS/prism/blob/master/components/prism-bsl.js

nixel2007 commented 10 months ago

Thanks, I'll take a look. In most cases these UTF-16 escape sequences can be simplified to [а-яё], whose characters can be written directly in UTF-8.
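The simplification mentioned above can be sketched as follows. This is a minimal illustration, assuming the escapes in question are plain BMP code points such as \u0430; `unescapeBmp` is a hypothetical helper and is not part of libprisma, generate.js, or Prism. It does not handle an escaped backslash that precedes `u`, or code points that are regex metacharacters.

```javascript
// Hypothetical helper (not from libprisma or Prism): rewrites \uXXXX escapes
// in a regex source string as literal characters.
function unescapeBmp(source) {
  return source.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) => {
    const code = parseInt(hex, 16);
    // Leave surrogate code units (U+D800..U+DFFF) as they are:
    // on their own they do not encode a character.
    if (code >= 0xd800 && code <= 0xdfff) return match;
    return String.fromCharCode(code);
  });
}

// Regex source with escapes, as it might appear in a grammar file.
const escaped = '[\\u0430-\\u044f\\u0451]';
const literal = unescapeBmp(escaped);
console.log(literal); // [а-яё]
```

Both forms describe the same character class in JavaScript; whether libprisma accepts the literal form is exactly the open question in this thread.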

nixel2007 commented 10 months ago

Prism.js v1 is not merging any PRs these days. Would you accept a .patch file for the BSL grammar, plus additions to generate.js and the GitHub workflow to apply the patch in place?

nixel2007 commented 9 months ago

Hello, team! I'm looking into the recently introduced way of including the BSL language in libprisma. I'd like to clarify what "UTF-16 ranges" in sanitize means. Do you not support \uXXXX sequences at all, or only some specific range (\uD800+, for example)? If \uXXXX sequences are not supported at all, is there any way to add Cyrillic letters to a regex? Will a regular [а-яё] work?

/cc @FrayxRulez
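To make the distinction in the question concrete, here is a rough sketch that separates ordinary \uXXXX escapes from those in the surrogate range (\uD800 to \uDFFF). `classifyEscapes` is a hypothetical diagnostic written for this illustration; it is not taken from generate.js and says nothing about what sanitize actually rejects. The sample pattern is made up.

```javascript
// Hypothetical diagnostic (not from generate.js): lists the \uXXXX escapes
// in a regex source and separates surrogate code units from other BMP escapes.
function classifyEscapes(source) {
  const result = { bmp: [], surrogate: [] };
  for (const [, hex] of source.matchAll(/\\u([0-9a-fA-F]{4})/g)) {
    const code = parseInt(hex, 16);
    const bucket = code >= 0xd800 && code <= 0xdfff ? 'surrogate' : 'bmp';
    result[bucket].push(hex.toLowerCase());
  }
  return result;
}

// Made-up pattern: a Cyrillic range plus a surrogate-pair range.
const report = classifyEscapes('[\\u0400-\\u052f]|\\ud83d[\\ude00-\\ude4f]');
console.log(report);
```

Running such a check over a grammar would show whether it relies on surrogate pairs at all, or only on BMP escapes that could be written as literal characters.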

nixel2007 commented 8 months ago

Anyone? :)

mm8191 commented 2 weeks ago

With thanks, respect, and disrespect: committing fraud in an age when the other side of the globe finds out the moment you so much as break wind is a foolish thing to do.