jgm / skylighting

A Haskell syntax highlighting library with tokenizers derived from KDE syntax highlighting descriptions

Skylighting doesn't seem to style `import` tokens #147

Closed briandk closed 2 years ago

briandk commented 2 years ago

Unexpected Behavior

As far as I can tell, by default skylighting doesn't apply any text formatting to tokens of class Import. The result is that I get Python code that looks like this:

[Screenshot: Python code as highlighted by skylighting]

Contrast that with what I get when Pygments highlights the same Python code:

[Screenshot: the same Python code as highlighted by Pygments]

The Catch

I don't use skylighting directly; I write in RMarkdown, which compiles using pandoc.

But, when I run:

pandoc --print-highlight-style pygments | grep --context=6 "Import"

The output shown is:

            "text-color": "#06287e",
            "background-color": null,
            "bold": false,
            "italic": false,
            "underline": false
        },
        "Import": {
            "text-color": null,
            "background-color": null,
            "bold": false,
            "italic": false,
            "underline": false
        },

According to pandoc's JSON output, Imports don't get any text styling. I also checked the JSON output for various other syntax highlighting schemes (zenburn, tango, etc.) and as far as I can tell, none of them style Import tokens either.
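The check above can also be done programmatically rather than with grep. The following is a minimal sketch, not part of pandoc or skylighting: it assumes the style JSON keeps its per-token entries under a top-level "text-styles" key, and the helper name `unstyled_token_types` and the `SomeOtherToken` entry are hypothetical, used only for illustration.

```python
import json


def unstyled_token_types(style: dict) -> list[str]:
    """Return token types whose entry sets no color and no font attribute."""
    unstyled = []
    # Assumption: per-token entries live under the "text-styles" key.
    for name, attrs in style.get("text-styles", {}).items():
        has_color = attrs.get("text-color") or attrs.get("background-color")
        has_font = attrs.get("bold") or attrs.get("italic") or attrs.get("underline")
        if not has_color and not has_font:
            unstyled.append(name)
    return sorted(unstyled)


# A small excerpt shaped like the output shown above.
# "SomeOtherToken" is a made-up name standing in for the preceding entry.
sample = json.loads("""
{
    "text-styles": {
        "SomeOtherToken": {
            "text-color": "#06287e",
            "background-color": null,
            "bold": false,
            "italic": false,
            "underline": false
        },
        "Import": {
            "text-color": null,
            "background-color": null,
            "bold": false,
            "italic": false,
            "underline": false
        }
    }
}
""")

print(unstyled_token_types(sample))
```

To run it against a real style, the output of `pandoc --print-highlight-style pygments` can be saved to a file and loaded with `json.load` in place of the inline sample.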

My Question

Is this behavior—applying no styling to Import tokens—intentional? If so, why?

jgm commented 2 years ago

I really don't remember. It could be that "Import" was added to the token types in KDE's highlighting at some point after the styles were created, and I just added the default style as a placeholder. I don't recall. Anyway, it can be changed, if someone tells me what color I should use.

I note that several other styles also set ImportTok to defStyle, which supports the hypothesis above.
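Until a default color is chosen, one user-side workaround is to export a style, give Import a color, and hand the edited file back to pandoc. The sketch below is only an illustration under stated assumptions: the "text-styles" key is assumed from the layout of the exported JSON, the helper name `with_import_color` is hypothetical, and the color value is an arbitrary placeholder, not a proposal for the library.

```python
import copy
import json


def with_import_color(style: dict, color: str) -> dict:
    """Return a copy of a style dict with a text color set for Import."""
    patched = copy.deepcopy(style)  # leave the caller's dict untouched
    # Assumption: per-token entries live under the "text-styles" key.
    entry = patched.setdefault("text-styles", {}).setdefault("Import", {})
    entry["text-color"] = color
    return patched


# Minimal stand-in for an exported style; a real one has many more entries.
original = {
    "text-styles": {
        "Import": {
            "text-color": None,
            "background-color": None,
            "bold": False,
            "italic": False,
            "underline": False,
        }
    }
}

# "#007020" is a placeholder color chosen purely for illustration.
patched = with_import_color(original, "#007020")
print(json.dumps(patched["text-styles"]["Import"], indent=4))
```

Writing the patched dict to a file with a `.theme` extension and passing that path to pandoc's `--highlight-style` option should apply it, at least in pandoc versions contemporary with this thread.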

briandk commented 2 years ago

@jgm Thanks for the quick response!

Deciding which color to use might be a little tricky, because as far as I can tell Pygments and skylighting tokenize Python code slightly differently.

I prepared a snippet of example Python code and a table comparing how Pygments and pandoc tokenize it; I'll reproduce both here.

Example Python Code

from plotnine import geom_text
def myFunction(x: str):
    pass

Tokenization Comparison Table

| Token | Pygments Token Type[^comparing-python-tokenization-1] | Pandoc Token Type[^comparing-python-tokenization-2] |
| --- | --- | --- |
| from | kn (Keyword.Namespace) | im (ImportTok) |
| plotnine | nn (Name.Namespace) | [no token type] |
| import | kn (Keyword.Namespace) | im (ImportTok) |
| geom_text | n (Name) | [no token type] |
| --- | --- | --- |
| def | k (Keyword) | kw (KeywordTok) |
| myFunction | nf (Name.Function) | [no token type] |
| ( | p (Punctuation) | [no token type] |
| x | n (Name) | [no token type] |
| : | p (Punctuation) | [no token type] |
| str | nb (Name.Builtin) | bu (BuiltInTok) |
| ): | p (Punctuation) | [no token type] |
| pass | k (Keyword) | cf (ControlFlowTok) |

[^comparing-python-tokenization-1]: Pygments token types are taken from: https://github.com/pygments/pygments/blob/ab4afd821aa41403f7a0b1e714112c40b2ad843b/pygments/styles/tango.py#L44-L140

[^comparing-python-tokenization-2]: Pandoc token types are taken from: https://github.com/jgm/skylighting/blob/5ccee442dff7eb00423e807f59e24f2a0082bcaa/skylighting-core/src/Skylighting/Styles.hs#L109-L146

jgm commented 2 years ago

It's not obvious, you're right. Do you want to make a suggestion?

jgm commented 2 years ago

closed by PR #148