rodolphebarbanneau / python-docstring-highlighter

Syntax highlighting for Python Docstring in VSCode.
https://marketplace.visualstudio.com/items?itemName=rodolphebarbanneau.python-docstring-highlighter
MIT License
52 stars 2 forks

Return Argument Types Not Properly Highlighted #7

Open scarere opened 3 months ago

scarere commented 3 months ago

Using the Google docstring format, the return type is not properly highlighted when it uses type hints from the typing package. Also, shouldn't the return type be highlighted in the same colour as the types of the input arguments (light grey in this instance)?

[Screenshot from 2024-08-07 15-50-38]

rodolphebarbanneau commented 1 month ago

Thank you for your feedback and for bringing this to my attention.

You're correct that in the examples you provided, both input and TorchPredType are colorized the same way. This is due to the limitations of TextMate grammar, which the extension uses for highlighting.

The core issue is that it's challenging to reliably distinguish between type annotations and variable names without more complex parsing. Considering your examples:

  1. input (TorchInputType): The model inputs
  2. TorchPredType: model outputs indexed by name

In the first case, input is a variable name, while in the second, TorchPredType is a type. However, using TextMate grammar, it's not possible to consistently differentiate these cases.

The main limitation of TextMate grammar is that it can only apply regex patterns to single lines. This constraint makes it impossible to use context from surrounding lines to determine whether a term is a type or a variable name.

I've considered using heuristics (like treating any word with a capitalized first letter as a type), but this wouldn't work universally, as variable names can also be capitalized.

Given these constraints, I've opted for a more conservative approach to avoid potentially incorrect highlighting. However, I'm open to suggestions if you have ideas on how to improve this within the single-line limitations of TextMate grammar.

Regarding the Dict[str, torch.Tensor]: ... example you mentioned, which currently doesn't get colorized at all, I agree that this presents an opportunity for enhancement. We could implement a regex pattern to colorize "Dict" and then apply a lighter color to the contents within the square brackets [str, torch.Tensor].
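As a rough illustration of the pattern described above, here is a minimal Python sketch that splits a return line into an outer type name and its bracketed contents. The pattern and the names `RETURN_TYPE_RE` and `split_return_type` are hypothetical and are not taken from the extension. It uses Python's `re` module, whereas VS Code evaluates TextMate grammars with Oniguruma, so the syntax may need adjusting before it could be used in a grammar.

```python
import re

# Hypothetical pattern (not the extension's): an outer type name,
# optional bracketed parameters, then a colon ending the type.
RETURN_TYPE_RE = re.compile(
    r"^\s*(?P<outer>[A-Za-z_][\w.]*)"   # e.g. "Dict" or "torch.Tensor"
    r"(?P<params>\[[^:]*\])?"           # e.g. "[str, torch.Tensor]"
    r":(?:\s|$)"                        # colon followed by a space or end of line
)


def split_return_type(line):
    """Return (outer, params) for a 'Type[...]: description' line, else None."""
    match = RETURN_TYPE_RE.match(line)
    if match is None:
        return None
    return match.group("outer"), match.group("params")


print(split_return_type("    Dict[str, torch.Tensor]: model outputs"))
```

The two capture groups could then be given different colours. Note that the pattern on its own still cannot tell a bare return type from a bare argument name, which is the single-line ambiguity discussed earlier in this comment.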

I'll work on implementing this enhancement in the next update. Thank you for pointing this out!

scarere commented 1 month ago

Hi @rodolphebarbanneau, thank you for responding. I see what you're saying. I was going to suggest assuming that all return parameters are types and all argument parameters are names when only a single word is provided, but I see now that without context from surrounding lines, there is no way to differentiate an input parameter from an output parameter.

I'm not an expert with TextMate grammar, but is there a way to define a Section scope that is a subscope of docstring.numpy and that starts with a section heading and ends with the next section heading or triple quotes? I see here that you can use the begin and end arguments to have matches that span several lines. In this way you could differentiate between the Args and Returns sections. Then you could assume that args always take the form arg_name (arg_type): or arg_name: and that returns always take the form arg_name (arg_type): or arg_type:
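To make the section idea concrete, here is a minimal sketch in Python, not a TextMate grammar. It only simulates the logic: track the current section heading, then interpret a bare `word:` entry as a name under Args and as a type under Returns. The names `SECTION_RE`, `ENTRY_RE`, and `classify` are made up for illustration, and only the Args and Returns headings are handled.

```python
import re

# Hypothetical patterns for Google-style docstrings (illustration only).
SECTION_RE = re.compile(r"^\s*(Args|Returns):\s*$")
# Matches "first (type): ..." or "first: ..."
ENTRY_RE = re.compile(
    r"^\s*(?P<first>[^\s:(][^:(]*?)\s*(?:\((?P<type>[^)]*)\))?:(?:\s|$)"
)


def classify(docstring):
    """Return (section, kind, text) tuples, where kind is 'name' or 'type'."""
    section = None
    found = []
    for line in docstring.splitlines():
        heading = SECTION_RE.match(line)
        if heading:
            section = heading.group(1)  # remember which section we are in
            continue
        if section is None:
            continue  # text before the first section heading
        entry = ENTRY_RE.match(line)
        if not entry:
            continue
        first, typ = entry.group("first"), entry.group("type")
        if typ is not None:
            # "name (type):" is unambiguous in either section.
            found.append((section, "name", first))
            found.append((section, "type", typ))
        elif section == "Returns":
            found.append((section, "type", first))  # bare word is a type
        else:
            found.append((section, "name", first))  # bare word is a name
    return found


doc = (
    "Run the model.\n\n"
    "Args:\n    input (TorchInputType): The model inputs\n\n"
    "Returns:\n    TorchPredType: model outputs indexed by name\n"
)
for item in classify(doc):
    print(item)
```

Whether the same state tracking can be expressed with begin and end rules in the extension's grammar is the open question here; the sketch only shows that the section context is enough to resolve the ambiguity.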

As for the enhancement to the Dict typing, I think it should definitely be colorized. This is just a matter of allowing square brackets. One thing to keep in mind is that types can be nested to arbitrary depth in this way. For example:

tuple[dict[str, list[dict[str, list[int]]]], int]

So having a separate color for types within the square brackets might be a little more complex than just defining two colors. I'd say the easiest option is to leave it all as one color. An alternative approach that you might find interesting would be to have a color wheel of, let's say, 3-4 colours that you cycle through for each level of nesting. If you have 4 colours, then level 5 and level 1 would have the same colour. Alternatively, you could have a colour gradient with a large number of colours, defined by a base colour, where each subsequent colour gets progressively lighter. With this approach you could just define enough steps in the gradient that the maximum is never likely to be exceeded.
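The colour wheel idea can be sketched in a few lines of Python. This is only an illustration of cycling through a palette by bracket depth; `TOKEN_RE`, `colour_by_depth`, and the palette entries are hypothetical names, and nothing here reflects how the extension or TextMate scopes actually assign colours.

```python
import re

# Identifiers (optionally dotted), brackets, and commas.
TOKEN_RE = re.compile(r"[A-Za-z_][\w.]*|[\[\],]")
PUNCTUATION = {"[", "]", ","}


def colour_by_depth(type_expr, palette):
    """Pair each type name with a palette entry chosen by nesting depth."""
    depth = 0
    coloured = []
    for token in TOKEN_RE.findall(type_expr):
        if token == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced brackets")
        if token not in PUNCTUATION:
            # Cycle through the palette: depth 0 and depth len(palette)
            # share a colour, as in the colour wheel described above.
            coloured.append((token, palette[depth % len(palette)]))
        if token == "[":
            depth += 1
    if depth != 0:
        raise ValueError("unbalanced brackets")
    return coloured


palette = ["c0", "c1", "c2"]  # placeholder colour names
print(colour_by_depth("tuple[dict[str, list[int]], int]", palette))
```

With three colours, the innermost `int` in this example sits at depth 3 and wraps around to the same colour as `tuple` at depth 0. A gradient would use the same depth counter but index into a longer list without wrapping.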