rodolphebarbanneau / python-docstring-highlighter

Syntax highlighting for Python Docstring in VSCode.
https://marketplace.visualstudio.com/items?itemName=rodolphebarbanneau.python-docstring-highlighter
MIT License

Some instances where things are improperly highlighted with numpy syntax #8

Open scarere opened 2 months ago

scarere commented 2 months ago

The extension does not properly recognise the 'See Also' section of the numpy format because the header has a space in it. Also, when a parameter description spans multiple lines and the last line has only a single word, that word is mistaken for an argument. In fact, any line with a single word is mistaken for an argument. I propose using indentation to determine whether something is part of a description or is an argument or return type.

scarere commented 2 months ago

Also forgot to mention that return types that start with a capital letter (e.g. Callable, Sequence) are mistaken for sections. Furthermore, if the type has square brackets (e.g. Sequence[str]), it is not highlighted at all.
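
To illustrate the two cases above, here is a minimal sketch using Python's `re` module (not the Oniguruma engine that TextMate grammars use, and not the extension's actual rule). `TYPE_RE` and `is_type_like` are hypothetical names introduced for this example. The pattern accepts a dotted identifier with an optional bracketed argument list, nested one level deep.

```python
import re

# Hypothetical pattern, for illustration only: a dotted identifier, optionally
# followed by one bracketed group that may itself contain bracketed groups
# (one level of nesting).
TYPE_RE = re.compile(
    r"^\s*[A-Za-z_][\w.]*"                 # e.g. Sequence, typing.Callable
    r"(?:\[(?:[^\[\]]|\[[^\[\]]*\])*\])?"  # e.g. [str] or [str, List[int]]
    r"\s*$"
)


def is_type_like(line: str) -> bool:
    """Return True if the whole line looks like a single type expression."""
    return TYPE_RE.match(line) is not None
```

Note that a bare `Callable` passes this check but is also a single capitalized word, so a line-by-line rule alone cannot tell whether it is a type or a section title.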

scarere commented 2 months ago

Just noticed another case in which highlighting doesn't work. In numpy docstrings one can have two parameters of the same type on the same line, e.g.

"""
Parameters
---------------
array1, array2 : NDArray
    Two input parameters of the same type.
"""

It seems the space and the comma throw off the docstring highlighter, so the names are not detected as parameters.
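
A rough sketch of a pattern that tolerates comma-separated names, written with Python's `re` module rather than as a TextMate rule. `PARAM_RE` and `parse_param_line` are hypothetical names; this is one possible approach, not the extension's implementation.

```python
import re

# Hypothetical pattern: one or more comma-separated names (optionally prefixed
# with * or **), then " : ", then the type. Assumes the line has already had
# the docstring's base indentation removed.
PARAM_RE = re.compile(
    r"^(?P<names>\*{0,2}\w+(?:\s*,\s*\*{0,2}\w+)*)"
    r"\s*:\s*"
    r"(?P<type>.+?)\s*$"
)


def parse_param_line(line: str):
    """Return (names, type) for a numpy-style parameter line, else None."""
    m = PARAM_RE.match(line)
    if m is None:
        return None
    names = [name.strip() for name in m.group("names").split(",")]
    return names, m.group("type")
```

With this sketch, `parse_param_line("array1, array2 : NDArray")` yields both names with the shared type, while an indented description line yields `None`.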

rodolphebarbanneau commented 1 month ago

Hello @scarere,

Thank you for your additional feedback. Let me address each point you've raised:

  1. See Also and similar sections: I will implement a new rule to colorize sections where all words are capitalized.

  2. Multi-line descriptions ending with a single word: This remains challenging because, without multi-line context, we can't determine whether the line belongs to a paragraph or starts a new section. The current rule skips colorization when the word ends with a dot (e.g., foo is colorized but foo. isn't).

  3. Callable, Sequence: If a line contains only Callable, there's no way to infer whether it's a type or a section title (except if we add for instance a list of words to check within the regex, but I'm not considering that option for the moment). This is a limitation of the single-line regex approach of TextMate.

  4. Sequence[str]: This will be fixed in the next version.

  5. Parameters of the same type on a single line (NumPy style): Indeed, the current pattern doesn't check for two or more arguments. This will be fixed in the upcoming version.
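
As a rough illustration of points 1 and 2, the sketch below uses Python's `re` module (TextMate grammars use Oniguruma, so the real rules may differ). `SECTION_RE`, `SINGLE_WORD_RE` and the two helper functions are hypothetical names, not taken from the extension.

```python
import re

# Point 1 (assumed form of the rule): every word on the line is capitalized,
# e.g. "Parameters" or "See Also".
SECTION_RE = re.compile(r"^\s*[A-Z][a-z]+(?: [A-Z][a-z]+)*\s*$")

# Point 2 (assumed form of the rule): a lone word is a candidate for
# colorization, but a trailing dot marks it as the end of a sentence.
SINGLE_WORD_RE = re.compile(r"^\s*\w+\s*$")


def is_section_title(line: str) -> bool:
    return SECTION_RE.match(line) is not None


def is_single_word(line: str) -> bool:
    return SINGLE_WORD_RE.match(line) is not None
```

Here `is_section_title("See Also")` is true and `is_section_title("See also")` is false; `is_single_word("foo")` is true and `is_single_word("foo.")` is false. The sketch also shows the limitation from point 3: `is_section_title("Callable")` is true as well.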

To summarize, the following enhancements are planned for the next version:

If you notice any other issues, please let me know.

Thank you again for your valuable input and for helping improve this extension.

rodolphebarbanneau commented 1 month ago

The extension has been updated to version 0.2.4, addressing the issues you reported. The update is now available on the VSCode marketplace.

Feel free to rate the extension on the marketplace to support it if you find it useful.

Thanks again for your feedback.

scarere commented 1 month ago

@rodolphebarbanneau thank you for your additional comments. I will definitely rate the extension on the marketplace to support it, as I do find this to be a useful extension. I hope you continue to work on it and improve its functionality.

A few responses to your comments

  2. What do you think about using indentation? This will not work for numpy style docstrings, but it will work for google style docstrings.

    • Quick fix for google style docstrings
      • Section headings should never begin with any indentation (beyond the docstring's existing indentation level).
      • Anything that is indented one level should be assumed to be an argument name, type or paragraph text.
      • Anything that is indented 2 levels should be assumed to be paragraph/description text and not be highlighted.
    • To make this work with numpy style docstrings we need multiline context, so again I would ask whether it's possible to define scopes that span multiple lines using begin and end, as mentioned in this issue
      • A section heading could begin with any capitalized non-indented text and end with '-------'
      • A section could begin with a section heading and end with the next section heading or triple quotes
    • I imagine it must be possible somehow to have multiline context, otherwise how does the TextMate grammar recognize multiline docstrings in the first place?
  3. Same as my response to point 2. This would be resolved for google docstrings by using indentation, and for numpy docstrings we would need to figure out how to get multiline context.
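
The two ideas above can be sketched in plain Python, outside of any TextMate grammar. Everything here is an assumption for illustration: the function names, the 4-space indent width, and the use of a trailing colon to spot google style section headings. It only shows the classification logic, not how it would be expressed with begin/end rules.

```python
import re


def indent_level(line: str, base: int = 0, width: int = 4) -> int:
    """Indentation level relative to the docstring's own indentation."""
    leading = len(line) - len(line.lstrip(" "))
    return (leading - base) // width


def classify_google(lines, base: int = 0, width: int = 4):
    """Label each line of a google style docstring by indentation only."""
    labels = []
    for line in lines:
        if not line.strip():
            labels.append("blank")
            continue
        level = indent_level(line, base, width)
        if level <= 0:
            # Assumption: section headings such as "Args:" end with a colon.
            labels.append("section" if line.rstrip().endswith(":") else "text")
        elif level == 1:
            labels.append("entry")        # argument name, type, or text
        else:
            labels.append("description")  # never highlighted
    return labels


# A numpy section heading is only recognizable together with the next line.
UNDERLINE_RE = re.compile(r"^\s*-{3,}\s*$")


def numpy_headings(lines):
    """Indices of lines that are immediately followed by a dashed underline."""
    return [
        i
        for i in range(len(lines) - 1)
        if lines[i].strip()
        and not UNDERLINE_RE.match(lines[i])
        and UNDERLINE_RE.match(lines[i + 1])
    ]
```

For example, a wrapped description line indented two levels is labelled `description` even if it holds a single word, and a lone `Callable` under a numpy `Returns` heading is not reported as a heading because no underline follows it.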

I would offer to help create some PRs, but my entire knowledge of regex and TextMate stems from briefly looking into how this package works, so it would take me a while to find the spare time to teach myself the tools necessary to contribute to this package.

scarere commented 1 month ago

I'd also like to mention that a nice long-term feature would be separate or custom themes that don't require manually editing settings.json: perhaps a path to a themes file, or a couple of preset themes to choose from. I think it's nice for the highlighting to be different from the VS Code theme so that docstrings can be differentiated from code.