python-lsp / python-lsp-server

Fork of the python-language-server project, maintained by the Spyder IDE team and the community
MIT License

Missing docstring for pandas functions #477

Closed lukelbd closed 8 months ago

lukelbd commented 8 months ago

Overview

This may be an upstream issue, but currently when I request hover information on the read_csv or read_table functions from the popular data-science package pandas, the docstring is missing, even though it is available via pd.read_csv.__doc__ or help(pd.read_csv) in a Python session (see below).

Strangely, this issue does not occur for other pandas functions, e.g. read_fwf, so it is presumably something particular to how the docstring is implemented on those two functions. If it helps, here are the read_csv source and the read_fwf source.

Is there any way python-lsp-server can be improved to cover their docstring implementation and consistently capture the __doc__ text? Thanks!

Versions

pandas                    1.4.2              py310h514ec25_1      conda-forge
python                    3.10.12            had23ca6_0_cpython   conda-forge
python-lsp-server         1.8.2              pyhd8ed1ab_0         conda-forge
python-lsp-server-base    1.8.2              pyhd8ed1ab_0         conda-forge

Current behavior

read_csv docstring:

Screenshot 2023-11-01 at 12 29 53

read_csv pylsp hover:

Screenshot 2023-11-01 at 12 28 56 Screenshot 2023-11-01 at 12 29 03

Desired behavior

read_fwf docstring:

Screenshot 2023-11-01 at 12 30 07

read_fwf pylsp hover:

Screenshot 2023-11-01 at 12 29 15 Screenshot 2023-11-01 at 12 29 22

krassowski commented 8 months ago

Thank you for reporting! This appears to be a jedi issue, compare:

import jedi
jedi.Script("import pandas as pd; pd.read_csv").help(column=30, line=1)[0].docstring(raw=True)

''

with:

import jedi
jedi.Script("import pandas as pd; pd.read_fwf").help(column=30, line=1)[0].docstring(raw=True)

"Read a table of fixed-width formatted lines into DataFrame.\n\nAlso supports optionally iterating or breaking of the file\ninto chunks.\n\nAdditional help can be found in the online docs for IO Tools\n<https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html>_.\n\nParameters\n----------\nfilepath_or_buffer : str, path object, or file-like object\n String, path object (implementing os.PathLike[str]), or file-like\n object implementing a text read() function.The string could be a URL.\n Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is\n expected. A local file could be:\n file://localhost/path/to/table.csv.\ncolspecs : list of tuple (int, int) or 'infer'. optional\n A list of tuples giving the extents of the fixed-width\n fields of each line as half-open intervals (i.e., [from, to[ ).\n String value 'infer' can be used to instruct the parser to try\n detecting the column specifications from the first 100 rows of\n the data which are not being skipped via skiprows (default='infer').\nwidths : list of int, optional\n A list of field widths which can be used instead of 'colspecs' if\n the intervals are contiguous.\ninfer_nrows : int, default 100\n The number of rows to consider when letting the parser determine the\n colspecs.\n**kwds : optional\n Optional keyword arguments can be passed to TextFileReader.\n\nReturns\n-------\nDataFrame or TextFileReader\n A comma-separated values (csv) file is returned as two-dimensional\n data structure with labeled axes.\n\nSee Also\n--------\nDataFrame.to_csv : Write DataFrame to a comma-separated values (csv) file.\nread_csv : Read a comma-separated values (csv) file into DataFrame.\n\nExamples\n--------\n>>> pd.read_fwf('data.csv') # doctest: +SKIP"

@lukelbd can you check if any of the jedi issues mention this (https://github.com/davidhalter/jedi/issues) and if not create one, linking back here?

lukelbd commented 8 months ago

@krassowski Thanks for the quick response! Have posted upstream: davidhalter/jedi#1968

lukelbd commented 8 months ago

@krassowski Thanks for the help. Seems jedi simply doesn't support non-static, runtime-generated docstrings (https://github.com/davidhalter/jedi/issues/1968).

I have actually run into this problem in my own projects. If I understand correctly jedi is only static analysis, but python-lsp-server is dynamic? So would it be within the scope of this project to search for runtime-generated docstrings whenever the static jedi.(...).help() tool returns nothing? I could open a new feature request thread.
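For readers unfamiliar with the distinction, here is a minimal sketch of a runtime-generated docstring. The decorator `append_doc` and both functions are hypothetical stand-ins, not pandas code, and `ast.get_docstring` is used only as a simple example of a purely static reader; it is not a claim about how jedi works internally.

```python
import ast
import textwrap

# Hypothetical module source: one function gets its docstring from a decorator
# at import time, the other has an ordinary docstring literal in its body.
SOURCE = textwrap.dedent('''
    def append_doc(text):
        """Hypothetical decorator that attaches a docstring at runtime."""
        def decorator(func):
            func.__doc__ = text
            return func
        return decorator

    @append_doc("Read a file into a table.")
    def read_table_like(path):
        return path

    def read_fixed_like(path):
        """Read fixed-width lines into a table."""
        return path
''')


def static_docstrings(source):
    """Collect docstrings by parsing the source only, without running it."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_docstring(node)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def runtime_docstrings(source):
    """Collect docstrings by executing the source and reading __doc__."""
    namespace = {}
    exec(source, namespace)
    return {
        name: obj.__doc__
        for name, obj in namespace.items()
        if callable(obj) and not name.startswith("__")
    }


static_docs = static_docstrings(SOURCE)
runtime_docs = runtime_docstrings(SOURCE)
print("static :", static_docs["read_table_like"], "|", static_docs["read_fixed_like"])
print("runtime:", runtime_docs["read_table_like"], "|", runtime_docs["read_fixed_like"])
```

In this sketch the static reader finds nothing for `read_table_like`, because its docstring only exists after the decorator has run, while the function with a literal docstring is visible both ways.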

ccordoba12 commented 8 months ago

> but python-lsp-server is dynamic?

That's not correct. This project is just a wrapper around many Python libraries for code completion (like Jedi), linting and formatting, which makes them conform to the Language Server Protocol (LSP). That's it.

> So would it be within the scope of this project to search for runtime-generated docstrings whenever the static jedi.(...).help() tool returns nothing?

I don't think that's feasible because the libraries that you need to import or code you need to run to make that work can have unpredictable side-effects (e.g. consuming too much memory or compute time). There's a reason why Jedi is a static analysis tool.

However, you can add Pandas to the list of modules in the option pylsp.plugins.jedi.auto_import_modules and it'll be treated by Jedi as a live module. So, the read_csv docstring should be rendered as expected for you.
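To make the side-effect concern above concrete, here is a rough sketch of what a runtime docstring lookup would need at a minimum: a separate process and a time limit, so that a slow or misbehaving import cannot stall the caller. This is not something python-lsp-server or Jedi does; the helper name `runtime_doc` and the 5-second default are assumptions made for illustration.

```python
import subprocess
import sys

# Small script run in a child interpreter: import the module, look up the
# attribute, and write its cleaned-up docstring to stdout.
_FETCH_DOC = (
    "import importlib, inspect, sys\n"
    "module = importlib.import_module(sys.argv[1])\n"
    "obj = getattr(module, sys.argv[2])\n"
    "sys.stdout.write(inspect.getdoc(obj) or '')\n"
)


def runtime_doc(module_name, attr_name, timeout=5.0):
    """Return the runtime docstring of module_name.attr_name, or None.

    The import happens in a child process, so memory use and import-time
    side effects stay out of the calling process, and the timeout bounds
    how long the lookup may take.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", _FETCH_DOC, module_name, attr_name],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        # Import failed or the attribute does not exist.
        return None
    return result.stdout or None


doc = runtime_doc("json", "dumps")
print(doc.splitlines()[0] if doc else "no docstring found")
```

Even with these guards, the child process still executes arbitrary import-time code, which is the unpredictability described above.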

lukelbd commented 7 months ago

Thanks for the tip. Unfortunately I am getting the same error. I added pandas to jedi.auto_import_modules as follows (I also tried preload.modules but it has no effect):

{
    "pylsp.plugins.jedi.auto_import_modules": ["pandas"],
    "pylsp.plugins.preload.modules": ["pandas"]
}

Can anyone else get runtime docstrings to work with the "auto import" setting? Or maybe this is an issue I should report to jedi?

lukelbd commented 7 months ago

Scratch the last comment -- finally got this to work! Using vim-lsp and applying settings via vim-lsp-settings, adding the following to .vimrc seems to do the job:

  let s:pylsp_settings = {
    \ 'plugins': {'jedi': {'auto_import_modules': ['pandas']}},
  \ }
  let g:lsp_settings = {
    \ 'pylsp': {'workspace_config': {'pylsp': s:pylsp_settings}},
  \ }

Thanks again @ccordoba12, this was driving me crazy. It came up really often.
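One visible difference between the two attempts above is that the first uses flat dotted keys while the working one uses nested dictionaries. Whether that alone explains the fix is an assumption, but for clients that expect the nested form, the conversion can be sketched as follows. The helper `nest_dotted_keys` is hypothetical and not part of any of the tools mentioned.

```python
import json


def nest_dotted_keys(flat):
    """Expand {"a.b.c": value} style settings into nested dictionaries."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat_settings = {"pylsp.plugins.jedi.auto_import_modules": ["pandas"]}
nested_settings = nest_dotted_keys(flat_settings)
print(json.dumps(nested_settings, indent=2))
```

The resulting structure has the same `pylsp` / `plugins` / `jedi` / `auto_import_modules` nesting as the vim-lsp-settings snippet above.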