terrencepreilly / darglint

A python documentation linter which checks that the docstring description matches the definition.
MIT License

pathological performance #126

Closed by tekumara 4 years ago

tekumara commented 4 years ago

darglint takes about 25 seconds to parse the docstring below, and reports no errors.

.darglint

[darglint]
strictness=long
docstring_style=google

test.py

import pandas as pd

def calculate(df: pd.DataFrame) -> pd.DataFrame:
    """Do calculations

    Arguments:
        - df: dataframe with the schema below
            Column                   Dtype
            ------                   -----
            id                       str
            pred                     int
            value                    float
    Returns:
        - A dataframe as below
            Column                  Dtype
            ------                  -----
            id                      object 
            count1                  int64  
            count2                  int64  
            value                   float64

    """
    pass

$ time darglint test.py
darglint   25.30s user 0.15s system 99% cpu 25.677 total

darglint 1.5.4
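For anyone who wants to reproduce the timing without the shell's `time` builtin, here is a minimal sketch of a wall-clock timer. The helper name `time_command` is made up for illustration and is not part of darglint; it only assumes the command you pass is available on your PATH.

```python
import subprocess
import time
from typing import List


def time_command(cmd: List[str]) -> float:
    """Run cmd once and return the elapsed wall-clock time in seconds.

    Hypothetical helper for this report, not part of darglint.
    """
    start = time.perf_counter()
    # Output is discarded; only the duration matters here.
    subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # a linter exits non-zero when it finds errors
    )
    return time.perf_counter() - start
```

With darglint installed and test.py in the current directory, `time_command(["darglint", "test.py"])` should give a number comparable to the `total` column above.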

tekumara commented 4 years ago

If I insert a blank line between the Arguments and Returns blocks, the time drops to roughly 6 secs (still long; ideally it would be ~100ms), and darglint now reports errors, e.g.:

import pandas as pd

def calculate(df: pd.DataFrame) -> pd.DataFrame:
    """Do calculations

    Arguments:
        - df: dataframe with the schema below
            Column                   Dtype
            ------                   -----
            id                       str
            pred                     int
            value                    float

    Returns:
        - A dataframe as below
            Column                  Dtype
            ------                  -----
            id                      object 
            count1                  int64  
            count2                  int64  
            value                   float64

    """
    pass

$ time darglint test.py
test.py:calculate:4: DAR101: - df
test.py:calculate:15: DAR202: + return

darglint   5.83s user 0.07s system 97% cpu 6.032 total
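The blank-line workaround above could be applied mechanically. The following is only a sketch: `separate_sections` and the header list are assumptions made up for this example (not darglint's own section list), and it does nothing about the underlying parse time.

```python
import re

# Section headers assumed for illustration; not taken from darglint.
SECTION_HEADER = re.compile(r"^\s*(Args|Arguments|Returns|Yields|Raises):\s*$")


def separate_sections(docstring: str) -> str:
    """Insert a blank line before any section header that lacks one."""
    out = []
    for line in docstring.splitlines():
        # Add a separator only if the previous line is non-blank.
        if SECTION_HEADER.match(line) and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)
```

Running it on the first docstring in this report would put an empty line before `Returns:`, which is the only difference between the two test files.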

terrencepreilly commented 4 years ago

See #101, #111.