lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License

Simple parsing with nesting breaks line_profiler (kernprof) #100

Open yuvval opened 2 years ago

yuvval commented 2 years ago

Describe the bug

Running a script under kernprof (line_profiler) fails when simple-parsing is used with nested dataclasses. Without nesting, there is no issue.

To Reproduce

tmp.py

from dataclasses import dataclass
from time import sleep
from simple_parsing import ArgumentParser

@dataclass
class NestedArgs():
    nested1: str = 'nested1'

@dataclass
class Opts():
    nested: NestedArgs  # when this variable is commented out, then the profiling is successful
    arg1: bool = False
    arg2: int = 4

@profile
def func_to_line_prof():
    sleep(0.01)
    sleep(1)
    sleep(3)

def main():
    parser = ArgumentParser()
    parser.add_arguments(Opts, dest='cfg')
    args = parser.parse_args().cfg
    func_to_line_prof()

if __name__ == '__main__':
    main()

On commandline (bash):

pip install line_profiler
kernprof -v -l tmp.py

Expected behavior

The script runs and func_to_line_prof() gets profiled:

Wrote profile results to tmp.py.lprof
Timer unit: 1e-06 s

Total time: 4.01383 s
File: tmp.py
Function: func_to_line_prof at line 16

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    16                                           @profile
    17                                           def func_to_line_prof():
    18         1      10070.0  10070.0      0.3      sleep(0.01)
    19         1    1001030.0 1001030.0     24.9      sleep(1)
    20         1    3002727.0 3002727.0     74.8      sleep(3)

Actual behavior

An exception is raised:

Traceback (most recent call last):
  File "/home/yatzmon/anaconda3/envs/rlgpu/bin/kernprof", line 8, in <module>
    sys.exit(main())
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/kernprof.py", line 234, in main
    execfile(script_file, ns, ns)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/kernprof.py", line 39, in execfile
    exec_(compile(f.read(), filename, 'exec'), globals, locals)
  File "tmp.py", line 30, in <module>
    main()
  File "tmp.py", line 25, in main
    args = parser.parse_args().cfg
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/argparse.py", line 1755, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/simple_parsing/parsing.py", line 162, in parse_known_args
    self._preprocessing()
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/simple_parsing/parsing.py", line 221, in _preprocessing
    wrapper.add_arguments(parser=self)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/simple_parsing/wrappers/dataclass_wrapper.py", line 90, in add_arguments
    group = parser.add_argument_group(title=self.title, description=self.description)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/simple_parsing/wrappers/dataclass_wrapper.py", line 178, in description
    doc = docstring.get_attribute_docstring(self.parent.dataclass, self._field.name)            
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/site-packages/simple_parsing/docstring.py", line 38, in get_attribute_docstring
    source = inspect.getsource(some_dataclass)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/inspect.py", line 973, in getsource
    lines, lnum = getsourcelines(object)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/inspect.py", line 955, in getsourcelines
    lines, lnum = findsource(object)
  File "/home/yatzmon/anaconda3/envs/rlgpu/lib/python3.7/inspect.py", line 812, in findsource
    raise OSError('could not find class definition')
OSError: could not find class definition


lebrice commented 2 years ago

Hello there @yuvval !

Is this specific to dataclasses? Or to Simple-parsing?

Dataclasses generate a string of code that is then eval'ed, so that's probably the issue here.

yuvval commented 2 years ago

Hi Fabrice! Thanks for helping.

It looks like it is related to simple-parsing: profiling works when I remove the calls to simple-parsing and just instantiate the dataclasses directly.

For that I used the following code:

from dataclasses import dataclass
from time import sleep

@dataclass
class NestedArgs():
    nested1: str = 'nested1'

@dataclass
class Opts():
    nested: NestedArgs
    arg1: bool = False
    arg2: int = 4

@profile
def func_to_line_prof():
    sleep(0.01)
    sleep(1)
    sleep(3)

def main():
    o = Opts(NestedArgs())
    func_to_line_prof()

if __name__ == '__main__':
    main()

lebrice commented 2 years ago

FYI: The issue occurs when simple-parsing tries to fetch the docstring of a given field on a dataclass, in order to build the value that gets passed to parser.add_argument(f"--{field.name}", ..., help=<this value here>).
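To make that step concrete, here is a rough sketch of what a source-based lookup of a field's help text can look like. The function name `inline_comment_for_field` is hypothetical and this is not simple-parsing's actual implementation (per the traceback, that lives in simple_parsing/docstring.py and starts from inspect.getsource); the sketch takes the source as a plain string so it runs anywhere.

```python
import re
from typing import Optional


def inline_comment_for_field(source: str, field_name: str) -> Optional[str]:
    """Return the inline `# comment` on the line that declares `field_name`.

    Hypothetical, simplified stand-in for a source-based docstring lookup.
    """
    # Match "<indent>name: annotation ... # comment" on a single line.
    pattern = re.compile(rf"^\s*{re.escape(field_name)}\s*:[^#\n]*#\s*(.+?)\s*$")
    for line in source.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


# In a real lookup the source string would come from inspect.getsource(cls),
# which is the call that fails under kernprof in the traceback above.
EXAMPLE_SOURCE = '''
@dataclass
class Opts:
    arg1: bool = False  # enable the first option
    arg2: int = 4
'''

help_arg1 = inline_comment_for_field(EXAMPLE_SOURCE, "arg1")
help_arg2 = inline_comment_for_field(EXAMPLE_SOURCE, "arg2")  # no comment on this line
```

The point is only that this kind of help text depends on being able to read the class source, so anything that hides the source from inspect breaks it.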

I'm not sure how to fix this, since inspect seems to be unable to find the source code of the given class. Could this be due to the line profiler manipulating the code objects somehow? In any case, I don't have an easy fix, apart from perhaps adding a help string to every field using the field function from simple_parsing.helpers:

from dataclasses import dataclass
from simple_parsing.helpers import field

@dataclass
class Foo:
    """ Foo docstring """
    a: int = field(default=123, help="Help string for A")

That way, simple-parsing won't have to look for the source file of the dataclass when trying to construct the help string for the field, since it's already provided.
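More generally, a source lookup like this could be guarded so that a missing source degrades to "no help text" instead of crashing. The sketch below is an assumption about one possible approach, not something simple-parsing does: `source_or_none` is a hypothetical helper, and the exec() call only loosely mimics how kernprof runs a script (the traceback shows kernprof calling exec on the compiled file). With the made-up module name used here, inspect raises TypeError rather than the OSError from the traceback, so the helper catches both.

```python
import inspect
from typing import Optional


def source_or_none(obj: object) -> Optional[str]:
    """Return the source of `obj`, or None when inspect cannot locate it."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # OSError: a file was found but the definition was not (as in the traceback).
        # TypeError: inspect cannot associate the object with any source file.
        return None


# Define a class via exec(), loosely mimicking how kernprof runs a script.
# "not_a_real_module" is a made-up name that is not in sys.modules.
namespace = {"__name__": "not_a_real_module"}
exec(compile("class Opts:\n    arg1: bool = False\n", "<exec-demo>", "exec"), namespace)
Opts = namespace["Opts"]

source = source_or_none(Opts)  # falls back to None instead of raising
```

A caller can then skip the generated help string whenever the result is None.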