Closed jan-glx closed 8 years ago
Good idea, thanks.
Here are my current thoughts for command line use (are you using pprofile as a module ?):
--exclude glob_pattern [--exclude glob_pattern [...]]
Exclude files whose path starts with any pattern.
--include glob_pattern [--include glob_pattern [...]]
Include files whose path starts with any pattern and would have been otherwise excluded.
I.e., by default everything is included; one then excludes specific paths as they show up in the profiling result, and may re-include some paths (which can be useful when a few system-wide packages need profiling, but the majority do not).
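A minimal sketch of the semantics proposed above, assuming plain prefix matching instead of real glob support. `is_profiled` and its parameters are hypothetical names for illustration, not part of pprofile:

```python
def is_profiled(path, exclude=(), include=()):
    """Decide whether `path` should show up in the profiling output.

    Hypothetical helper: everything is included by default, an exclude
    prefix removes a path, and an include prefix re-adds a path that
    would otherwise have been excluded.
    """
    # An include prefix always wins over an exclude prefix.
    if any(path.startswith(prefix) for prefix in include):
        return True
    # Otherwise the path is kept unless some exclude prefix matches it.
    return not any(path.startswith(prefix) for prefix in exclude)
```

For instance, excluding `/usr/lib/python/` while including `/usr/lib/python/site-packages/foo/` would drop the standard library but keep the one system-wide package of interest.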
Does it look usable ?
Sounds great!
Maybe you can exclude everything first if only --include is given.
It would also be very handy if it were possible to specify relative paths (since you said "starts with", you could just resolve all arguments to absolute paths first).
Additionally, for what is probably the most common use case, you could add a --exclude_pythonpath switch to exclude everything in the sys.path folders.
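The two suggestions above (relative paths, and a sys.path switch) could be sketched roughly as follows. `is_under` is a hypothetical helper, and skipping empty sys.path entries is an assumption made here so that the current directory does not swallow the user's own code:

```python
import os
import sys


def is_under(path, roots):
    """Return True if `path` lies inside any directory listed in `roots`.

    Hypothetical helper: both sides are made absolute first, so relative
    arguments work as well as absolute ones.
    """
    path = os.path.abspath(path)
    for root in roots:
        if not root:
            # Assumption: '' means "current directory"; skip it.
            continue
        root = os.path.abspath(root)
        # Compare whole path components, so "pkgs" does not match "pkgs2".
        if path == root or path.startswith(root.rstrip(os.sep) + os.sep):
            return True
    return False


def in_sys_path(path):
    """True if `path` is inside one of the current sys.path folders."""
    return is_under(path, sys.path)
```

Note that sys.path usually also contains the directory of the script being run, so a real switch would need to decide how to treat that entry.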
All good points, thanks !
Thinking more about it, there is one complication with glob patterns: I do not want to rely on the actual file tree (e.g. to match files inside an egg, or in-ZODB), so I cannot use the glob module verbatim. I cannot just use fnmatch either, although it has the API I need, as "/a*/c" matching "/a/b/c" would be surprising.
I'm also somewhat reluctant about regexes, as their usage in a (very likely) filename context would be surprising.
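The fnmatch behaviour mentioned above can be checked with the standard library alone. This small sketch uses `fnmatch.fnmatchcase` so the result does not depend on the platform's case rules:

```python
import fnmatch

# fnmatch's "*" is not path-aware: it happily matches across "/" separators,
# so a pattern that looks two levels deep also matches three levels deep.
crosses_separator = fnmatch.fnmatchcase("/a/b/c", "/a*/c")
print(crosses_separator)  # True
```

A shell glob would not expand `/a*/c` to `/a/b/c`, which is why reusing fnmatch for path filtering could surprise users.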
@jan-glx Hey, I finally got around to implementing this feature.
In the end, I chose regex as the exclusion/inclusion syntax, because all the fnmatch schemes I could think of would either have hard-to-understand effects or be hard to write (requiring silly things like --exclude /* --exclude /*/* etc.), and would likely be hard to write for non-*nix paths.
Please give current master a try, check the new option group and tell me what you think.
Cool! Regex sounds like a powerful solution; would I use it like pprofile myscript.py --include '/^C:\\\\path\\to\\my project\\.*/', or how?
I just tested the --exclude-syspath option; it works nicely with the deterministic profiler.
But when I use statistical profiling by specifying -s 0.1, I get the following error:
Traceback (most recent call last):
  File "C:\Anaconda3\envs\py27\Scripts\pprofile-script.py", line 9, in <module>
    load_entry_point('pprofile==1.8.1', 'console_scripts', 'pprofile')()
  File "C:\Anaconda3\envs\py27\lib\site-packages\pprofile.py", line 952, in main
    x for x in prof.getFilenameSet()
AttributeError: 'StatisticalThread' object has no attribute 'getFilenameSet'
> would I use it like pprofile myscript.py --include '/^C:\path\to\my project.*/', or how?
There is no need for leading & trailing slashes as the code already expects regexes. Something like this should work:
pprofile myscript.py --include '^C:\\path\\to\\my project\\.*'
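Backslash escaping in such patterns is easy to get wrong, so here is a hedged sketch of building the pattern programmatically with `re.escape`. The directory is the example path from this thread, and the use of `re.search` is an assumption about how the pattern gets applied, not a statement about pprofile's internals:

```python
import re

# Example directory from the discussion above (Windows-style path).
project_dir = "C:\\path\\to\\my project\\"

# re.escape() backslash-escapes every character that is special in a regex,
# so each literal backslash in the path becomes an escaped backslash.
pattern = "^" + re.escape(project_dir)

inside = re.search(pattern, "C:\\path\\to\\my project\\myscript.py")
outside = re.search(pattern, "C:\\other\\myscript.py")
print(pattern)
```

Here `inside` is a match object and `outside` is `None`; the printed pattern is what one would pass on the command line, subject to the shell's own quoting rules.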
> But when I use statistical profiling by specifying -s 0.1, I get the following error:
Ouch, nice catch, thanks. Fixed in master & added in automated tests.
Two more notes about exclude/include:
1) Regexes really apply to whatever Python thinks is the source file name. So even if "myscript.py" is somewhere in the "c:\path\to\my project\" subtree, the above regex will not match it and its samples will be excluded: Python does not know the full path of the executed file, only the name it was given on the command line:
$ cat printfile.py
print '__file__=', repr(__file__)
$ cat importprintfile.py
import printfile
$ python printfile.py
__file__= 'printfile.py'
$ python importprintfile.py
__file__= '/tmp/printfile.py'
2) --exclude-syspath only excludes sys.path as it is while pprofile.py itself is still executing. So if the profiled script is part of an installed egg (for example), not much will actually be profiled. I'll have to refine this option further before I can release pprofile with it.
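To illustrate note 1, the sketch below applies a path-anchored regex to the two file names from the printfile.py session above. Normalising the name first is shown only as one possible workaround; `normalise` and `launch_dir` are hypothetical, and nothing here claims this is what pprofile does:

```python
import posixpath
import re

pattern = re.compile(r"^/tmp/")

# The name Python reports for the executed script vs. the imported module.
as_script = pattern.search("printfile.py")
as_module = pattern.search("/tmp/printfile.py")


def normalise(name, launch_dir="/tmp"):
    """Hypothetical workaround: anchor relative names at the launch directory."""
    if posixpath.isabs(name):
        return name
    return posixpath.join(launch_dir, name)


after_fix = pattern.search(normalise("printfile.py"))
```

`as_script` is `None` while `as_module` and `after_fix` are match objects, which mirrors why the executed script drops out of the output when only an absolute-path regex is given.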
Released in pprofile 1.9.
I'd love to have the ability to limit the output of pprofile to the files of my own code (I don't mind where libraries spend their time). From a short look at your code, it seems to be almost implemented.