Closed Erotemic closed 11 months ago
Merging #222 (b0a9df4) into main (6d141fc) will decrease coverage by
17.39%
. The diff coverage is56.00%
.
+1 for the explicit import.
The vscode test explorer currently throws an error when I want to use a test with line_profiler, since it does not recognize the profile
decorator. As a consequence, it does not show any python tests, so this would be useful.
LGTM! Could also help with scripting usage.
@ta946 I believe LINE_PROFILER=1
is an environment variable, not a global variable.
@ta946 I believe
LINE_PROFILER=1
is an environment variable, not a global variable.
yup 😅 realized as soon as i pressed send. got me before i deleted the comment
ill change what i was saying to: this would be nice to have another way to enable profiling without editing the commandline.
a usecase would be running a script in pycharm, its annoying having to modify the run config whenever i want to turn profiling on or off. whereas commenting out an enable/disable
function would be alot easier
@ta946, I'm not sure if it's possible to do that while still keeping the lightweight design where the profile decorator is a no-op if you didn't explicitly request profiling. The use-case I want to enable is letting a developer keep a lot of profile
decoratorated functions around without having to manually undecorate them to go back to normal usage. If there was a way to enable/disable on the fly, then that would require wrapper logic around each function to check this condition, and that adds unacceptable overhead to normal runtime.
Another tradeoff with this design is that you get a very small, but still real, import time side effect. The library behaves differently based on the environment it had when you imported it, and there is no easy way to change that after the fact. Unfortunately, I think this is also necessary to keep the more common non-profiling case as fast as possible.
I always run my invocations via copy / pasting something to the command line, so appending --profile
to the end is very easy in this case, but I see how that would be harder for someone using a run button with a pre-configured invocation.
One way you could handle this is by running;
import os
os.environ['LINE_PROFILE'] = '1'
before your first import of line_profiler
. Then you could toggle that in the file itself. Not sure if that is good enough or not.
i think that would work, i would much rather adding those 2 lines of code than having to modify the run config.
looking at your code, could you not do something like
import line_profiler
line_profiler.enable()
@line_profiler.profile
def foo():
pass
auto_profiler.py
def enable():
global profile
profile = LineProfiler()
def disable:
global profile
profile = NoOpProfiler()
i could try testing it, but i assume that as long as you call enable before the decorator is used it shouldnt require any if checks or wrappers
@ta946 I think that suggestion is reasonable. I'll add something like that later today.
For now, I've updated this to rename the filename to "explicit_profiler" to differentiate it from the other auto-profiler work in https://github.com/pyutils/line_profiler/pull/201 (which I plan to look at soon).
I've also made is so if kernprof is being used, the explicit profiler is updated to the one kernprof is using. The new MWE demonstrates this:
# Write demo python script to disk
python -c "if 1:
import textwrap
text = textwrap.dedent(
'''
from line_profiler import profile
@profile
def fib(n):
a, b = 0, 1
while a < n:
a, b = b, a + b
fib(100)
'''
).strip()
with open('demo.py', 'w') as file:
file.write(text)
"
echo "---"
echo "## Base Case: Run without any profiling"
python demo.py
echo "---"
echo "## Option 0: Original Usage"
python -m kernprof -lv demo.py
python -m line_profiler demo.py.lprof
echo "---"
echo "## Option 1: Enable profiler with the command line"
python demo.py --profile
echo "---"
echo "## Option 1: Enable profiler with an environment variable"
LINE_PROFILE=1 python demo.py
And outputs:
---
## Base Case: Run without any profiling
---
## Option 0: Original Usage
Wrote profile results to demo.py.lprof
Timer unit: 1e-06 s
Total time: 4.109e-06 s
File: demo.py
Function: fib at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 @profile
4 def fib(n):
5 1 0.6 0.6 14.4 a, b = 0, 1
6 13 2.0 0.2 49.7 while a < n:
7 12 1.5 0.1 35.9 a, b = b, a + b
Timer unit: 1e-06 s
Total time: 4.109e-06 s
File: demo.py
Function: fib at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 @profile
4 def fib(n):
5 1 0.6 0.6 14.4 a, b = 0, 1
6 13 2.0 0.2 49.7 while a < n:
7 12 1.5 0.1 35.9 a, b = b, a + b
---
## Option 1: Enable profiler with the command line
Timer unit: 1e-09 s
Total time: 4.706e-06 s
File: /home/joncrall/code/line_profiler/demo.py
Function: fib at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 @profile
4 def fib(n):
5 1 663.0 663.0 14.1 a, b = 0, 1
6 13 2211.0 170.1 47.0 while a < n:
7 12 1832.0 152.7 38.9 a, b = b, a + b
0.00 seconds - /home/joncrall/code/line_profiler/demo.py:3 - fib
---
## Option 1: Enable profiler with an environment variable
Timer unit: 1e-09 s
Total time: 3.954e-06 s
File: /home/joncrall/code/line_profiler/demo.py
Function: fib at line 3
Line # Hits Time Per Hit % Time Line Contents
==============================================================
3 @profile
4 def fib(n):
5 1 534.0 534.0 13.5 a, b = 0, 1
6 13 1966.0 151.2 49.7 while a < n:
7 12 1454.0 121.2 36.8 a, b = b, a + b
0.00 seconds - /home/joncrall/code/line_profiler/demo.py:3 - fib
Current things I'm thinking about:
Are there too many different ways to enable profiling? Or is the flexability helpful?
If we add a line_profiler.enable
function, that only works if you run it before you decorate anything, will that be confusing to users (@ta946)?
Should we support having the --profile in sys.argv
check? This will prevent CLI applications that use parse_args
instead of parse_known_args
from using it, and it could conflict with existing CLI arguments. E.g. a user CLI application might have an existing use: --profile <path to some user profile
, and in that case that would also have the side effect of enabling profiling. Of course this means that the developer needs to have a shipped the code with line profiler installed, imported, and used, but it's something to consider. I do like having this option available because when I type a command in bash it's really nice to append some text to the end to enable profiling rather than having to prepend an environment variable at the start. (e.g. just up arrow -> space -> --profile -> enter). But I could live without this feature.
The output files profile_output.txt and profileoutput<timestamp>
.txt are hard coded and text based. Perhaps they should have the output name be configurable via environment variable, global variable? Also perhaps we should dump the lprof file as well? I do like having the text output, but it also seems heavy handed.
I could refactor this even more so none of the global-profiler logic happens until the first @profile
decorator is called. That can help prevent the immediate import-time side effects and give the user a larger window to do more configuration. I think this is probably a good idea.
Current things I'm thinking about:
- Are there too many different ways to enable profiling? Or is the flexability helpful?
I think the flexibility will be very helpful.
- Should we support having the
--profile in sys.argv
check? This will prevent CLI applications that useparse_args
instead ofparse_known_args
from using it, and it could conflict with existing CLI arguments. E.g. a user CLI application might have an existing use:--profile <path to some user profile
, and in that case that would also have the side effect of enabling profiling. Of course this means that the developer needs to have a shipped the code with line profiler installed, imported, and used, but it's something to consider. I do like having this option available because when I type a command in bash it's really nice to append some text to the end to enable profiling rather than having to prepend an environment variable at the start. (e.g. just up arrow -> space -> --profile -> enter). But I could live without this feature.
Maybe name it --line-profile in sys.argv
? Less of a risk of a collision with that, and a script calling line_profiler can set an environment variable trivially with os.environ
.
- The output files profile_output.txt and profileoutput
.txt are hard coded and text based. Perhaps they should have the output name be configurable via environment variable, global variable? Also perhaps we should dump the lprof file as well? I do like having the text output, but it also seems heavy handed.
+1. The names should be as simple yet unique as possible by default (so include the timestamp). Maybe allow the user to pass an argument such as --line-output-path /path/to/file
?
- I could refactor this even more so none of the global-profiler logic happens until the first
@profile
decorator is called. That can help prevent the immediate import-time side effects and give the user a larger window to do more configuration. I think this is probably a good idea.
What if there is no @profile decorator, when the whole script is being profiled? I'm in support of this idea, but I think the details should be clarified before work is put in.
Maybe name it --line-profile in sys.argv?
I think this is better. It aligns with the environment variable, and has a low collision chance. I'll make that change.
Maybe allow the user to pass an argument such as --line-output-path /path/to/file?
I'd like to avoid any complex parsing of the command line, but I could vendor in ubelt's argval function for this purpose, but maybe we just KISS for now.
The names should be as simple yet unique as possible by default (so include the timestamp).
I definately like having two variants of the output: 1 with a timestamp and 1 without. The one without will update as you run, but old outputs will always be available. I think the name "profile_output" is reasonably clear and unique enough. It also is easy to rm a bunch of them with a glob pattern. Of course the user can always avoid all of this by just using the original kernprof workflow where they have far more control. This is supposed to just be a convinience, so I suppose I'm talking myself into keeping the profile_output*
names. Given this I'm also thinking not giving the user the option to change the prefix - keep it simple until someone needs the feature (although maybe I can store the prefix in a global variable, so they can mess with the internal state if they really want to).
What if there is no @Profile decorator, when the whole script is being profiled?
In the context of #201, I think we can force the enable if we know we are in auto-profile mode. The main change will be to not import explicit_profiler
in __init__.py
instead, I'll define a new profile
function which will import explicit_profiler
when it is called and then wrap explicit_profiler.profile
(this has the added benefit of reducing changes needed in kernprof). This would be even easier if we could use module level __getattr__
, but I still intend on supporting Python 3.6 for the time being - it still gets a few hundred downloads / day.
- If we add a
line_profiler.enable
function, that only works if you run it before you decorate anything, will that be confusing to users (@ta946)?
I personally don't see it being any different than using the cli to enable since it'll be off by default. but the rule of thumb is assume the rest of your users wont read the manual so can't give much insight there.
I'd like to avoid any complex parsing of the command line
to add another layer of flexibility/complexity 😂 if you dont want to complicate the cli, you can use a function similar to enable to turn on/off print, save to file, and specify the path. I have a hardcoded path hidden away where the profiler outputs are saved. I personally wouldnt use this feature if it always created files next to my scripts in my repo that i would have to remove every time i profile, but i assume for others it can also be better being right next to what they are profiling
What if there is no @Profile decorator, when the whole script is being profiled?
In the context of #201, I think we can force the enable if we know we are in auto-profile mode.
for 201. the auto profiler creates a new AST and wraps everything in a @profile
decorator. So even though the user isnt adding the decorators, they will be there
So should be fine to lazy-load to get the benefit for no profile run
Ok, I've done a refactor and I think this is a cleaner design - although it does have slightly more overhead, but I think it's negligable. Every time profile
is called there is a is None
check to see if the global profiler has been setup yet, and performs the setup the first time profile
is called.
The refactor removes a lot of the global state and now defines a class GlobalProfiler
(in fact; a user could theoretically make multiple instances of this class, which helps avoid singleton problems). On module import the only thing that happens is we create an instance of it called profile
. It has a __call__
method which forwards the decorated function to the real LineProfiler object if one was created.
This design gives the user a lot more flexibility. Here is an example that demonstrates some of this:
from line_profiler import profile
@profile
def func1():
return list(range(100))
profile.enable(output_prefix='custom')
@profile
def func2():
return tuple(range(100))
profile.disable()
@profile
def func3():
return set(range(100))
profile.enable()
@profile
def func4():
return dict(zip(range(100), range(100)))
print(type(func1()))
print(type(func2()))
print(type(func3()))
print(type(func4()))
Call this file demo.py
. If we just execute it with python demo.py
we will get output for func2, and func4. The global profiler is disabled by default, so func1
is wrapped. The user explicitly enables the profiler, so func2
is wrapped. Then they explicitly disable the profile so func3
is not wrapped. Lastly they re-enable it, so func4
is wrapped. The output is written to "custom.txt", "custom.lprof" and "custom_[timestamp].txt" because they overwrote the prefix when calling enable.
If you instead execute this like python demo.py --line-profile
you get 3 function outputs, because when profile
is called for func1, the global profiler sees that nobody has tried to initialize it yet, so it looks at the CLI and enables itself. The explicit user enable just keeps it enabled, but it does have the effect of updating where the output will be written. Because the final dump function is just called at interpreter exit, whatever the state of the profile.output_prefix
variable is at the time is where the output will be written.
Any number of other configurations could be added here. The user could also opt to never explicitly enable the profiler in code (as such just rely on environment variables / CLI flags) but still have control over the config just by setting profile attributes. E.g.
from line_profiler import profile
profile.output_prefix = '/tmp/profile_output`
really like the flexibility of enabling and disabling profiling for specific functions this way
though i am worried about confusion from different results if you also use the cli flag. you would have to make it clear that using the cli would be like calling enable
at the start of the code
Something else I added to this PR - which perhaps I should have done in a different PR, but it's optional, and a clear improvement - is add rich support to show_text
. By default rich is False, but if you set rich=1
, then it tries to import it and fallsback to rich=0 if that fails, otherwise you get pretty ANSI colored output like this:
EDIT: I did move this to a different PR #225
@Theelx @Mogost @rkern and others: Looking for feedback on a proposed new feature.
High Level Change
With this new change users will be able to:
from line_profiler import profile
and then run withLINE_PROFILE=1
and line-profiler will dump profile information to stdout and disk at the end of execution.Modivation
When it comes to imports, I prefer explicit ones. I think the current primary way to use line profiler is confusing and I think it would be an enhancment if there was an alternative way to use line profiler where you just import it and use it.
I've been doing this for years in my xdev module: https://github.com/Erotemic/xdev/blob/main/xdev/profiler.py
The above code is overcomplicated, but the resulting feature is something I use often. I've ported a much simpler version to this PR, and I'm looking for comments about if this feature should be added to line-profiler or not.
This doesn't break any existing workflow (as far as I can tell - it does use a tiny bit of global state), so this isn't intended as a replacement for the old way of doing thigs. But I do think this is a simpler way way for people to get started with line-profiler.
Demonstration
With this change the "getting started" usage would be:
LINE_PROFILER=1
or'--profile' in sys.argv
, then it enables profiling and at the end of your script it will outputMinimal working example: