plasma-umass / scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Apache License 2.0

Enable (or document) a Python API #240

Open nixjdm opened 3 years ago

nixjdm commented 3 years ago

Is your feature request related to a problem? Please describe. I do not know how to import scalene to profile a callable. I would like to do this with several of the CLI options supported. This seems unsupported, or at least undocumented. If this were possible, many things could be more easily done such as:

I'm sure more could be done, but those three are personal desires of mine, and an exposed Python API seems to be at the heart of the matter for each of them.

Describe the solution you'd like Here are some examples of things I'd find useful

import scalene

# op_args / op_kwargs are for foo to ingest
foo_profiled = scalene.profile(foo, op_args, op_kwargs, profile_all=True)
foo_profiled.write_html(output='file.html', reduced=True)

print(foo_profiled.results)   # dumps to stdout
print(foo_profiled.reduced_results)   # dumps to stdout

ret = foo_profiled.ret   # what did foo return?
error = foo_profiled.error  # did foo throw an exception?
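
To make the shape of the requested result object concrete, here is a minimal sketch that uses the standard library's cProfile as a stand-in. It is not Scalene and not an existing Scalene API: the names `ProfiledCall` and `profile` are hypothetical, and only the `ret`, `error`, and `results` attributes are taken from the wish list above.

```python
import cProfile
import io
import pstats
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ProfiledCall:
    """Hypothetical result object mirroring the attributes requested above."""
    ret: Any                         # value returned by the callable, if it returned
    error: Optional[BaseException]   # exception raised by the callable, if any
    results: str                     # text report (here: cProfile output, not Scalene's)


def profile(func: Callable[..., Any], op_args=(), op_kwargs=None) -> ProfiledCall:
    """Run func under cProfile (a stand-in for Scalene) and capture the outcome."""
    profiler = cProfile.Profile()
    ret, error = None, None
    try:
        # runcall enables the profiler, calls func, and disables it again.
        ret = profiler.runcall(func, *op_args, **(op_kwargs or {}))
    except Exception as exc:
        # Record the exception instead of propagating it, as the wish list suggests.
        error = exc
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(10)
    return ProfiledCall(ret=ret, error=error, results=buffer.getvalue())
```

A call such as `profile(foo, (1, 2), {"flag": True})` would then expose `.ret`, `.error`, and `.results` without any CLI involvement, which is the property the request is after.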

Describe alternatives you've considered Is some of this already possible?

emeryberger commented 2 years ago

A bit late to the party, but I wanted to let you know that we have now exposed a simple API for managing profiling programmatically: after from scalene import scalene_profiler, call scalene_profiler.start() to start profiling and scalene_profiler.stop() to stop it. Since profiling is on by default, you will probably want to launch Scalene with the --off command-line option. This is now documented in README.md. I know this is not everything on your wish list, but it points in the right direction :).
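
A minimal sketch of how the start/stop calls described above might be wrapped so that stop() always runs, even if the profiled code raises. The helper names `profiled_region` and `scalene_region` are hypothetical. The Scalene import is deferred so the module loads where Scalene is not installed, and the calls are only expected to be meaningful when the program was launched through the Scalene CLI (for example with --off).

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def profiled_region(start: Callable[[], None], stop: Callable[[], None]) -> Iterator[None]:
    """Run the enclosed block between start() and stop(); stop() runs even on error."""
    start()
    try:
        yield
    finally:
        stop()


def scalene_region():
    """Return a profiled_region bound to Scalene's start/stop (hypothetical helper).

    Assumes the process was launched via the Scalene CLI; the import is done
    lazily so that merely defining this function does not require Scalene.
    """
    from scalene import scalene_profiler
    return profiled_region(scalene_profiler.start, scalene_profiler.stop)


# Intended use, assuming a launch such as: scalene --off my_script.py
#
#     with scalene_region():
#         expensive_function()
```

Passing start and stop in as plain callables keeps the wrapper independent of Scalene itself, so it can be exercised with stand-in functions.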

nixjdm commented 2 years ago

Thank you, this does offer a novel feature set, but it unfortunately doesn't really cover the core of the use cases I'm imagining and would use. The main issue is that this still requires the use of the Scalene CLI. That means I can't start/stop through just a Python API (i.e., Python callables).

I'd need that if I were to profile anything kicked off through a Python job scheduler or runner, like Airflow, Dask, Celery, pytest, etc., where all that's often provided to the scheduler is a callable. I'd need that callable to invoke Scalene all on its own. It would also hopefully enable me to profile something from a standard Python REPL. I don't think I can do all of that currently (at least, in a non-hacky* way) if I need to invoke the Scalene CLI.

*I could for instance have a Python callable drop to a CLI, to then run my real, intended callable, as a hacky work-around, but there are quite a few problems with that.
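
For illustration, the workaround described in this footnote might look roughly like the sketch below. The helper names are hypothetical, and the --html and --outfile flags are assumed to match Scalene's CLI help at the time of this thread; check them against the installed version.

```python
import subprocess
import sys
from typing import List, Optional


def build_scalene_command(script: str, outfile: Optional[str] = None) -> List[str]:
    """Build an argv that runs `script` under Scalene in a child interpreter."""
    cmd = [sys.executable, "-m", "scalene"]
    if outfile is not None:
        # Assumption: --html and --outfile exist in the installed Scalene version.
        cmd += ["--html", "--outfile", outfile]
    cmd.append(script)
    return cmd


def run_under_scalene(script: str, outfile: Optional[str] = None) -> int:
    """Run the script under Scalene in a subprocess and return its exit code."""
    return subprocess.run(build_scalene_command(script, outfile)).returncode
```

One of the problems is visible in the signature: the child process receives a script path rather than a callable, so any in-memory arguments or state from the parent would have to be serialized and passed across the process boundary.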

Still, thanks for your effort on Scalene!

adamchainz commented 2 years ago

I would also like to see a Python API for use in a django-debug-toolbar plugin. This would allow users to check a box, and then scalene would be enabled per-request until they unchecked the box, with no need to stop/start Django or switch commands.

emeryberger commented 2 years ago

Marking as a feature request. Please upvote if you are interested in this feature (with a "thumbs-up" to this comment); if you have more use cases, we'd be interested in hearing them.

remidebette commented 2 years ago

Hi,

This feature would indeed be very interesting. With the generalisation of containerisation / PaaS / FaaS, developers' code is sometimes run on platforms where they do not fully control its execution and only Python entrypoints are available (AWS Lambda, for example).

In that case it would be beneficial to have a programmatic way of running scalene, at least to dump an HTML profile file.

I would add that a tool such as scalene could shine precisely in performance investigations on those platforms, where we have the least control over how our code is built and deployed. Otherwise, developers fall back on adding timers to their code and waiting for slow builds and runs before getting any feedback. With the low performance impact of scalene, we could systematise its use during development or in tests.

dustMason commented 1 year ago

To diagnose slow endpoints on our Django server, we created a middleware that captures profiles of everything that happens while servicing certain routes. It's controlled with a feature flagging tool so we can enable/disable routes in real time. The resulting profiles are uploaded to blob storage (S3) for later retrieval and analysis. We would love to use this Python API to control scalene in such a middleware.
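
A rough sketch of what such a flag-controlled middleware could look like if a Python-only start/stop API were available. Everything here is hypothetical: `profiling_middleware`, `is_enabled`, `start`, and `stop` are illustrative names, and the profiler hooks are passed in as plain callables rather than taken from Scalene. Django's own middleware factories take only `get_response`, so the extra arguments would need to be bound beforehand (for example with functools.partial).

```python
from typing import Any, Callable


def profiling_middleware(
    get_response: Callable[[Any], Any],
    is_enabled: Callable[[Any], bool],
    start: Callable[[], None],
    stop: Callable[[], None],
) -> Callable[[Any], Any]:
    """Wrap a request handler so that flagged requests run between start() and stop()."""

    def middleware(request: Any) -> Any:
        if not is_enabled(request):
            # Flag is off for this request: handle it without profiling.
            return get_response(request)
        start()
        try:
            return get_response(request)
        finally:
            # Always stop, even if the view raised.
            stop()

    return middleware
```

If the start/stop hooks toggle process-wide state, requests handled concurrently in the same process would end up in the same profile, so per-request attribution would need extra care. Uploading the resulting profile to blob storage would happen after stop() and is left out here.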

andyliucode commented 1 year ago

We have containerized Python jobs that run in Kubernetes. Our platform has Python-only entrypoints. Until a Python-only API for scalene exists, sadly we cannot use it to profile our code.