Open charliermarsh opened 2 years ago
This would mean that we could do something like the following, right?
import ruff
errors = ruff.check_files(list_of_paths)
...
Thanks!
Yup, that's right!
@messense - I wasn't certain on this last time -- if we bundle a Python API with Ruff, will we need to build separate wheels for every Python version?
If you can use abi3 features, you need only one wheel per platform; otherwise you need to build separate wheels for every Python version.
Awesome thank you. I think we should be able to do that, so maybe this will be really straightforward.
Hello there, do you happen to have a rough timeline for when (if?) this is going to happen? I'm looking to integrate ruff into a tool I'm developing, which would require an API of some sort. It would be very helpful to know if this is something I can wait on, or look for another solution / workaround!
@provinzkraut - It's definitely going to happen! I could probably ship it within the next week or so. I'd just been punting on it until I had more people asking for it.
Could I hear a bit more about your use-case, if you don't mind sharing?
@charliermarsh That's good to hear!
Could I hear a bit more about your use-case, if you don't mind sharing?
Sure. I'm working on a markdown extension to automatically generate pymdown tabs for different Python versions from a source version, i.e. generate 3.7, 3.8, 3.10 tabs from a 3.7 source (repo).
Currently I'm using pyupgrade to generate the versions and autoflake to clean imports that have become superfluous. Especially autoflake is quite slow, making up a majority of the extension's runtime. Since ruff is way faster at this, I'd like to use it (also one less dependency). I fiddled around with using the CLI version, but that's messy and a performance degradation.
@provinzkraut - Ok, cool. Let me see what I can do. I don't know if you're comfortable reading Rust, but would the current Rust public API suit your use-case, were it callable from Python with Python objects etc.?
In short: it takes a file path (to find the pyproject.toml), the raw Python source code, and an autofix setting, and returns a list of checks (which themselves include the raw fixes / patches).
I'm guessing that for your use-case, what you actually want is a function that takes source code (plus settings, to enable a list of checks) and returns fixed source code?
but would the current Rust public API suit your use-case, were it callable from Python with Python objects etc.?
I looked at this yesterday because I thought that maybe it could be as simple as adding a tiny wrapper around the Rust lib myself, but it seems to be a bit more involved. The current API doesn't really lend itself that well to my use case.
I'm guessing that for your use-case, what you actually want is a function that takes source code (plus settings, to enable a list of checks) and returns fixed source code?
That would be ideal, yes. Dealing with a list of checks and extracting what I need from it also wouldn't be that big of an issue, but passing in configuration directly and omitting the config file is crucial, both for the needed configurability (I need to run the fixers with varying configuration for every invocation) and performance (I'm running the fixers many times on small snippets, which means the overhead of looking for and parsing a pyproject.toml every time adds up).
I'm working on this now.
I had a need to execute Ruff as an Alembic post write hook. I came up with a very ham-fisted approach that I found from the distributed __main__.py.
alembic.ini:
[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples
# format using "black" - use the console_scripts runner, against the "black" entrypoint
hooks = ruff, black
ruff.type = ruff
black.type = console_scripts
black.entrypoint = black
and then Alembic's env.py:
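import os
import sysconfig

from alembic.script import write_hooks


@write_hooks.register("ruff")
def run_ruff(filename, options):
    # Locate the ruff executable in the current environment's scripts directory.
    ruff = os.path.join(sysconfig.get_path("scripts"), "ruff")
    # Run ruff on the new revision file and wait for it to finish.
    os.spawnv(os.P_WAIT, ruff, [ruff, filename, "--fix", "--exit-zero"])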
👍 Yup that should be safe to do! (The downside being that you have to go through the CLI rather than calling a function directly. Hoping to enable that soon but not working on it right now.)
Is the plan here to make a Python library that links to ruff directly? I want something that I can use in an interactive Python REPL to check for errors as the user types stuff, and shelling out to a subprocess on each character typed doesn't sound like a good idea (especially if I also have to write out the code to a tempfile or heredoc).
If you're curious, here's what I'm currently using with pyflakes https://github.com/asmeurer/mypython/blob/a836d0956a6443f7a85a032dc625ff3da1479a91/mypython/processors.py#L196. The code is complicated in part because pyflakes doesn't handle syntax errors very well, so I have to parse them separately. I haven't checked if ruff handles them better. There are lots of opportunities to improve over pyflakes' barebones Python API.
The main thing I would want from a Ruff Python API is a function that takes a string of Python code and returns a list of errors with line number, start and stop column numbers (where relevant), and the error message. Being able to get corresponding fixes would be nice too, I guess. The best API I can think of for a "fix" would be to return the whole block of code with the specific warning fixed, along with a new line and column number corresponding to the line and column of the original warning (so that I can interactively keep the cursor in the "same" location).
I'm happy to discuss API ideas more in depth or test out any prototypes if you're interested.
That sounds cool!
If you're curious, here's what I'm currently using with pyflakes asmeurer/mypython@a836d09/mypython/processors.py#L196. The code is complicated in part because pyflakes doesn't handle syntax errors very well, so I have to parse them separately. I haven't checked if ruff handles them better. There are lots of opportunities to improve over pyflakes' barebones Python API.
Ruff creates a diagnostic for files with syntax errors. Adopting a more error-resilient parser is something that we are considering.
The main thing I would want from a Ruff Python API is a function that takes a string of Python code and returns a list of errors with line number, start and stop column numbers (where relevant), and the error message.
That sounds reasonable, but we aren't there yet (your best shot is to call into the CLI). One of the biggest problems of exposing a linter API right now is that Ruff writes one-off warnings to stdout and relies on the global state to track whether to write the warning. Cleaning this up probably requires a larger refactoring around the diagnostic system... so that may take a while.
For what it's worth, we power ruff-lsp and the VS Code extension over subprocess, and the CLI actually supports enough behavior to power the operations needed there. For example, you can use --format json to get a structured list of violations and their fixes. Similarly, if you pass input via stdin, and run with --fix, we print the "fixed" output to stdout.
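A minimal sketch of the subprocess approach described above, written in Python. The --fix and --exit-zero flags and the use of stdin/stdout come from this thread. Everything else is an assumption to verify against your installed Ruff version: the check subcommand, the "-" argument meaning "read from stdin", the --output-format spelling (the comment above says --format json; newer releases appear to have renamed the flag, so it is a parameter here), and the JSON field names. All function and class names are hypothetical.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Violation:
    code: Optional[str]  # rule code such as "F401"; may be missing for syntax errors
    message: str
    row: int
    column: int


def build_check_command(format_flag: str = "--output-format", fix: bool = False) -> List[str]:
    """Build the argv for linting (or fixing) source code passed on stdin."""
    cmd = ["ruff", "check", "--exit-zero"]
    if fix:
        # With --fix and stdin input, the fixed source is printed to stdout.
        cmd.append("--fix")
    else:
        # Ask for a structured list of violations instead of human-readable text.
        cmd += [format_flag, "json"]
    cmd.append("-")  # assumption: "-" tells the CLI to read from stdin
    return cmd


def parse_violations(json_text: str) -> List[Violation]:
    """Convert the JSON report into Violation objects (field names are assumed)."""
    return [
        Violation(
            code=item.get("code"),
            message=item["message"],
            row=item["location"]["row"],
            column=item["location"]["column"],
        )
        for item in json.loads(json_text)
    ]


def check_source(source: str, format_flag: str = "--output-format") -> List[Violation]:
    """Lint a string of Python code by piping it through the CLI."""
    proc = subprocess.run(
        build_check_command(format_flag),
        input=source, capture_output=True, text=True, check=True,
    )
    return parse_violations(proc.stdout)


def fix_source(source: str) -> str:
    """Return the source with available fixes applied."""
    proc = subprocess.run(
        build_check_command(fix=True),
        input=source, capture_output=True, text=True, check=True,
    )
    return proc.stdout
```

Keeping command construction and JSON parsing separate from the subprocess call makes both easy to test without Ruff installed, and confines any flag rename to one place.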
For what it's worth, we power ruff-lsp and the VS Code extension over subprocess, and the CLI actually supports enough behavior to power the operations needed there
How do you feel about adding a Python module that wraps this up in a convenient API?
I've been using the solution you suggested in a few of my tools now, and not having to implement that boilerplate every time would certainly be nice.
If that sounds good to you, I'd be happy to contribute.
I'm keen to replace autoflake+isort with ruff in my shed all-in-one autoformatter - the subprocess trick works pretty well, except that if there's any way to change the isort settings in ruff without a config file I can't see it - and running isolated from any config is pretty important in this use-case. Any suggestions, or do I just need to wait for the library interface in this issue?
Adding a data point: in mkdocstrings-python we format function signatures with Black if it is installed. We would like to support Ruff too, but spawning a subprocess for each signature is very costly, so we would greatly appreciate a Python binding that doesn't use subprocesses 🙂 A wrapper that hides the subprocess calls sounds nice, but won't be enough for our use-case.
@pawamoy that sounds neat. We plan to integrate our LSP into ruff (implemented in Rust). I know, it's not as convenient as a Python API, but it would allow you to format files without spawning a process for every signature (although it might still be very costly because it requires multiple LSP calls to format a single code snippet).
By calls do you mean network calls? Or could we somehow spawn the LSP server locally (like a daemon)?
You would spawn the LSP like a daemon and communicate over stdin/stdout.
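To make "communicate over stdin/stdout" concrete: the Language Server Protocol frames each JSON-RPC message with a Content-Length header followed by a blank line and the JSON body. The sketch below shows only that framing in Python. It is not Ruff-specific, the helper names are hypothetical, and which requests the Ruff server would accept for formatting is not covered here.

```python
import io
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload the way the LSP base protocol expects."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(stream) -> dict:
    """Read one framed message from a binary stream (e.g. the server's stdout)."""
    length = None
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):  # blank line ends the headers; b"" means EOF
            break
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(length).decode("utf-8"))


# Round trip through an in-memory stream standing in for the daemon's pipes.
request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
assert decode_message(io.BytesIO(encode_message(request))) == request
```

With a long-lived server process, the same two helpers would be applied to the process's stdin and stdout pipes, so the startup cost is paid once rather than per snippet.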
Ah, interesting. Then yeah, that's already much better than subprocesses 🙂 Thanks for the info!
Adding a data point: in mkdocstrings-python we format function signatures with Black if it is installed. We would like to support Ruff too, but spawning a subprocess for each signature is very costly, so we would greatly appreciate a Python binding that doesn't use subprocesses 🙂 A wrapper that hides the subprocess calls sounds nice, but won't be enough for our use-case.
I put together an experimental package that uses PyO3 to wrap the Ruff formatter in a Python API that doesn't require any subprocesses. I'd still consider it alpha at best (there's only one callable function), but maybe it could be helpful to others as well?
Amazing, thanks for sharing! I'll check it out :)
@charliermarsh just checking in - is there any way to configure the isort settings in --isolated mode, or do I just have to wait? No worries if so, I'm just looking forward to replacing black too...
@Zac-HD, yes, there is! We recently extended the --config flag so that arbitrary configuration options can be overridden via the command line using "inline TOML": https://docs.astral.sh/ruff/configuration/#the-config-cli-flag. So to override the isort extra-standard-library setting in --isolated mode (for example), you'd do something like ruff check path/to/file.py --config "lint.isort.extra-standard-library = ['path']".
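For tools that need to pass settings per invocation without any config file, the command above can be assembled programmatically. This is a rough Python sketch: the helper name is hypothetical, it assumes --config may be given more than once, and it renders values with json.dumps, which yields valid TOML only for simple strings, booleans, integers, and lists of those.

```python
import json
from typing import Any, Dict, List


def isolated_check_command(path: str, overrides: Dict[str, Any]) -> List[str]:
    """Build a `ruff check --isolated` argv with inline-TOML overrides."""
    cmd = ["ruff", "check", "--isolated"]
    for key, value in overrides.items():
        # Each override is a single "key = value" TOML pair, as in the example above.
        cmd += ["--config", f"{key} = {json.dumps(value)}"]
    cmd.append(path)
    return cmd


cmd = isolated_check_command(
    "path/to/file.py",
    {"lint.isort.extra-standard-library": ["path"]},
)
# Pass `cmd` to subprocess.run(...) to execute it; as a list, no shell quoting is needed.
```

Building the argv as a list avoids the nested shell quoting that the inline TOML would otherwise require.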
Adding another data point:
In edvart, we are currently using isort to sort imports in Python code which is being dynamically generated.
With a Python API, we could fully switch to ruff. For now, we are using ruff to format the source code, but keeping isort to format the generated code.
Another data point: it would make it easier to replace programmatic calls to black, like in mdformat-black: https://github.com/hukkin/mdformat-black/blob/master/mdformat_black/__init__.py
import black

def format_python(unformatted: str, _info_str: str) -> str:
    return black.format_str(unformatted, mode=black.Mode())
I’d want to use an API like black.format_str over in blacken-docs, where Ruff support is tracked in this issue: https://github.com/adamchainz/blacken-docs/issues/352
Not the most elegant solution and I haven't tried it myself, but it should soon be possible to call the ruff WASM API from Python:
Considering that we have a WASM API now, I'm open to reconsidering a PyO3 API. Let me discuss this internally.
Note: The API would not fall under any semver guarantees. We expect a major breaking change once we introduce multifile analysis. Practically, the API hasn't changed in months.
I would be open to exposing a Ruff PyO3 API:
ruff_wasm
I'm happy to support if anyone's interested in contributing the API to ruff.
Hey, just to add another data-point: we at Vizro would also love to be able to invoke ruff from within Python without subprocess. Something like black.format_str indeed :)
Same here -- would love to replace black.format_str in automatically formatting jupyter cells with https://github.com/n8henrie/jupyter-black/ !
@maxschulz-COL @n8henrie @adamchainz I would like to remind folks that https://github.com/amyreese/ruff-api has a working, simple API wrapping both the formatter and import sorter from Ruff, just a pip install ruff-api away. We have been using it to successfully migrate from Black in our monorepo while still maintaining our existing integrations/tooling written in Python. :)
That looks great, but it is also documented as "highly experimental", so people may be reluctant to add that to their tool chains. Why don't you contribute that to the ruff project?
Agreed -- it would be great to have this under ruff's umbrella!
See: #593