Open dougthor42 opened 1 week ago
ah, dang:
I just tried this locally and got the same issue. @hunshcn do you have any additional insight for this?
I think type parameter support was added to the tree-sitter grammer in https://github.com/tree-sitter/tree-sitter-python/commit/71977ea08cd38f28f00edeb3f46b2e881f9c58e0, and should be included in grammer versions 0.20.4 and later. It looks like this was first included in go-tree-sitter in https://github.com/smacker/go-tree-sitter/commit/8516e806aa967773ff7d77f79985741a58462c8e.
This is looking more and more like an issue with tree-sitter
, but I may have to update Gazelle to fallback to the old python-based parser because of limitations elsewhere.
In our code base, I narrowed things down to a single line diff:
# get_deps.py
# using this line is fine - tree-sitter correctly parses the module
def search_one_more_level(graph: dict[T, set[T]], seen: set[T], routes: list[list[T]], target: T) -> list[T] | None:
# using this line cause tree-sitter to fail parsing - see below
def search_one_more_level[T](graph: dict[T, set[T]], seen: set[T], routes: list[list[T]], target: T) -> list[T] | None:
The only difference between the two lines is the addition of [T]
after the function name.
I added some debugging prints to gazelle/python/file_parser.go
:
--- a/gazelle/python/file_parser.go
+++ b/gazelle/python/file_parser.go
@@ -17,6 +17,7 @@ package python
import (
"context"
"fmt"
+ "log"
"os"
"path/filepath"
"strings"
@@ -184,6 +185,10 @@ func (p *FileParser) Parse(ctx context.Context) (*ParserOutput, error) {
if err != nil {
return nil, err
}
+ if strings.Contains(p.relFilepath, "get_deps.py") {
+ log.Printf("after parsing file, got rootNode=%q", rootNode)
+ }
p.output.HasMain = p.parseMain(ctx, rootNode)
And ran gazelle.
# the "good" code logs:
gazelle: after parsing file, got rootNode="(module (comment) (comment) (comment) (expression_statement (string)) (import_statement name: (dotted_name (identifier))) ...
# the "bad" code logs:
gazelle: after parsing file, got rootNode="(ERROR (comment) (comment) (comment) (expression_statement (string)) (import_statement name: (dotted_name (identifier))) ...
So something is up with tree-sitter
where it errors out but doesn't return an err
. The net result of this is that the subsequent call to FileParser.parseMain
fails and the file is incorrectly labeled as a library (instead of a binary).
I think type parameter support was added to the tree-sitter grammer in https://github.com/tree-sitter/tree-sitter-python/commit/71977ea08cd38f28f00edeb3f46b2e881f9c58e0, and should be included in grammer versions 0.20.4 and later. It looks like this was first included in go-tree-sitter in https://github.com/smacker/go-tree-sitter/commit/8516e806aa967773ff7d77f79985741a58462c8e.
@maffoo yes I agree; the 3.12 grammar was added to tree-sitter a while ago. rules_python is currently pinning to the version just before go-tree-sitter
updated the python grammar. rules_python pins 0628b34
and the python grammar was updated in the next commit 8516e80
just like you said.
I tried forking go-tree-sitter and hacking things to use the old scanner.cc
, which doesn't reference ../array.h
, but that naive approach sadly did not work.
As I understand it we are using the standard rules_go
rules to import the go-tree-sitter repository: https://github.com/bazelbuild/rules_python/blob/155efce562f14d46530fb5bec698a11e2ee889f5/gazelle/deps.bzl#L188-L193
And then we use two targets which are created automatically by this import: https://github.com/bazelbuild/rules_python/blob/155efce562f14d46530fb5bec698a11e2ee889f5/gazelle/python/BUILD.bazel#L45-L46
Is it possible to define a more custom way to build that instead of relying on targets generated automatically by rules_go
, since clearly rules_go
is not able to build this package by default? I guess rules_go
does more than just invoking go build
under the hood, but maybe there's a way to customize it a bit, e.g. by telling it that the @com_github_smacker_go_tree_sitter//python
target should include some files from the parent directory.
Actually, looks like the go_repository
rules comes from gazelle itself, not from rules_go
: https://github.com/bazel-contrib/bazel-gazelle/blob/master/repository.md#go_repository. There are a lot of options there, so maybe there's a way to customize it to work with the newer go-tree-sitter.
I checked out the go-tree-sitter
repo and ran gazelle to generate build files as described in the rules_go
docs. To get //python
to build, I added a filegroup in the root BUILD.bazel
:
filegroup(
name = "common",
srcs = ["alloc.c", "alloc.h", "api.h", "array.h"],
visibility = [":__subpackages__"],
)
and then added a reference to this in the go_library
in python/BUILD.bazel
:
go_library(
name = "python",
srcs = [
"//:common",
"binding.go",
...
This was enough to get //python
to build. Given that, it seems like one way to get things working with the new go-tree-sitter would be to add these changes as patches when we bring in go-tree-sitter
as a go_repository
.
Thanks Matthew! My next attempt was going to be to just move/copy those C flies into the python
dir and tweak headers accordingly (also via a patch).
TL;DR: I think we need to bump go-tree-sitter for python 3.12 support.
🐞 bug report
Affected Rule
Gazelle
Is this a regression?
I could argue it either way, lol. I'm going to go with "probably" because it's related to #1895.
Description
PEP 695 added a Type Parameter syntax that looks like:
If a python file uses this new
def func[T](...)
syntax, and the file hasif __name__ == "__main__"
, then Gazelle will incorrectly generate apy_library
target rather thanpy_binary
.🔬 Minimal Reproduction
bazel run //:gazelle
Expected Result
Actual Result
🔥 Exception or Error
None
🌍 Your Environment
Operating System:
gLinux
Output of
bazel version
:Rules_python version:
Anything else relevant?
I haven't been able to find a human-readable changelog of
tree-sitter
orgo-tree-sitter
that explicitly says when it added support for python 3.12...