bazel-contrib / bazel-gazelle

Gazelle is a Bazel build file generator for Bazel projects. It natively supports Go and protobuf, and it may be extended to support new languages and custom rule sets.
Apache License 2.0
1.19k stars 379 forks source link

Directive resolve_regexp returns invalid labels #1965

Open alandonham opened 1 day ago

alandonham commented 1 day ago

What version of gazelle are you using?

master

What version of rules_go are you using?

0.50.1

What version of Bazel are you using?

7.4.0

Does this issue reproduce with the latest releases of all the above?

Yes

What operating system and processor architecture are you using?

MacOS Arm64

What did you do?

Declare a resolve_regexp directive. # gazelle:resolve_regexp py foo\. //x/y/z

Create a Python file with the following import: from foo.bar import baz

What did you expect to see?

deps should resolve to //x/y/z when running Gazelle

What did you see instead?

deps resolves to //x/y/zbar when running Gazelle

This pull request introduces a change that attempts to create a label from the import statement using regex. This is problematic, because if the regex does not match on the entire import statement, it will only replace pieces of the import statement, resulting in an incorrect label that still passes the label.Parse() check and is then written to the BUILD file.

For example, the import statement passed to FindRuleWithOverride() and then o.resolveRegexDep() in the above example is the string foo.bar.

The following is then called inside of resolveRegexDep(): o.ImpRegex.ReplaceAllString("foo.bar", "//x/y/z")

According to the docs:

// ReplaceAllString returns a copy of src, replacing matches of the [Regexp] // with the replacement string repl. // Inside repl, $ signs are interpreted as in [Regexp.Expand].

Thus, we end up with the label //x/y/zbar.

Other regex passed into the same directive has other corner cases that can be hit, and result in various invalid labels.

Creating the label by running ReplaceAllString() appears to contain many corner cases where regex that has been successfully utilized in the past now results in invalid labels.

fmeum commented 1 day ago

It should work with a regex like this: https://github.com/bazel-contrib/bazel-gazelle/blob/2a84bad644e0657346ad75704c2a4835080a3cdf/resolve/resolve_test.go#L30

We could automatically wrap the regex in ^...$ if it isn't already.

fmeum commented 1 day ago

Cc @lkassar-stripe

alandonham commented 1 day ago

It should work with a regex like this:

https://github.com/bazel-contrib/bazel-gazelle/blob/2a84bad644e0657346ad75704c2a4835080a3cdf/resolve/resolve_test.go#L30

We could automatically wrap the regex in ^...$ if it isn't already.

My big concern here is that ReplaceAllString() usage to generate labels is going to be challenging to ensure it's correct. It introduces a way for users to generate invalid and malformed labels without an obvious way to control what is generated. Users will end up needing to reverse engineer what is happening to their labels, as I did, and attempt to manipulate the regex and label string so that it generates a valid label. I think this will make the directive itself much more challenging to use.

Also, changing the behavior of resolve_regexp in a non-backwards compatible way is not ideal. Perhaps it would make more sense to add an additional parameter to enable this regex string replacement or put it behind its own directive?