bazelbuild / bazel-skylib

Common useful functions and rules for Bazel
https://bazel.build/
Apache License 2.0
376 stars 179 forks source link

Add rule for selecting files by extension #507

Closed kellyma2 closed 2 weeks ago

kellyma2 commented 2 months ago

This adds a simple rule for selecting files by extension. Usage is of the form:

    foo(
        name = "my_rule_with_output",
        ...
    )

    select_files_by_extension(
        name = "my_selected_cpp_files",
        srcs = ":my_rule_with_output",
        extensions = [".cpp"],
    )
kellyma2 commented 2 months ago

@brandjon or @tetromino is it possible to get a review on this?

tetromino commented 2 months ago

First, it's not clear to me that this rule will be widely useful. Normally, you'd select a subset of output files by using a provider, not by filtering by extensions. What are situations in which you would want to use this rule, how often do such situations happen, and what are the alternatives? Why is it not possible to add another provider to foo instead?

Are there existing examples of rules equivalent to this one implemented independently in various places?

Second, it's not clear to me that this is the right abstraction. For example, why do you think it's best to filter by extension only, and not by filename?

Please see https://github.com/bazelbuild/bazel-skylib#writing-a-new-module

kellyma2 commented 2 months ago

Replies inline

First, it's not clear to me that this rule will be widely useful. Normally, you'd select a subset of output files by using a provider, not by filtering by extensions. What are situations in which you would want to use this rule, how often do such situations happen, and what are the alternatives? Why is it not possible to add another provider to foo instead?

The main scenario I'm encountering this thusfar involves code generation and RPM packaging via rules_pkg. Specifically, it looks something like:

foo_codegen(
   ...
) 

Where the outputs from this are .cpp and .h files that on their own get fed into a cc_library() rule, which filters the headers and source files sufficiently to work as desired. Subsequently when packaging things into RPMs we are generating Bar-devel RPM that contains only headers. In my understanding of how providers work, we're not going to be able to teach rules_pkg about our custom header-only provider so we need some sort of filtering to happen in the middle. Please feel free to disabuse me of this idea if an alternative is possible. :)

Are there existing examples of rules equivalent to this one implemented independently in various places?

No idea: I was looking at the existing select_file rule and it seemed sufficiently congruent as to be a worthwhile adjacent option. ie: if folks found select_file useful, this probably is too.

Second, it's not clear to me that this is the right abstraction. For example, why do you think it's best to filter by extension only, and not by filename?

File extension is a proxy for file type in many (most?) cases. I could see an argument for filename, but this is more or less what select_file already does, no?

comius commented 2 weeks ago

Subsequently when packaging things into RPMs we are generating Bar-devel RPM that contains only headers. In my understanding of how providers work, we're not going to be able to teach rules_pkg about our custom header-only provider so we need some sort of filtering to happen in the middle. Please feel free to disabuse me of this idea if an alternative is possible. :)

Usually projects that need to pack something together create filegroup rules with globs to select the right files. Globs should already cover all the filtering and file selection needs. I don't think you can even get source files from C++ rules. I would consider this a more "standard" approach.

I'm ruling this rule unneeded. Skylib wants to collect things that are not doable in other ways.

Closing the PR.