bzl-gen-build is a polyglot, modular BUILD generator for Bazel, implemented in Rust. It is designed so that it is easy to pick and choose components, modify the intermediate state (all JSON), and post-process or reuse the outputs for new purposes as it makes sense.
bzl-gen-build works by running a series of phases for each file type: extracting import statements and bzl-gen-build directives from comments, collapsing definitions, resolving the dependency graph, and printing build files (each phase is described in more detail below). So far there is support for several languages; the example setup below covers Protocol Buffers and Python.
See the `example/` directory for the complete setup.
```shell
./build_tools/lang_support/create_lang_build_files/delete_build_files.sh
./build_tools/lang_support/create_lang_build_files/regenerate_protos_build_files.sh
./build_tools/lang_support/create_lang_build_files/regenerate_python_build_files.sh
bazel test ...
```
Under the hood, these scripts run:
```shell
GEN_FLAVOR=protos
source "$REPO_ROOT/build_tools/lang_support/create_lang_build_files/bzl_gen_build_common.sh"
run_system_apps "build_tools/lang_support/create_lang_build_files/bazel_${GEN_FLAVOR}_modules.json" \
  --no-aggregate-source \
  --append
```
The script is set up to download bzl-gen-build from GitHub and run the bzl-gen-build system-driver-app with the appropriate commands.
Sometimes we need to help bzl-gen-build out by supplying directives; for example, you might need additional dependencies for tests or macros. A few flavors of directives are supported, both in the module configuration files and inline in the source code.
```scala
// bzl_gen_build:runtime_ref:com.example.something.DefaultSomethingConfiguration
class SomeTest {
}
```
These directives are applied locally to the files they appear in, and can alter the outcome of what the extract phase produces into the system.
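As a sketch (not the actual Rust implementation), the effect of these file-local directives can be modeled as simple set edits on one file's extracted definitions, references, and runtime references; the directive names match the list below:

```python
def apply_directives(defs, refs, runtime_refs, directives):
    """Apply file-local bzl-gen-build directives to one file's extracted
    entity sets. `directives` is a list of (kind, entity) pairs."""
    for kind, entity in directives:
        if kind == "ref":
            refs.add(entity)             # pretend this file references entity
        elif kind == "unref":
            refs.discard(entity)         # filter out a spurious reference
        elif kind == "def":
            defs.add(entity)             # entity is defined here (e.g. by a macro)
        elif kind == "undef":
            defs.discard(entity)         # entity is not really defined here
        elif kind == "runtime_ref":
            runtime_refs.add(entity)     # needed at runtime only
        elif kind == "runtime_unref":
            runtime_refs.discard(entity)
    return defs, refs, runtime_refs

# Example: the runtime_ref directive from the snippet above, plus an unref.
defs, refs, rt = apply_directives(
    {"com.example.SomeTest"},
    {"com.example.Helper"},
    set(),
    [("runtime_ref", "com.example.something.DefaultSomethingConfiguration"),
     ("unref", "com.example.Helper")],
)
```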
- `ref`: adds a reference, as if the current file referred to this entity.
- `unref`: removes a reference from this file; the extractor might believe this file depends on the reference, and this filters it out.
- `def`: adds a new definition that we forcibly say comes from this file. Either manually or via another tool, this allows accounting for new types produced by, e.g., Scala macros or Java compiler plugins.
- `undef`: removes a definition from this file so it won't be seen as coming from here in the graph.
- `runtime_ref`: adds a runtime reference. Since things needed only at runtime usually cannot be seen from the source code, these help indicate types/sources needed to run the code in tests or deployment.
- `runtime_unref`: the dual of `runtime_ref`, though rarely used in practice.

These are used to build extra links into the chain of dependencies.
- `link`: connects one entity to several others. Suppose target `A` depends on `com.foo.Bar`, and a link connects `com.foo.Bar` to `com.animal.Cat` and `com.animal.Dog`. Then whenever we see `com.foo.Bar` as a dependency of any target, such as `A`, that target acts as if it also depends on `Cat` and `Dog`.

The following directives are applied late: they alter the final printed build file, but are not considered during graph resolution.
- `manual_runtime_ref`: adds a runtime dependency on the given target. That is, not an entity, but an actual target addressable in the build.
- `manual_ref`: adds a compile-time dependency on the given target. Again, an actual addressable build target rather than an entity.

Today there is only a single directive of the following form, though more thought should probably go into whether others belong here, and whether it should merge with the manual directives above. It is used to generate binary targets.
- `binary_generate: binary_name[@ target_value]`: generates a binary called `binary_name`, optionally passing some information (such as a JVM class name) via `target_value` to the rule that generates the binary. For example, a hypothetical `binary_generate: my_server@com.example.Main` would request a binary named `my_server` built around `com.example.Main`.

Extractors run against each target language to generate a set of definitions, references, and directives.
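A per-file extractor output might look roughly like the following. The field names here are assumptions for illustration only; the actual JSON schema is defined by the extractors themselves.

```json
{
  "relative_path": "src/main/scala/com/example/SomeTest.scala",
  "defs": ["com.example.SomeTest"],
  "refs": ["com.example.something.DefaultSomethingConfiguration"],
  "directives": ["runtime_ref:com.example.something.DefaultSomethingConfiguration"]
}
```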
The system-driver-app runs in multiple modes to connect together the phases of the pipeline. You can run some phases, massage/edit/change the data, and then run more as it makes sense.
This mode prepares the inputs to the system: it runs the extractors mentioned above and caches their outputs, pulling out the definitions, references, and directives. It can also optionally take a set of already-generated external files in the same format; this is mostly used when an external system is run to figure out third-party definitions/references (in Bazel, this would often be an aspect).
This is a relatively simple app that should perhaps be eliminated in the future. Its goal is to take the outputs from extract and trim them down to a smaller number of files (collapsing up a tree) containing just definitions. We do this so that in future phases, when we need to load everything, we can load all the definitions first and trim the files as they are being loaded. Scala/Java can have a lot of references, since extraction there is often heuristic-based with limited insight (e.g. wildcard imports).
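A sketch of the collapsing idea (a simplification, not the real implementation): merge per-file definition sets upward so that later phases can load one definitions entry per directory prefix.

```python
from collections import defaultdict

def collapse_definitions(file_defs, depth):
    """Merge per-file definition sets into one entry per leading
    directory prefix of `depth` path components."""
    merged = defaultdict(set)
    for path, defs in file_defs.items():
        prefix = "/".join(path.split("/")[:depth])
        merged[prefix] |= set(defs)
    return dict(merged)

# Three files collapse into a single "src/main/python" definitions entry.
collapsed = collapse_definitions(
    {
        "src/main/python/app/a.py": {"app.a"},
        "src/main/python/app/b.py": {"app.b"},
        "src/main/python/lib/c.py": {"lib.c"},
    },
    depth=3,
)
```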
This mode resolves all of the links in the graph. Nodes with circular dependencies between them are collapsed into a common ancestor. The output contains all of the final nodes, the sets of source nodes that were collapsed into each, and their dependencies.
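The cycle-collapsing step amounts to condensing strongly connected components of the dependency graph. A minimal Python sketch of that idea follows (the real tool is written in Rust, and additionally collapses cycles to a common ancestor directory, which is not modeled here):

```python
from collections import defaultdict

def condense(graph):
    """Collapse strongly connected components (cycles) of a dependency graph
    into single nodes using Tarjan's algorithm. Returns (components, deps):
    the components as frozensets and the edges between distinct components."""
    index, low, stack, on_stack = {}, {}, [], set()
    comps, counter = [], [0]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in graph:
        if v not in index:
            strongconnect(v)

    node_to_comp = {v: c for c in comps for v in c}
    deps = defaultdict(set)
    for v, ws in graph.items():
        for w in ws:
            if node_to_comp[v] != node_to_comp[w]:
                deps[node_to_comp[v]].add(node_to_comp[w])
    return comps, deps

# "a" and "b" form a cycle, so they collapse into one node depending on "c".
comps, deps = condense({"a": ["b"], "b": ["a", "c"], "c": []})
```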
This mode prints out all of the build files, performing any last application of directives as necessary.
Each target language is configured using a JSON file, for example `build_tools/lang_support/create_lang_build_files/bazel_python_modules.json`:
```json
{
  "configurations": {
    "python": {
      "file_extensions": ["py"],
      "build_config": {
        "main": {
          "headers": [],
          "function_name": "py_library",
          "target_name_strategy": "source_file_stem"
        },
        "test": {
          "headers": [],
          "function_name": "py_test",
          "target_name_strategy": "source_file_stem"
        }
      },
      "main_roots": ["src/main/python"],
      "test_roots": ["src/test/python"],
      "path_directives": []
    }
  }
}
```
The above will parse all `*.py` files under `src/main/python/` and `src/test/python/` and generate targets under those directories.
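With that configuration, a generated BUILD file might look roughly like the following. The exact attributes emitted (srcs, deps, visibility) depend on the tool, so treat this fragment as illustrative only:

```python
# Hypothetical src/main/python/app/BUILD entry for src/main/python/app/foo.py.
# "target_name_strategy": "source_file_stem" names the target after the file.
py_library(
    name = "foo",
    srcs = ["foo.py"],
)
```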
In some situations, such as for Protocol Buffer schemas, we want to generate secondary rules for each primary rule. This can be configured as follows:
```json
{
  "configurations": {
    "protos": {
      "file_extensions": ["proto"],
      "build_config": {
        "main": {
          "headers": [
            {
              "load_from": "@rules_proto//proto:defs.bzl",
              "load_value": "proto_library"
            }
          ],
          "function_name": "proto_library"
        },
        "secondary_rules": {
          "java": {
            "headers": [],
            "function_name": "java_proto_library",
            "extra_key_to_list": {
              "deps": [":${name}"]
            }
          },
          "py": {
            "headers": [
              {
                "load_from": "@rules_python//python:proto.bzl",
                "load_value": "py_proto_library"
              }
            ],
            "function_name": "py_proto_library",
            "extra_key_to_list": {
              "deps": [":${name}"]
            }
          }
        }
      },
      "main_roots": ["com"],
      "test_roots": [],
      "path_directives": []
    }
  }
}
```
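For a proto file, the secondary rules above would emit one extra rule per flavor alongside the primary `proto_library`, with `${name}` in `extra_key_to_list` substituted by the primary target's name. The target names and srcs below are illustrative assumptions; the loads come directly from the `headers` entries in the config:

```python
# Hypothetical BUILD output for com/animal/cat.proto.
load("@rules_proto//proto:defs.bzl", "proto_library")
load("@rules_python//python:proto.bzl", "py_proto_library")

proto_library(
    name = "cat",
    srcs = ["cat.proto"],
)

java_proto_library(
    name = "cat_java",
    deps = [":cat"],  # from extra_key_to_list: "deps": [":${name}"]
)

py_proto_library(
    name = "cat_py",
    deps = [":cat"],
)
```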
Extracting definitions from 3rdparty libraries requires some setup, also demonstrated in the `example/` repo.
```shell
GEN_FLAVOR=python
source "$REPO_ROOT/build_tools/lang_support/create_lang_build_files/bzl_gen_build_common.sh"
set -x
bazel query '@pip//...' | grep '@pip' > $TMP_WORKING_STATE/external_targets
CACHE_KEY="$(generate_cache_key $TMP_WORKING_STATE/external_targets $REPO_ROOT/WORKSPACE $REPO_ROOT/requirements_lock_3_9.txt)"
rm -rf $TMP_WORKING_STATE/external_files &> /dev/null || true
# try_fetch_from_remote_cache "remote_python_${CACHE_KEY}"
# if [ ! -d $TMP_WORKING_STATE/external_files ]; then
#   log "cache wasn't ready or populated"
bazel run build_tools/bazel_rules/wheel_scanner:py_build_commands -- $TMP_WORKING_STATE/external_targets $TMP_WORKING_STATE/external_targets_commands.sh
chmod +x ${TMP_WORKING_STATE}/external_targets_commands.sh
mkdir -p $TMP_WORKING_STATE/external_files
if [[ -d $TOOLING_WORKING_DIRECTORY ]]; then
  BZL_GEN_BUILD_TOOLS_PATH=$TOOLING_WORKING_DIRECTORY ${TMP_WORKING_STATE}/external_targets_commands.sh
else
  BZL_GEN_BUILD_TOOLS_PATH=$BZL_BUILD_GEN_TOOLS_LOCAL_PATH ${TMP_WORKING_STATE}/external_targets_commands.sh
fi
# update_remote_cache "remote_python_${CACHE_KEY}"
# fi
```
The above calls `py_build_commands` to generate a Bash script, which then runs `wheel_scanner` on all the pip libraries and generates JSON files in a temp directory. In actual usage you would cache these based on `$CACHE_KEY`, so this work is done once per dependency update.