bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.31k stars 4.09k forks source link

query processes cycles without comment #12204

Open benjaminp opened 4 years ago

benjaminp commented 4 years ago

The query reference says that Bazel's query engine is robust to cycles. However, the user is not even notified that they are in unspecified and unreliable territory:

$ touch WORKSPACE
$ cat > BUILD
alias(
   name = 'a',
   actual = ':b',
)

alias(
   name = 'b',
   actual = ':a'
)
$ bazel query 'deps(//:a)'
//:a
//:b
$ echo $?
0

SkyQuery tells largely the same story, though it spices things up with cryptic warnings:

$ bazel query 'deps(//:a)'  --infer_universe_scope --order_output=no 
WARNING: Targets were missing from graph: [//:a]
WARNING: Targets were missing from graph: [//:b]
//:a
//:b
$ echo $?
0

I would expect query to treat cycles as it does other loading errors: report that they occurred to the user and exit nonzero.

ulfjack commented 4 years ago

That is not generally possible the way query is designed, and changing it seems like it might be expensive.

ulfjack commented 4 years ago

Bazel build detects cycles through Skyframe: if there is a cycle then eventually the work queue is empty but some top-level nodes have not been computed. Query is mostly separate from that, and primarily designed for fast DAG traversals. While there are algorithms to detect cycles during a depth-first traversal, they require keeping track of the nodes on the path from a root node. The code does not do that right now, and is also heavily multi-threaded and performance critical.

Note also that query isn't designed to detect cycles in the general case. While deps(x) could do so, multiple application of functions generally doesn't keep track of previously seen nodes or connectivity between them.

benjaminp commented 4 years ago

Right, I was thinking this would be most easy to fix in the case of SkyQuery, where the expectation is of transitive loading of the universe rather than a lazy traversal.

haxorz commented 4 years ago

Non-SkyQuery bazel query erroring out on a target-land cycle used to be a thing, but it was removed in 6596f9c0ed34f3c0b51846eb0c3922bb8a4a001a (cc @juliexxia in case she has thoughts)

Assigning to @alandonovan because that commit was motivated by the following comment by him:

Does blaze query even need to report cycles? I don't see why it should care.

[I wasn't involved with this discussion/commit back then, but I do happen to remember reading the discussion and commit after the fact]

haxorz commented 4 years ago

As for SkyQuery, it'd be pretty easy to have the code report cycles detected during universe loading, but the code simply doesn't do that. The original intent of SkyQuery was for it to be usable on large universes that may have lots of weird stuff going on: target-land cycles, symlink cycles, symlink ancestor infinite expansions, etc. Maybe we should document this intent somewhere.

haxorz commented 4 years ago

I poked around and, "haha", I already debugged this internally in Jan 2019. Alan and Julie, please look at internal bug 122472781. You'll find basically the same complaint that @benjaminp made here, followed by basically the same response from me there that I made here.

haxorz commented 2 years ago

Performance team, in the time since internal issue 122472781 was filed in Jan 2019:

So I wanted to bring this to your attention. When 122472781 was filed I recommended updating the documentation. Does that sound good to you?

github-actions[bot] commented 8 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.