Closed carljm closed 2 months ago
Name | Link |
---|---|
Latest commit | 5c541dc94240c8f020ea8da3edb0bee0cdaa3fca |
Latest deploy log | https://app.netlify.com/sites/salsa-rs/deploys/66bcc55a4b54c600086ce361 |
Comparing carljm:struct-panic-main
(5c541dc) with master
(08820ea)
✅ 8
untouched benchmarks
Will attempt to read this tomorrow-ish.
I don't have a good repro yet but I just ran into the following error.
2024-08-06 13:52:36.553 [info] 0: red_knot_server::server::Server::run::{{closure}}
at /home/micha/astral/ruff/crates/red_knot_server/src/server.rs:140:29
1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/alloc/src/boxed.rs:2077:9
2: std::panicking::rust_panic_with_hook
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:799:13
3: std::panicking::begin_panic_handler::{{closure}}
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:656:13
4: std::sys_common::backtrace::__rust_end_short_backtrace
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/sys_common/backtrace.rs:171:18
5: rust_begin_unwind
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:652:5
6: core::panicking::panic_fmt
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/panicking.rs:72:14
7: salsa::tracked_struct::struct_map::StructMap<C>::get_from_map
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/tracked_struct/struct_map.rs:200:9
8: salsa::tracked_struct::struct_map::StructMap<C>::get
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/tracked_struct/struct_map.rs:177:9
9: salsa::tracked_struct::IngredientImpl<C>::lookup_struct
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/tracked_struct.rs:382:9
10: red_knot_python_semantic::semantic_index::definition::_::<impl salsa::id::LookupId for red_knot_python_semantic::semantic_index::definition::Definition>::lookup_id
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/components/salsa-macro-rules/src/setup_tracked_struct.rs:144
2024-08-06 13:52:36.553 [info] :21
11: <red_knot_python_semantic::types::infer::infer_definition_types::Configuration_ as salsa::function::Configuration>::id_to_input
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/components/salsa-macro-rules/src/setup_tracked_fn.rs:196:29
12: salsa::function::execute::<impl salsa::function::IngredientImpl<C>>::execute::{{closure}}
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/function/execute.rs:46:58
13: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/core/src/panic/unwind_safe.rs:272:9
14: std::panicking::try::do_call
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:559:40
15: __rust_try
16: std::panicking::try
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panicking.rs:523:19
17: std::panic::catch_unwind
at /rustc/051478957371ee0084a7c0913941d2a8c4757bb9/library/std/src/panic.rs:149:14
18: salsa::cycle::Cycle::catch
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/cycle.rs:42:15
19: salsa::function::execute::<impl salsa::function::IngredientImpl<C>>::execute
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/function/execute.rs:46:27
20: salsa::function::fetch::<impl salsa::function::IngredientImpl<C>>::fetch_cold
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/function/fetch.rs:104:14
21: salsa::function::fetch::<impl salsa::function::IngredientImpl<C>>::compute_value::{{closure}}
at /home/micha/.cargo/git/checkouts/salsa-fa0738f776139e4c/ece083e/src/function/fetch.rs:46:29
22: core::option::Option<T>::or_else
Which may indicate that we need to validate all tracked structs?
Yeah, that traceback does suggest that there's still a path to accessing an un-validated struct :/
When you say you don't have a good repro, do you mean that you can't reproduce it consistently at all? That's going to make things difficult...
@carljm I haven't spent time on finding a consistent repro. I ran into it when I switched between different files in the LSP
My thought is that we should just update to the most recent revision during the get operation. I think the only way to have an issue is to cheat and leak the data through thread local storage. We do need to update to prevent later writes from occurring in that scenario, however
On Wed, Aug 7, 2024, at 9:02 AM, Niko Matsakis wrote:
@.**** commented on this pull request.
OK, I get the overall bug. Let me think on this.
— Reply to this email directly, view it on GitHub https://github.com/salsa-rs/salsa/pull/550#pullrequestreview-2222698699, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABF4ZUZJEYIJAPF3RSKNITZQGZ6BAVCNFSM6AAAAABL35N3WOVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDEMRSGY4TQNRZHE. You are receiving this because you are subscribed to this thread.Message ID: @.***>
Always updating on the get path makes sense to me and should simplify this PR. I'll update to that approach.
Updated to always validate on get. Some things I'm not sure about here:
1) This means there can be cases (as in the added test) where the same struct is validated on get, and then later re-validated via mark_validated_output
in the same revision. This means validate
can no longer assert that the struct isn't already validated. I'm not sure what the comment "Never update a struct twice in the same revision" was attempting to prevent, but it doesn't seem like setting data.created_at = current_revision
when they are already equal is going to cause a problem?
2) I wondered whether validating on get might mean we no longer need mark_validated_output
on tracked structs at all, but making it a no-op caused another test (deletion-cascade
) to fail with the "access obsolete struct" panic.
3) This change means both get
and update
now need to take &Zalsa
rather than just a current_revision
, so that in get_from_map
we can call zalsa.last_changed_revision
with the durability, which we only have access to after we get the data out of the map.
Test failure looks like a compiler improvement in nightly catching an unmatchable pattern, not related to this PR.
Test failure looks like a compiler improvement in nightly catching an unmatchable pattern, not related to this PR.
I've ignored it; it should be fixed soonish: https://github.com/rust-lang/rust/issues/129031. If you rebase, CI will pass.
I THINK this is fixed by #564
I can confirm this is fixed (https://github.com/astral-sh/ruff/pull/13016/files).
Did we add a test for it?
@MichaReiser I don't think I added a specific test, I'm going to close this PR, but a PR with a test would be great
This adds a test for, and proposes one possible fix for, a bug we ran into with durabilities where we were hitting the "access struct from previous revision" assert.
The test models a situation where we have two File inputs (0, 1), where
File(0)
has LOW durability andFile(1)
has HIGH durability. We can query anindex
for each file, and adefinitions
from that index (just a sub-part of the index), and we caninfer
each file. Theindex
anddefinitions
queries depend only on the File they operate on, but theinfer
query has some other dependencies:infer(0)
depends oninfer(1)
, andinfer(1)
also depends directly onFile(0)
. The dependency graph is shown below: light color represents low durability, black represents high durability, dark gray are structs and struct fields; I didn't get around to adding their durability to the graph yet.(The graph is automatically generated using some hacky printlns I added to Salsa that create a mermaid.live graph; if I end up debugging another issue where it's useful, I'll probably clean this code up and try to make it generally reusable.)
The panic occurs because
definitions(1)
is high durability, and depends onindex(1)
which is also high durability.index(1)
creates the tracked structDefinition(1)
, andinfer(1)
(which is low durability) depends onDefinition.file(1)
.After a change to file 0 (low durability), we only shallowly verify
definitions(1)
-- it doesn't need deep-verify because it is high durability. So that means we never verifyindex(1)
at all (deeply or shallowly), which means we never markDefinition(1)
validated. So when we deep-verifyinfer(1)
, we try to access its dependencyDefinition.file(1)
, and hit the panic because we are accessing a tracked struct that has never been re-validated or re-recreated in R2.The proposed fix is that in
maybe_changed_after
for a tracked struct field, rather than checking that the struct has been created/validated at the current revision, check that it has been created/validated at least as recently as its durability last changed, and if that is older than the current revision, validate it. The logic here is that this tracked struct can't have changed unless some input at or higher than its durability has changed.The other possible fix I considered (but haven't tried implementing yet) is that when we short-circuit a shallow verify due to durability, we need to not only mark our own outputs validated, but recursively find all our inputs that also verify due to durability, and validate all of their outputs as well. That fix seems more complex and expensive than what I have in this PR.