Limit DebugWithDb impls to avoid generating false dependencies

nikomatsakis commented 2 years ago

Hmm, so @zjp-CN added DebugWithDb impls in #369, but I'm beginning to think it might be good to make them a bit more limited.

The current impls for tracked structs, for example, dump the values of all fields. But this means that salsa will consider the function that uses them to read all the tracked fields, which means the function will re-execute more than you might like. That seems bad, and also likely to cause "heisen-bugs", where adding a {x:?} print-out to a function changes the behavior. It also seems like it might cause problems for #386 or other attempts to debug salsa's behavior.

An alternative is to limit the fields that get printed -- for example, we could only print out the #[id] fields for a tracked struct, since those are part of its identity. Following this model, for interned structs, we can print all the fields, but for inputs we would print only the salsa-id.

The part about inputs bugs me a bit, I feel like it'd be convenient if there were some way to give inputs a "name" that will be printed for them (e.g., to designate some fields as being printable).

Another option would be to add some form of "opt-in", e.g. by adding a method like debug_all in the DebugWithDb trait that dumps out all the fields (and a corresponding fmt_all method that gets invoked; it might, by default, simply call fmt).

Finally, I think it would be nice to let users disable or control the automatic DebugWithDb impl.

This is still a "bikeshed" issue because I don't have an exact proposal here, though I guess I've sketched out a few things:

Default behavior of debug impl should only print identity fields: #[id] fields for tracked structs, all fields for interned structs, and no fields for input structs.
There should be a way to print all fields (e.g., debug_all), we can easily add this to the DebugWithDb trait.
We should have a way to disable the default DebugInDb impl and/or customize what fields get printed by default.

vemoo commented 2 years ago

When I tried to update https://github.com/dada-lang/dada to latest salsa-2022 I run against this issue, so I'd like to implement it.

To disable the DebugInDb the idea I had is to add an no_debug_with_db option to use like #[salsa::interned(no_debug_with_db)].

nikomatsakis commented 2 years ago

@vemoo yeah, that was part of why I noticed it too :)

In terms of deciding whether or not to derive DebugWithDb at all, the other way to do it would be to disable the automatic DebugWithDb impls and instead search for something like

#[salsa::input]
#[derive(DebugWithDb)]

...that seems more consistent with Rust, but also kind of annoying.

vemoo commented 2 years ago

It does look better. But it must always be after #[salsa::input], otherwise it won't be seen by the macro, and if someone puts it before by mistake the error message will be confusing. I think we should also validate that it's the only derive, because it's not clear what it should mean adding other derives there.

vemoo commented 2 years ago

For opt-in another option could be #[salsa::input(derive(DebugWithDb))]

PoignardAzur commented 1 year ago

Noob question: if there a reason that salsa should track calls to DebugWithDb?

Presumably most uses of that trait is in println! macros that you use to printf-debug your code, and don't have any impact on the return value, right?

So one extreme solution would be to have the generated DebugWithDb implementation use secret undocumented functions that get the field values but tells salsa "it's okay, I'm not going to actually use this, don't register a dependency for me". Though of course that would be pretty dangerous, because nothing stops a user from storing the result in a string and returning the string.

On the other hand... the user can already break query purity if they really want to, so maybe this shouldn't be a strong concern? I feel like there's a middle ground where salsa provides some escape hatches that you can use to break incremental compilations if you really want to, but you aren't likely to do so by accident, and DebugWithDb uses these escape hatches.

PoignardAzur commented 1 year ago

Thinking out loud: another way to achieve this could be to have some sort of "defer" method, eg:

db.defer(|db| {
    eprintln!("my_value = {:?}", my_value.debug(db));
    eprintln!("my_foo = {:?}", my_value.foo(db));
});

Because the code would be guaranteed to run after the query, salsa would know it doesn't influence the query's results.

PoignardAzur commented 1 year ago

Thinking some more on the subject: you don't even need to defer calls. Db could just take a non-mutable Fn and assume that no information is leaked (which should be correct as long as the user doesn't cheat by using interior mutability).

unbyte commented 1 year ago

In frontend world, reactive frameworks like vue provide functions to pause and resume tracking so users can easily determine their own boundary of tracking. So instead of providing something like xxx_debug or debug_yyy, how about adding some helper functions and a macro to let users make a block of code untracked by the runtime? Considering that frontend reactive framworks are working on single-thread model but the situation of salsa is different, then the helper function may like:

{
    let untracked_db = db.tracking_free();
    eprintln!("{:?}", value.field(untracked_db));
}

And we can have a macro

tracking_free!(db => {
    eprintln!("{:?}", value.field(db));
});

expanded to

{
    let db = db.tracking_free();
    eprintln!("{:?}", value.field(db));
};

vemoo commented 1 year ago

I also suggested a way to read values without tracking on #411

PoignardAzur commented 1 year ago

Maybe we should centralize these discussions in a single thread. There are quite a few open issues on the whole "DebugWithImpl false dependencies" issue, it would be nice to have an overview of the subproblems and potential solutions.

salsa-rs / salsa

Limit DebugWithDb impls to avoid generating false dependencies #397