TarVK opened 6 months ago
the global entity orchestrating the whole process no longer knows what computed value to add the dependency to.
imo, this is a feature, not a bug or design flaw -- you need some way to choose what signals to entangle with. As in: https://github.com/NullVoxPopuli/signal-utils/
import { Signal } from 'signal-polyfill';
import { signalFunction } from 'signal-utils/async-function';
const url = new Signal.State('...');
const signalResponse = signalFunction(async () => {
const response = await fetch(url.get()); // entangles with `url`
// after an await, you've detached from the signal auto-tracking
return response.json();
});
// signalResponse.isLoading: true at first, then false once the fetch finishes
<template>
<output>{{signalResponse.isLoading}}</output>
</template>
This is implemented via an impure Computed:
const state = new State(fn);
const computed = new Signal.Computed(() => {
state.retry();
return state;
});
If we always tracked any signal accessed within an async function, we'd more easily have infinite loops -- as any Signal accessed within `state` would be entangled along with `url`'s changes -- which would make loading state (or any async state) much harder to handle.
return watch(val1) * watch(val2);
This is an interesting idea, and it would indeed be very explicit about what's happening. However, it "feels like" (to me, I could be wrong! and please correct me if I am) that we'd lose autotracking, and the transparent abstractions we can achieve by hiding the underlying Signal API -- for example, app-devs consuming Signals (as designed today) may not even know they have reactive data! and things would just work! :tada: (provided folks using destructuring in their component renderers do not do so too eagerly -- this is a DevEx thing being explored in frameworks that use functions as components / computeds / async computeds).
For example, if I'm a library author and I implement a reactive array, such as: https://github.com/NullVoxPopuli/signal-utils/tree/main?tab=readme-ov-file#array, and then pass it to you, you could mutate the array, and my library could just reactively update without you having any knowledge of reactivity -- it behaves just like a native array :tada:
Requiring synchronous code
I believe AsyncContext would make it possible to have `Computed`s with async callbacks "just work", but I think a proposed integration between these two proposals should wait until AsyncContext is (at least) Stage 3.
imo, this is a feature, not a bug or design flaw -- you need some way to choose what signals to entangle with
I'm sorry, but I don't quite follow this statement. Would it not be nicer if after an await, it would still track the following dependencies? Consider for example the following code:
const signalResponse = signalFunction(async () => {
const response = await fetch(url.get()); // entangles with `url`
const json = await response.json();
return json.someField * someSignal.get(); // Should entangle with `someSignal`?
});
Would the current signal implementations throw an error here when accessing someSignal? I don't think so, right? Because signal values should also be accessible outside of contexts that register dependencies. So in that case, a developer is likely to think that their computed signalResponse will also update when someSignal changes, but in fact it does not. Again, I don't have enough experience with signals to know this for sure, so you should correct my train of thought if it's wrong, but to me this seems a bit error-prone?
we'd lose autotracking, and the transparent abstractions we can achieve by hiding the underlying Signal API
That is indeed correct. I don't think my own system is flawless either, there's just different tradeoffs to consider. These considerations become especially important when making something a real part of the language, which is why I thought I should mention it. I personally have my doubts whether this should be part of the language, but there are already different issues regarding that topic so I won't make any further mention of that here.
provided folks using destructuring in their component renderers do not do so too eagerly
This is actually where my concern lies. A fully automatic transparent system is really nice, if it indeed is fully transparent. However when there are caveats to consider I prefer non-transparent systems and forcing developers to be aware of what they are doing. I am not aware of how many caveats there are with the implicit system, so it might not be that bad, but this is at least part of where my concern comes from.
@TarVK I share your concerns about global state and implicit subscription! And I find your personal ideas very interesting! They are actually quite similar to what we have in VS Code (tutorial here).
Your interface is much simpler, but I wonder how you would support transactions (i.e. updating two source observables at once).
Good to hear I'm not the only one with these concerns @hediet 🙂
Seems like VS Code observables (at least on a high level) function the same as the one in my project, which is nice to see! I chose a slightly different interface, to be more similar to async/await syntax, but they seem to function similarly.
The transactions have also been a concern of mine. I see VS Code uses:
transaction((tx) => {
observable1.set(6, tx);
...
observable2.set(4, tx);
});
I also considered the interface you chose here, but eventually settled on something different. The main concern here (which you are probably aware of) is that it becomes difficult to abstract this over functions. E.g. if I write some function:
function calculateAndUpdate(tx: ISomeTransactionType) {
const value = ...; // Compute the value according to some logic
observable1.set(value, tx);
}
I now have to add this extra parameter in the signature in order to be able to use it in transactions. I have not found a great solution to this problem myself, but ended up creating a "mutation" type that is returned from updating fields:
function doSomething(): IMutator {
const value = ...; // Compute the value according to some logic
return myField.set(value);
}
function doSomething2(): IMutator {
return synchronized(myField2.set(25), doSomething());
}
The idea is that mutations can be combined, and no reusable function should commit the mutations but should instead return them. Only event handlers would actually commit the final created mutations. Here is a gist of a planning document I wrote in the past, which contains this concept: watchables updating-multiple-fields-without-invalid-states. The main benefit of my approach is that no parameters/types have to be added to function signatures syntactically.
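The mutator idea can be sketched in a few lines. This is a hypothetical reconstruction from the description above (the `Field` class and the exact shapes of `commit` and `synchronized` are my guesses, not the actual watchables code):

```javascript
// Hypothetical sketch of the mutator pattern described above: setters
// return a mutation instead of committing it, mutations can be combined,
// and only the outermost caller (e.g. an event handler) commits.
class Field {
  constructor(value) { this.value = value; }
  get() { return this.value; }
  // Returns a mutator rather than applying the write immediately.
  set(newValue) {
    return { commit: () => { this.value = newValue; } };
  }
}

// Combines mutators so no intermediate (invalid) state is observable:
// nothing changes until the combined commit runs.
function synchronized(...mutators) {
  return { commit: () => mutators.forEach((m) => m.commit()) };
}

const myField = new Field(0);
const myField2 = new Field(0);

function doSomething() {
  return myField.set(42); // no write happens yet
}
function doSomething2() {
  return synchronized(myField2.set(25), doSomething());
}

const mutation = doSomething2();
console.log(myField.get(), myField2.get()); // 0 0 -- nothing committed yet
mutation.commit(); // only the event handler commits
console.log(myField.get(), myField2.get()); // 42 25
```

Note the signature benefit mentioned above: neither `doSomething` nor `doSomething2` needed a transaction parameter.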
With my approach it however becomes harder to enforce that functions support transactions, which I think would be slightly easier in VS Code's approach by restricting how transactions can be obtained while making them required parameters for setters. In addition, my approach carries a large risk of users forgetting to call `commit`, resulting in bugs. I believe the issues in my approach could be mitigated with linters, but this is not ideal. I currently don't have a strong preference for either the mutator or the parameter approach; each seems to be lacking in my opinion.
It would be nice to hear more opinions on this issue, so I might be able to improve my own system in the future!
I share your concerns, but primarily #2. I am on the side of "be explicit" (if a value on the list changes then the function runs; end of story).
I do think the proposal should stay as small as possible, so instead of solving the whole async problem, I'm curious what you think about having a warning system to limit the damage of implicit dependencies (which I don't like very much, but I think it's more likely to get accepted).
I think the biggest footgun of this proposal is using an external value like Math.random() in a conditional that then breaks auto-tracking of dependencies. The silver lining is that it's often possible to detect this kind of usage. For example:
new Signal.Computed(()=>{
if (Math.random() > 0.5) {
return person1Name.get()
} else {
return person2Name.get()
}
})
Warnings can be done performantly by checking for a sequence change in signals.
If person1Name was used on the first run, then person2Name on the second run, that's a warning, because the only way the first-evaluated signal can change is if an external value is being used as a conditional.
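A simplified sketch of that check (my own illustration, not proposal API; it only compares the order of reads across runs, ignoring the refinement about unchanged values):

```javascript
// Sketch of the proposed warning: record the sequence in which signals are
// read during each evaluation, and warn when two runs diverge -- since the
// first divergent read can only change if an untracked external value
// (like Math.random()) chose the branch. Hand-rolled, not proposal API.
function makeSequenceChecker() {
  let previousOrder = null;
  return function check(currentOrder) {
    let warning = null;
    if (previousOrder) {
      // Find the first position where the read sequence diverged.
      const n = Math.min(previousOrder.length, currentOrder.length);
      for (let i = 0; i < n; i++) {
        if (previousOrder[i] !== currentOrder[i]) {
          warning = `read #${i} changed from ${previousOrder[i]} to ${currentOrder[i]}`;
          break;
        }
      }
    }
    previousOrder = currentOrder;
    return warning;
  };
}

const check = makeSequenceChecker();
check(["person1Name"]);                 // first run: nothing to compare yet
const warning = check(["person2Name"]); // second run read a different signal first
console.log(warning); // "read #0 changed from person1Name to person2Name"
```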
Deeper values are more difficult though. For example:
new Signal.Computed(()=>{
if (randomViewEnabled.get()) {
if (Math.random()>0.5) {
return person1Name.get()
} else {
return person2Name.get()
}
}
})
It's still possible to check "if the value of the first signal did not change, then the subsequent signal should always be checked next". E.g. if randomViewEnabled stays true, but person2Name is evaluated the first time and person1Name is evaluated the next time, then that is also a warning.
If warnings like this were extremely robust, I think it would massively reduce the foot-gun damage.
But I'm curious how you feel @TarVK (whether this would address your #2 or not).
And in support of your concerns:
It's incredibly useful to say "this computed didn't run, therefore the signals on this explicit list have not changed" WITHOUT looking at the contents of the function or the values of stuff like Math.random().
I can see adding some mechanism to `Signal.Computed` to opt into manual dependencies. But I can't see removing auto-tracking.
(^ From a different thread)
I'm not sure I fully understand the following statement, could you elaborate:
this computed didn't run, therefore the signals on this explicit list have not changed
I do think that the warning system you propose, especially using auto-tracking for warnings combined with explicit dependency tracking, could be very neat. In the explicit-tracking system you may forget to `watch` a value, and in the auto-tracking system you may be unaware of whether a value will be reactive or not. So in either case, such a runtime warning system during development could be very valuable!
It does not fully address my concerns though. It's kind of hard to put my concerns into words I suppose, because it's more of a general scepticism of the "unknown". I already know that there are caveats with autotracking, e.g. regarding asynchronous computations. So my stance is more based on being unaware of what other major caveats one might run into (since I don't have enough experience using signals myself), and not knowing how these things will affect new developers. In this case, I prefer a system with a clunkier interface that is more predictable (if I told the system to watch a dependency it will be tracked and otherwise it will not be, end of story). So I do think your suggestion is very nice and addresses one of the known caveats of signals, but it does not address my broader concern related to auto dependency tracking.
I'm not sure I fully understand the following statement, could you elaborate:
this computed didn't run, therefore the signals on this explicit list have not changed
Definitely. So, if we have explicit tracking like an array of what to watch:
new Signal.Computed([person1Name, person2Name], ()=>{
console.log("ran")
if (randomViewEnabled.get()) {
if (Math.random()>0.5) {
return person1Name.get()
} else {
return person2Name.get()
}
}
})
If there's no "ran" printed, then that means, for sure, that person1Name and person2Name have not changed. We don't even need to know what the contents of the function are.
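That guarantee can be made concrete with a tiny version-counter sketch (the `makeState` / `makeExplicitComputed` helpers are hypothetical, not proposal API): deciding whether to re-run only ever inspects the listed signals, never the function body.

```javascript
// Sketch: with an explicit dependency list, staleness is decided purely
// from the listed signals' version counters. If no listed version changed,
// the function provably did not need to run.
function makeState(value) {
  return {
    version: 0,
    get: () => value,
    set(v) { value = v; this.version++; },
  };
}

function makeExplicitComputed(deps, fn) {
  let lastVersions = null;
  let cached;
  return {
    get() {
      const versions = deps.map((d) => d.version);
      const stale = !lastVersions || versions.some((v, i) => v !== lastVersions[i]);
      if (stale) {            // only here does fn actually run
        cached = fn();
        lastVersions = versions;
      }
      return cached;
    },
  };
}

const person1Name = makeState("Ada");
const person2Name = makeState("Grace");
let runs = 0;
const computed = makeExplicitComputed([person1Name, person2Name], () => {
  runs++;
  return person1Name.get();
});

computed.get();              // runs once
computed.get();              // no listed dep changed => skipped
console.log(runs); // 1
person2Name.set("Hopper");   // a listed dep changed
computed.get();
console.log(runs); // 2
```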
It's kind of hard to put my concerns into words I suppose, because it's more of a general scepticism of the "unknown".
That helps. I should word my question a different way though: if there were only these three choices, which would you pick?
Definitely. So, if we have explicit tracking like an array of what to watch:
Ember used to have computeds like this, and they were slower and more memory-intensive than auto-tracking. We found (adapting to the proposal for demonstration) that this would be more efficient on memory:
new Signal.Computed(()=>{
console.log("ran")
if (randomViewEnabled.get()) {
let name1 = person1Name.get();
let name2 = person2Name.get();
if (Math.random() > 0.5) {
return name1;
}
return name2;
}
})
and still give you correct behavior with less cpu/memory resource consumption.
Tho, we've found that using a getter / function is faster than managing a cache for something this small:
// no cache present, "just works"
get randomName() {
if (randomViewEnabled.get()) {
let name1 = person1Name.get();
let name2 = person2Name.get();
if (Math.random() > 0.5) {
return name1;
}
return name2;
}
}
As for smaller derived values, there is often the tradeoff of "how much does it take to maintain a cache vs just access the values" -- where a cache is always going to increase memory consumption, no matter how you do it.
Now, that said, we do still need to have some benchmarking done on this implementation in the proposal, as the results could be different!
Just my 2 cents from one corner of JS <3
I know we have some competing experiences with whether or not explicit deps is faster or not (due to being in different ecosystems / environments), so I think the sooner we can formally test it out, the better!
I do think it's a bit unfair to heavily imply "slower", and then kinda backpedal on "smaller derived values" and benchmarking. At the theory level, static is faster for any non-conditional computed value (sorts, hashes, math, substrings, etc.). I don't know if that makes up the majority of in-the-wild computed values or not, but I do know it's a frequent use case.
But it's the other case, the has-conditional case, that I find incredibly ironic: my prof once said "There is an algorithm that can compute, for any problem(!), the answer in O(1). The downside is the answer is wrong ... but still O(1)!". I would say it looks pretty bad to talk about auto-dependency tracking being faster for the first or second example I gave when the behavior is wrong for both of them.
"So speedy that new and senior devs alike shoot themselves in the foot without even noticing" is not really an argument in support of auto dependency tracking. (Keep the gun, just inform people when it's pointing at their foot.)
think the sooner we can formally test it out, the better!
I agree, we can let good benchmarks decide speed, but let us use the (lack of) footguns to decide the API.
One reason why an explicit list of dependencies might be "slower" is that it wouldn't cater to conditional computeds.
Example:
new Signal.Computed([nameSelector, person1Name, person2Name], ()=>{
console.log("ran")
if (nameSelector.get())
return person1Name.get();
else
return person2Name.get();
})
If nameSelector currently returns true, so the computed returns person1Name, this computed would still be re-evaluated whenever person2Name changes, even though in its current state that signal is irrelevant.
Auto-tracking would only capture nameSelector and the current name being used, at all times, which are the only changes that should result in re-evaluation.
It wouldn't be hard to consider other examples where more code would be involved which could result in more cpu cycles burning, for no reason.
Right, I see @jeff-hykin, then we were not fully on the same page. I personally am in favor of explicit dependency tracking in the form I mentioned at the end of my report, something like:
const derived = Signal.Computed(watch=>{
return watch(val1) * watch(val2);
});
Or using an interface more similar to that of VS Code observables:
const derived = Signal.Computed(reader=>{
return val1.get(reader) * val2.get(reader);
});
Most of the benefits you mentioned are lost here, with my approach one:
On the other hand it does have the benefits:
And we keep the shared benefits that both of us mentioned of:
As for your question of the 3 options. As mentioned before I personally think this proposal should be rejected. I think that's outside the scope of this debate, but I will mention my reasons briefly:
So, taking that choice out of the question: would I prefer adding signals as proposed, or adding signals with a warning system? I definitely would prefer having one with a warning system. This is a feature I will now, to some extent, also try to add to my own system, thanks!
Thanks for all the clarifications @TarVK. I think you covered pretty much everything. I also would reject, though originally I would have for different reasons. I would definitely prefer your method over the proposal, but probably still reject, because I think JavaScript has enough footguns, and signal-based footguns are the only ones senior college grads email me about.
Your reasons for rejection are strong points I hadn't considered, especially because they're kind of beyond the realm of personal preferences. And on that note, I don't think any of our ideas pass those criticisms.
I'll leave this summary for others visiting (correct me if something looks wrong)
How would an invoker "explicitly read" the signals accessed in something like a natively-called generator function?
public *[Symbol.iterator]() {
for (const signal of this.internalSignals) {
yield signal.get();
}
}
@DavidANeil I think that for any explicit system, one makes the distinction between raw return values and "trackable" values. In my implementation those trackable values are `IWatchable`s, in VS Code those are `IObservable`s, etc. Such an item contains the actual value that we are interested in, as well as an interface that allows setting up the tracking concerns.
Any function (in this case an iterator) should return a trackable value rather than a raw value. Then, from within the context of the invoker, both the tracking concerns can be set up and the value can be extracted (in the watchables case using `watch`).
For the static list of dependencies approach, I'm not entirely sure how one would deal with a changing list of trackable items. When the list is constant (even if obtained from a generator) I see no problem.
@DavidANeil for implicit reads, I would actually have strong doubts whether field accesses in generator functions are (or rather, can be) tracked correctly anyway. Any explicit syntax would give proof that it works/is supported.
For the static list of dependencies approach, I'm not entirely sure how one would deal with a changing list of trackable items. When the list is constant (even if obtained from a generator) I see no problem.
This is a good point, and effectively my next approach wouldn't be a true static list but more of a manually managed list. It could be a more refined version of something like:
let comp1 = Signal.Dynamic([sig1, sig2], (triggers)=>{
// triggers is just [sig1, sig2]
triggers.push(sig3)
// now if sig3 changes the function will re-run
})
// alternative
comp1.triggers.push(sig3)
// another alternative
let triggers = [sig1, sig2]
let comp2 = Signal.Dynamic(triggers, ()=>{
triggers.push(sig3)
})
triggers.push(sig4)
@TarVK Is your library opensource? It sounds nice, I'd be keen to have a play with it.
Concerns
I don't know if this is an appropriate place to start this discussion, but I've had some concerns about signals for a while now. I've worked on some state management systems in the past myself, and independently came up with something similar to signals. Then when I found out that signals already exist and are semi-standardized, I got quite excited. However, there are 3 major caveats I see with signals that I don't like:
Reliance on globals
In order for computed values to be aware of their dependencies, dependencies have to communicate their presence to computed values. The approach I have seen taken by several libraries to achieve this is the following: before evaluating a computed value, it tells some global entity G that it will evaluate. Then the computed value computes and accesses a signal S, during which S informs G of the access. In turn, G can let the computed value know that it apparently depends on S.
This approach relies on globals, and without special runtime language features, I don't think one can do much better than this with the current form factor of signals. The reliance on globals hurts interoperability of different libraries however, similar to the problem that was faced by React hooks. I don't know how this problem was solved in React (because I believe it has been?), so it might be solvable. Additionally, if this becomes a core part of JavaScript itself, it might not be an issue to begin with. But I wanted to express my doubts either way.
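The pattern described above can be shown in a toy sketch (hand-rolled names, not any particular library): the "global entity G" is just a module-level variable that the evaluating computed installs itself into, and every signal read reports itself to whatever is currently installed.

```javascript
// Toy sketch of the global-entity pattern: G holds the currently
// evaluating computed; every signal read reports itself to G.
let G = null;

function makeSignal(value) {
  return {
    get() {
      if (G) G.deps.add(this); // S informs G of the access
      return value;
    },
  };
}

function makeComputed(fn) {
  return {
    deps: new Set(),
    run() {
      const prev = G;
      G = this;            // tell the global entity we are evaluating
      try { return fn(); } finally { G = prev; }
    },
  };
}

const a = makeSignal(2);
const b = makeSignal(3);
const c = makeComputed(() => a.get() * b.get());
console.log(c.run());     // 6
console.log(c.deps.size); // 2 -- both signals reported themselves to G
```

Note that two independent copies of such a module each carry their own `G`, which is exactly the kind of interoperability hazard described above.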
Implicit dependency tracking
Dependency tracking is implicit in signals. This means that when looking at the code for computed values, you cannot directly see which of the dependencies can cause recomputations. Now you might argue that you don't have to see it, that it's not relevant, but I think it can be for several reasons:
I like the system that async/await introduces more. Here an explicit 'await' keyword exists, that makes awaiting data explicit. Similarly a 'watch' keyword could exist for computed values, that makes dependency tracking explicit. For async/await it was decided that having an explicit await keyword was beneficial, even though this could also have been implicit. I don't know what the reasons for this choice were, if it was for technical reasons or usage reasons, but either way I believe this to have been the better choice.
Requiring synchronous code
One thing that seems to be lacking from most signal libraries/systems, is asynchronous computations. There may be several reasons for this, potentially including hydration and server side rendering concerns, but besides this the current way of dependency tracking also limits potential async implementations. Ideally, I would want to be able to write something like this:
This would result in a signal that has a default value of 0 and, when accessed, would perform the computation -- but without updating its value yet, since the computation does not finish synchronously -- updating its value and signalling the change only afterwards, when the computation finishes.
I believe there can be a very nice interplay between asynchronous computations and signals in this way, simplifying how asynchronous computations are used in defining state. The current norm for dependency tracking, however, falls apart in this use case. If a value is accessed after some delay, the global entity orchestrating the whole process no longer knows what computed value to add the dependency to. This would for instance happen in the following code:
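The failure mode can be illustrated with a toy tracker (hand-rolled; the `read` and `deps` names are mine, not from any real library): the tracker is only installed during the synchronous part of the evaluation, so a read performed by a continuation that runs "after an await" is simply never registered.

```javascript
// Toy illustration: the global tracker is cleared once the synchronous
// part of the evaluation ends, so post-await reads go untracked.
let currentComputed = null;

function read(signal) {
  if (currentComputed) currentComputed.deps.add(signal.name);
  return signal.value;
}

const url = { name: "url", value: "https://example.com" };
const someSignal = { name: "someSignal", value: 7 };

const computed = { deps: new Set() };
const continuations = []; // stands in for code scheduled after an await

currentComputed = computed;
read(url);                                  // tracked: tracker is active
continuations.push(() => read(someSignal)); // will run "after the await"
currentComputed = null;                     // synchronous evaluation done

continuations.forEach((f) => f());          // the post-await code runs now
console.log([...computed.deps]); // ["url"] -- someSignal was never tracked
```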
Personal ideas
So let me first mention that I'm aware I'm not as deep into this material as the people working on this repository are, but because I've worked on my own system for a while and have thought about some of these tradeoffs, I did think it was worth reaching out.
All of the issues above were solved in my own system by a simple 'change' (this is actually just how I designed it before I learned about signals) of the interface of computed values. In my system I work with a 2 phase notification, which I believe is similar to what many signal libraries do, where I first mark dependents as dirty before dispatching a change event. As such my signals (which I call "watchables" in my system) have the following interface:
Then, computed values (which I call derived values) are defined using explicit dependency tracking. Consider for example the following fragment:
Here the `watch` function both calls the `get` of the watchable and registers the watchable as a dependency of the derived value. As a result, this system does not rely on globals, allows users to choose whether to actively watch dependencies, and can deal with asynchronous computations. In my own system I go a bit further and also attach metadata to watchables that allows for automatic tracking of additional data, like whether the derived value depends on any data that's still loading. But this is probably outside the scope of a standard JavaScript primitive.
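To make this concrete, here is a minimal sketch of such an explicit `watch` interface. It is my reconstruction from the description above, not the actual watchables code, and it collapses the two-phase dirty/change notification into a single change listener:

```javascript
// Sketch of the explicit-tracking interface: the compute function receives
// `watch`, which both reads a watchable and records it as a dependency of
// this particular derived value -- no global evaluation context needed.
function makeWatchable(value) {
  const listeners = new Set();
  return {
    get: () => value,
    set(v) { value = v; listeners.forEach((l) => l()); },
    onChange(l) { listeners.add(l); },
  };
}

function makeDerived(compute) {
  const tracked = new Set();
  let cached;
  let stale = true;
  const recompute = () => {
    const watch = (w) => {
      if (!tracked.has(w)) {
        tracked.add(w);                      // register the dependency...
        w.onChange(() => { stale = true; }); // ...and invalidate on change
      }
      return w.get();                        // ...while also reading it
    };
    cached = compute(watch);
    stale = false;
  };
  return { get() { if (stale) recompute(); return cached; } };
}

const val1 = makeWatchable(2);
const val2 = makeWatchable(3);
const derived = makeDerived((watch) => watch(val1) * watch(val2));

console.log(derived.get()); // 6
val1.set(5);
console.log(derived.get()); // 15 -- val1 was explicitly watched
```

Because `watch` is scoped to one derived value rather than a global, it keeps working even when called from a continuation after an await, which is the async property argued for above.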