Closed boutell closed 3 years ago
@boutell I totally agree with your conclusion! As I read through the 1
s, option d
was the option that didn't seem so bad. And maybe longer term, option 3
is the best choice, but perhaps 1d
is good enough, and you never need to go down the path of implementing option 3
.
One other bit of feedback. The messaging on 1d
should point the developer to the offending widget module, if possible. E.g.
Widget loader recursion limit reached. Use the `neverLoad`
option on the "article" widget module because it has relationships that
point back to the original "article" widget document.
..and maybe a link to apos.dev/recursions
in the warning feedback with more details about the issue.
1d
looks much cleaner and intuitively makes more sense to me. I would go with that one
Bob, I would love to do that, like love in serious amounts, but I don't want to overpromise because I don't know if I'll always be able to figure out the chain of events or not. I think it's... possiblish!
On Fri, Sep 18, 2020 at 11:37 AM notebk notifications@github.com wrote:
1d looks much cleaner and intuitively makes more sense to me. I would go with that one
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apostrophecms/apostrophe/issues/2406#issuecomment-694939059, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27ME3MED3LRXEGGJ6LTSGN5DZANCNFSM4RSDK2VA .
--
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his
I agree with the general consensus. 1d and 1b were sounding good to me. 3 would be sweet as well if we could.
In particular, the error message in 1d was quite nice. That goes also to @bobclewell's comment about including the widget type in the error message. I realize that seems like a stretch goal, but I think it has to be almost a requirement of this. The WORST thing about the recursion error is that we have no indication of where the hell it is coming from. That leads to adding joins to everything you can find, but sometimes (in 2.x) that didn't do the trick, so you're left banging your head against the desk.
We need to tell folks where the problem is. In 2.x the hidden issue often was from pieces widgets where the join isn't visible. In that case the error should be even more specific, saying that this is from a join you don't have in your project code.
So my two big cents are that the solution has to not only be about prevention, but also about increasing the clarity and helpfulness of the error 100-fold.
I hear you, Bea! And I want to do that, if I can figure out how to reliably do that.
However, even if we're unable to do that, this bit should eliminate most cases:
"And, loadSelf should be an option too, that defaults to false. Then an article widget would never load another article widget. No runaway recursion when A recommends B recommends A."
I think we've arrived at a plan to pursue 1d, with a focus on making that error message as informative as humanly possible.
Is it hard to implement alternative 3 right now?
Might not be, but definitely not before beta.
@boutell
I think the main issue I had while trying to resolve all the recursion warnings was the error message itself. I don't think it is enough. Yes, it is helpful to show the doc _id but it would be very very helpful to show the stack trace if possible. This way we won't have to try to figure out which join or which widget loader threw that warning and on that note the project I am referring to here had so many pieces and widgets. I wasn't able to replicate the warnings for every use case. It can also be very overwhelming to find the source trace of the issue without stack trace or at minimum show more info on the triggers of the error message.
With regards to which alternative is better I think we can't really rely on lazy loading as you mentioned since we don't necessarily know before hand which widgets will be added to the page but in my opinion it would be great if you can give the developer more SAFE choices while as you mentioned many developers including myself won't always know many of the core behind scenes code so education is the key.
Thank you for creating this thread and for the efforts
I wish it were as simple as a stack trace! Async does complicate things.
I have ideas toward how to make this work and bring the name of the "guilty" widget or widgets causing the loop into the spotlight.
On Tue, Sep 22, 2020 at 7:04 PM Omar notifications@github.com wrote:
@boutell https://github.com/boutell
I think the main issue I had while trying to resolve all the recursion warnings was the error message itself. I don't think it is enough. Yes, it is helpful to show the doc _id but it would be very very helpful to show the stack trace if possible. This way we won't have to try to figure out which join or which widget loader threw that warning and on that note the project I am referring to here had so many pieces and widgets. I wasn't able to replicate the warnings for every use case. It can also be very overwhelming to find the source trace of the issue without stack trace or at minimum show more info on the triggers of the error message
With regards to which alternative is better I think we can't really rely on lazy loading as you mentioned since we don't necessarily know before hand which widgets will be added to the page but in my opinion it would be great if you can give the developer more SAFE choices while as you mentioned many developers including myself won't always know many of the core behind scenes code so education is the key
Thank you for creating this thread and for the efforts
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apostrophecms/apostrophe/issues/2406#issuecomment-697027617, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27LBISOJXYPOUMTCXLLSHEURJANCNFSM4RSDK2VA .
--
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his
It is possible that improving the error message to include a better diagnosis is the only really absolutely necessary improvement.
On Wed, Sep 23, 2020 at 8:04 AM Tom Boutell tom@apostrophecms.com wrote:
I wish it were as simple as a stack trace! Async does complicate things.
I have ideas toward how to make this work and bring the name of the "guilty" widget or widgets causing the loop into the spotlight.
On Tue, Sep 22, 2020 at 7:04 PM Omar notifications@github.com wrote:
@boutell https://github.com/boutell
I think the main issue I had while trying to resolve all the recursion warnings was the error message itself. I don't think it is enough. Yes, it is helpful to show the doc _id but it would be very very helpful to show the stack trace if possible. This way we won't have to try to figure out which join or which widget loader threw that warning and on that note the project I am referring to here had so many pieces and widgets. I wasn't able to replicate the warnings for every use case. It can also be very overwhelming to find the source trace of the issue without stack trace or at minimum show more info on the triggers of the error message
With regards to which alternative is better I think we can't really rely on lazy loading as you mentioned since we don't necessarily know before hand which widgets will be added to the page but in my opinion it would be great if you can give the developer more SAFE choices while as you mentioned many developers including myself won't always know many of the core behind scenes code so education is the key
Thank you for creating this thread and for the efforts
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apostrophecms/apostrophe/issues/2406#issuecomment-697027617, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27LBISOJXYPOUMTCXLLSHEURJANCNFSM4RSDK2VA .
--
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his
--
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his
Have you ever considered a possible maxDepth
(widget) option to limit the number of allowed recursions?
It sounds intuitive and easy to understand solution to me. It should not interfere with whatever approach you choose to fight this (brain exploding) problem.
This is what the current behavior is, pretty much, but limiting it that way feels like we're giving up on helping developers solve the real problem of asking only for what they need in a way that isn't too hard to understand.
On Fri, Oct 30, 2020 at 3:17 PM Miro Yovchev notifications@github.com wrote:
Have you ever considered a possible maxDepth (widget) option to limit the number of allowed recursions? It sounds intuitive and easy to understand solution to me. It should not interfere with whatever approach you choose to fight this (brain exploding) problem.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/apostrophecms/apostrophe/issues/2406#issuecomment-719747441, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27J6AVWD6VPW2U3R5KLSNMGMHANCNFSM4RSDK2VA .
--
THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This ain't stale, it's just marinating.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@boutell
I believe this thicket is not resolved yet and you already closed it. Can you please confirm?
Thanks,
It's resolved in A3. For A2, for bc reasons, the current implementation is about as good as it can be.
Please see:
As promised I've spent some time thinking about the maximum area loader recursion level message and what we can do about it. Here are some proposals. It's a bit of a long read, but then it's a problem nobody likes, so I think it's worth your time and mine to talk it through.
Understanding the problem
The problem is infinite recursion.
A page can have relationships (joins) in its schema. It can also have widgets, which might contain relationships.
If page X has a relationship to page Y which has a relationship back to page X, we get an infinite loop, which would take the site down.
How we cope with this today
Right now we cope with this in two ways.
Relationships in the schema of the page
For relationships in the schema of the page, we fetch the related documents, but we don’t fetch any of their relationships, unless the developer explicitly requests it in the field definition using withRelationships (it was withJoins in 2.x). The end. This works well.
Relationships in widgets
But for widgets in the page that have their own relationships, the cycle begins again, because the widget loaders are called independently. A widget in page X indirectly causes page Y to be fetched. Page Y thinks it is just as important as page X and goes to load all of its own widgets. One of them points back to page X.
So Apostrophe copes by tracking how many times widget loaders have been called on page X. If that number hits 5, it refuses to call the loader yet again, and prints the warning, which tries to tell the developer what to do: add projections to all relationships in custom widgets, and, crucially, to all pieces widgets.
Problems with the current solution
Nobody loves this solution. The site is already slow by the time we hit that doc for the fifth time. The developer doesn’t always understand the message. The site is fast enough to slide by in testing but too slow when that newsletter goes out and it gets real traffic. The client is unhappy.
Plus there are occasional legitimate reasons why you might load a page or piece 5 times in a single request, especially if it’s part of a command line task. It’s rare but it does come up and in this situation you have to use workarounds like using .areas(false) to prevent the relationships of widgets from being loaded. (This is never a bad idea when you’re not interested in those relationships, but it’s still more to think about.)
Alternatives
Alternative 1a: restrictive defaults with the areas option
The #1 cause of this message is pieces-widgets where the piece points back to another piece of the same kind, directly or indirectly, and the cycle goes on forever.
We could fix it by changing pieces-widgets like this:
By default, pieces-widgets call .areas(false) when finding pieces.
If you wanted them, you would need to configure the module to call widget loaders, similar to how relationships have to be explicitly configured to load more relationships:
This would work pretty well. We would never have runaway recursion due to piece widgets unless someone explicitly opted into loading an area that contained widgets that pointed back to something previously loaded.
But, it would also be frustrating, because the number one use of piece widgets is to load an article, and the developer frequently wants to display an image that is included in the article via an image-widget, which contains a relationship to an image. By default, this would not load, and the developer would not understand why. I also don’t see a great way for us to give them a hint.
And, it wouldn’t always work. If you write areas: [ 'body' ], and the body area sometimes contains another product-widget so the product can point to a related one, which relates back to the first one… game over.
Alternative 1b: restrictive defaults with the widgetLoaders option
This could be better:
Now we’re saying that after loading product pieces, this widget only calls loaders on widgets of type @apostrophecms/image. This is more robust — we’re not declaring an entire area “safe,” we’re calling out a specific type of widget we know is safe because its piece type typically has no further relationships or widgets.
Alternative 1c: restrictive defaults with widgetLoaders and safe
OK, a bunch going on here:
widgetLoaders
becomes a cascade so we can use add and remove in subclasses.Cascades get a minor new feature: they can be used for simple arrays of strings (this is new), as well as for objects (what we’ve seen before).
The widgetLoaders list can have a special entry, $safe. It’s there by default. If $safe is present in widgetLoaders, then all widgets that have the safe option have their loaders called. We tag simple, never-recursive widgets like @apostrophecms/image-widget as safe.
This works really well in a lot of ways. We can keep developers out of trouble most of the time, we can avoid frustration most of the time. But the warning must still exist because the behavior can still happen. Developers might start marking everything safe even if it’s not, because they don’t understand what’s going on.
But, this does require the developer to be really explicit and it could be perceived as "just broken" if the developer doesn't take the time to understand what they have to do, and they can't render the image they wanted.
We could mitigate this developer frustration if
{% area %}
and{% widget %}
paid attention to whether widgets had actually been loaded, and dropped a warning on the developer if they hadn't. But this could be tedious if the widgets in question don't even need loaders (think a widget with a simple schema of string fields).We could let developers type:
{% area data.page, 'body', { unloadedOk: true } %}
And we could do something similar in a warning that
{{ apos.image.first }}
prints if it can't find anything. But again... sometimes you expect it to not always find something and this is annoying.We could let developers type:
{{ apos.image.first(data.page.thumbnail, { missingOk: true }) }}
Alternative 1d: opt-out loading
Instead of configuring all the widget types that a widget SHOULD recursively load, you configure all those that it SHOULDN'T. Most widget types don't recurse, so why punish all of them?
And,
loadSelf
should be an option too, that defaults tofalse
. Then anarticle
widget would never load anotherarticle
widget. No runaway recursion when A recommends B recommends A.So, runaway recursion would still be possible, but quite rare. And if it did happen, the message could be:
Alternative 2: lazy loading
This is a bit more radical, maybe a beta or 4.0 thing if it’s even a good idea.
The underlying idea here is: premature optimization is the root of all evil.
To put it another way: we wouldn’t be in this mess if we weren’t trying to load everything before we render the page template, and find out if we want any of it.
So we make a simple, but far-reaching change:
The load methods of widgets are not called at all when the page loads. They get called only when that widget is put on the page with {% area %} or {% singleton %}. Those are async, so it is possible to call the loaders at this point. Now runaway recursion is impossible unless the widget template itself actually outputs the nested widgets.
This is really tempting. It’s easy to reason about, it has some efficiencies by default (just like alternative 1), and we can expose the loaders to data passed by the template (a common feature request).
But, it has some downsides:
There’s a performance impact. Right now, if there are 30 image widgets on a page, which is not unusual, we fetch all of the images with one database call. With the lazy loading inside the template, this is not possible because we don’t know we want it until we want it, and we can’t continue until we render it. This is a known production issue, we didn’t always have the “load them in one call” feature in early versions of Apostrophe and we had to add it to complete certain projects. It doesn’t always solve the problem. If an article widget outputs the body area of an article, and that area contains another article widget pointing to a related article, an infinite loop can still result. It’s just an infinite loop that passes through the template logic.
A possible solution to the performance impact is to allow the developer to optionally configure
preload: true
for a widget type. This would let the developer say “this one doesn’t do recursive things and doesn’t need any data from the template, so just load them all at once before rendering.” We could put it on @apostrophecms/image-widget which would help with a lot of cases.But as for the second problem, I don’t think it has a solution, other than developer education. So on this point it is not as good a solution as Alternative 1c or maybe Alternative 1d.
Alternative 3: optional lazy loading
One more idea: we could do the lazy loading thing, but only for widgets that ask for it.
So if you created a recursion bug situation, you'd still get the warning, but it would say:
If you set
lazy: true
, that widget type would always load lazily (when actually shown in a template), but all of the others would stay fast.Concluding thoughts (input welcome!)
After dwelling deeply on the problem, I'm pretty torn. Alternatives 1d (opt-out for widgetLoaders, plus a widget loader never calls itself by default) 3 (optional lazy loading for widgets that actually cause this problem) are both good.
But if we're going to consider this in 3.x I think
1d
is the way to go. There's much less impact on most developers, and much less development and performance effort than adding lazy loading.So... thank you for reading! You are very 💯 for getting this far. What are your thoughts?
We might not have time for this before beta or even in 3.0, but it’s important to wrap our heads around it or we’ll be stuck with the status quo forever.