ionic-team / stencil

A toolchain for building scalable, enterprise-ready component systems on top of TypeScript and Web Component standards. Stencil components can be distributed natively to React, Angular, Vue, and traditional web developers from a single, framework-agnostic codebase.
https://stenciljs.com
Other
12.47k stars 784 forks source link

bug: Potential Memory/Resource Leak #3607

Open joewoodhouse opened 1 year ago

joewoodhouse commented 1 year ago

Prerequisites

Stencil Version

2.18.0

Current Behavior

I am investigating some behaviour I'm seeing in my application (an Ionic application using StencilJS as the framework/runtime) where memory and DOM nodes appear to be leaking. This is occurring when modals are being opened and then dismissed.

I have created a minimal reproduction which you can find here: https://rococo-creponne-086676.netlify.app/

Code here: https://github.com/joewoodhouse/stencil-modal-memory-investigation

The reproduction site basically offers two buttons - one that demonstrates the resource leak, and one that demonstrates non-leak behaviour. Both buttons open a modal via the modalController.create method, then immediately dismiss it, and repeat. The component that is created in the modal (app-home) takes one prop, which alters the content it displays. When the "Run Broken" button is pressed, the app-home component renders an <ion-list/> in it's element. When "Run Working" is pressed, the app-home component uses a <div/> instead of an <ion-list/>. The use of div vs ion-list is the only difference (from a user perspective) between the two scenarios.

Once you have pressed "Run Broken" you can see the resource leak via DevTools. Using the Performance Monitor you can see the number of DOM nodes ever increasing, and any attempt to manually trigger a garbage collection does not bring them down. Where as when you choose "Run Working" then the number of DOM nodes will remain stable and triggering a manual GC will immediately flush it back to starting levels.

Taking a Heap Snapshot also shows the leak, where if you inspect the snapshot you will see 100s of Detached elements (ion-modal) instances in the heap.

Tested in Chrome 105.0

I have tried and failed to recreate this within the Ionic source code - my original hunch was that is was some leak in the modal implementation, or perhaps even the ion-list element. But I could not recreate it there which made me believe that it must be more down to the Stencil runtime. Whenever I'm exploring the heap snapshots I come to a dead end at $lazyInstance$ which I think is part of the Stencil runtime, but I am not an expert and couldn't get further.

This is a real issue for us that our users are experiencing - I hope that my code and reproductions are useful and if you need any more info please let me know.

Expected Behavior

Resources (heap memory and DOM nodes) should not leak when opening and closing modals.

Steps to Reproduce

Use the reproduction site (either directly here https://rococo-creponne-086676.netlify.app/ or the code is here https://github.com/joewoodhouse/stencil-modal-memory-investigation

Press "Run Broken"

Monitor DOM nodes and Heap memory

Note: The modal is only visible at a desktop viewport size, I couldn't work out the Ionic reason why

Code Reproduction URL

https://github.com/joewoodhouse/stencil-modal-memory-investigation https://rococo-creponne-086676.netlify.app/

Additional Information

Using latest Ionic as well (6.2.7)

joewoodhouse commented 1 year ago

I've now refined my reproduction, though I'm not closer to being able to explain what's going on. I've pushed my changes to my repo and Netlify link.

The core seem to be:

With the above set of conditions, the leak occurs. If the class list of the <my-list/> component does not contain list-md then the leak does not occur. Equally, if the <ion-list> on the main page is removed, the leak also does not occur.

So the leak appears to be due to the interaction between the <ion-list> and the <my-list> elements, it seems through their CSS. This is where my understanding breaks down unfortunately.

My next step will be to try and remove the modal element completely from the reproduction to narrow things down further.

tanner-reits commented 1 year ago

@joewoodhouse Thanks for reporting this issue and providing a replication site/repo/steps! Reproduction cases are super helpful for us as we try to pinpoint where some of these obscure issues lie.

That being said, this one could be a doozy. I was able to verify the issue with nodes being infinitely created on your reproduction url. Looks like some garbage collection process is getting hindered. I'll get this one labeled appropriately so it gets pulled into our backlog for work in a future sprint! Thanks for your patience and we'll post any updates here!

vaibhavshn commented 1 year ago

I've been debugging this issue.

Only place i could find which still references a node which has been unmounted is the hostRefs weakmap. Although the $hostElement$ and the $lazyInstance are set as key in this weakmap, there are still references to this key in the value object.

And as per MDN:

A WeakMap is a collection of key/value pairs whose keys must be objects, with values of any arbitrary JavaScript type, and which does not create strong references to its keys.

This might be preventing the nodes from being garbage collected.

In the following image, my-test has been removed from DOM, but the node still exists in the value and key.

image

code where this is set: https://github.com/ionic-team/stencil/blob/66ce0d330b296ad8ac1ef7ebb5732816af5a6244/src/client/client-host-ref.ts#L32

also, even state values, reference to elements are also not cleaned up because of this.

Solution would be to remove the references to the key from the values, and come up with another way to handle it.

vaibhavshn commented 1 year ago

We've seen this memory leak affect our application when we've unmounted a lot of elements, especially things such as tooltips, chat messages are used extensively throughout the app.

joewoodhouse commented 1 year ago

@vaibhavshn Nice work, let me know if you need any help. @tanner-reits has there been any internal investigation on this one? Still causing us problems

alicewriteswrongs commented 1 year ago

@vaibhavshn thanks for digging into this! It looks like we've got a circular reference between the key and the value within the WeakMap which would I think prevent the key from ever being garbage collected, since the reference to the key held by the value (under $hostElement$) is going to be a strong reference :( so even if it's not referenced anywhere else (e.g. it has been removed from the DOM) I think the garbage collector will continue to skip on by and leave it hanging around since it sees that strong reference.

Thank you again for digging into this and finding some key clues! Such things are tricky, and your work will definitely help address the issue. So thanks again! 🔎🧐👍💯

I think I will have time fairly soon to look more into what we can do to address the issue. At minimum, we need to not have a reference to the key in the WeakMap from the value, but I'll need to explore a little bit to see how exactly we can get rid of this. Possibly we can change the reference in the value object to be a WeakRef...

Thanks again!

vaibhavshn commented 1 year ago

Sure @alicewriteswrongs, do let me know of any timeline you have on the fix. And do let me know if you need any help from me too.

raymondboswel commented 1 year ago

I have another example of a memory leak, here: https://github.com/raymondboswel/stencil-redux-example. The non leak version uses RxJS for state management and the leaky version uses stencil/store, which is the only difference between the leaky and non-leaky versions. Should I create another issue (perhaps on stencil/store?) or is it possibly related to this leak?

IndyVC commented 1 year ago

Hi all, we noticed something similar: Using stencil for our components, we noticed a lot of detached elements (which are all created in stencil, used in a react application with the stencil react wrapper). We got performance issues so started looking around with chrome dev tools, apparantly we stored a shitload of elements that are not drawn on the screen anymore.

If this could be of any help: image

@raymondboswel we have components that do not import anything from stencil/store, so I'assuming its not stencil/store related.

alicewriteswrongs commented 1 year ago

@IndyVC thanks for more info - always a help in tracking things like this down! At present I know there is definitely some memory leakage happening as a result of how Stencil's virtual DOM implementation holds on to references to DOM nodes. In a lot of situations we create objects holding metadata related to how and where a component has been rendered to the DOM, and often these have references to DOM nodes (but they aren't cleaned up well by the Stencil runtime).

I did some work replacing those normal references with WeakRefs but I wasn't able to completely address the memory issue by doing so. At present we're working on getting Stencil v3 out the door, and once we do that should free up some time for the team to look into issues like this one.

I have not yet investigated the memory leak which happens when using stencil-store, so I can't say right now whether or not it's related - thank you for filing an issue on the repo and providing a reproduction case though!

joewoodhouse commented 1 year ago

@alicewriteswrongs any update on this issue? This is basically making our production app unusable at the moment as the references are leaking so much that the app eventually crashes (we're using this for a hybrid mobile app with Ionic and vanilla JS). Any suggestions of workarounds or any help I can give let me know.

gabides commented 1 year ago

Hi @alicewriteswrongs any news on addressing this issue? We are having memory leak issues in our app due to this, and it is causing some crashes in our app

ghaiat commented 1 year ago

hey, any update on this ? it s pretty big issue

alicewriteswrongs commented 1 year ago

Sorry to report that there's no news on this front - rest assured that I will share updates when I have them!

ghaiat commented 1 year ago

@alicewriteswrongs does this affect all stencil components? only the ones with @State? or something else?

is there any workaround?

our application is crashing a lot because of this and we need to find a solution for our clients asap ....

denyo commented 1 year ago

We're also waiting for this to be fixed. We want to use WeakMaps with a component's host element as a key that we don't have to clear manually once the component disconnects. However, with the current behavior the maps just keep growing since the references are never garbage collected.

joewoodhouse commented 1 year ago

@ghaiat from my investigation this impacts all components

@alicewriteswrongs appreciate you keeping us in the loop. I would say, to me this seems like a really massive issue, and it impacts my work building and maintaining Ionic apps on an almost daily basis. And from the above seems like others do too. Since I reported this there have been several releases of Stencil (including a major release) that in my opinion contain a number of "minor" features compared to this. I'd be really keen to hear that this issue is being taken seriously and is on the roadmap for resolution soon.

ghaiat commented 1 year ago

@joewoodhouse i agree 100%

@alicewriteswrongs can you explain how an issue as big as is, causing perf/crash issues for anyone using stencil has not been moving for 9 months, this is insane

alicewriteswrongs commented 1 year ago

hey y'all, let's keep it civil, I just work here 🙂 @ghaiat the short answer, if you're curious, is that the people who wrote Stencil originally no longer work on it so bugs like this require a significant amount of time for us to root-cause and fix. There's no conspiracy to keep memory leaks around, just a small team with limited resources and a series of priorities.

I am looking into this now actually (as in spending time on it today) and I will definitely, 100%, I promise let you all know when I have something to share. At present I can't really say more than that.

capc0 commented 11 months ago

sorry to ask, but it has been a while. Any known workarounds for this issue? Can we provide more data or help otherwise?

capc0 commented 11 months ago

a possible solution without breaking the current WeakMap logic could be to have a method unregisterHost which removes the element from the hostsRef WeakMap. https://github.com/ionic-team/stencil/blob/66ce0d330b296ad8ac1ef7ebb5732816af5a6244/src/client/client-host-ref.ts#L6 However, there must be some core callback where that function would be called. My guess we be somewhere around here in the disconnectedCallback: https://github.com/ionic-team/stencil/blob/66ce0d330b296ad8ac1ef7ebb5732816af5a6244/src/runtime/bootstrap-lazy.ts#L128

conceptree commented 10 months ago

Hello @alicewriteswrongs ,

This is a big one and feels like it should be a top priority. Is there any perspective of a delivery date for this fix?

Thanks in advance.

pm0u commented 7 months ago

Any updates here? Running into some issues that we think could be caused by this

rwaskiewicz commented 7 months ago

Hey folks,

I don't have an update on this particular issue at this time.

Please do me a favor and add 👍's to the issue summary to upvote it - that's the best way for the team to track issues that are affecting projects. GitHub doesn't give us an easy way to track other types of comments, which makes them more likely to not be properly counted when we prioritize issues.

Thanks!

adbetin commented 5 months ago

Any updates here? I'm facing the same performance issue related to DOM detached elements that are referenced by a WeakMap. Any suggested date or workaround?

aeharding commented 5 months ago

Any updates here? I'm facing the same performance issue related to DOM detached elements that are referenced by a WeakMap. Any suggested date or workaround?

Vote on the issue please don't spam.

denyo commented 5 months ago

Any updates here? I'm facing the same performance issue related to DOM detached elements that are referenced by a WeakMap. Any suggested date or workaround?

Use a map and take care of the steps manually.

const observedNodesMap: Map<HTMLElement, () => void> = new Map();

export const observeChildren = <T extends HTMLElement, K = keyof T>(
  node: T,
  callback: () => void,
  attributes?: (Lowercase<K extends string ? K : string> | keyof AriaAttributes)[]
): void => {
  // node might not be defined in connectedCallback
  if (node) {
    observedNodesMap.set(node, callback);
    // init observer or whatever you want to do
  }
};

export const unobserveChildren = <T extends HTMLElement>(node: T): void => {
  observedNodesMap.delete(node);
};

And then in your components

public connectedCallback(): void {
  observeChildren(this.host, () => { ... });
}

public disconnectedCallback(): void {
  unobserveChildren(this.host);
}

This could be made reusable and less verbose with a custom decorator or reactive controller.

vaibhavshn commented 4 months ago

Any updates here? I'm facing the same performance issue related to DOM detached elements that are referenced by a WeakMap. Any suggested date or workaround?

Use a map and take care of the steps manually.


const observedNodesMap: Map<HTMLElement, () => void> = new Map();

export const observeChildren = <T extends HTMLElement, K = keyof T>(

  node: T,

  callback: () => void,

  attributes?: (Lowercase<K extends string ? K : string> | keyof AriaAttributes)[]

): void => {

  // node might not be defined in connectedCallback

  if (node) {

    observedNodesMap.set(node, callback);

    // init observer or whatever you want to do

  }

};

export const unobserveChildren = <T extends HTMLElement>(node: T): void => {

  observedNodesMap.delete(node);

};

And then in your components


public connectedCallback(): void {

  observeChildren(this.host, () => { ... });

}

public disconnectedCallback(): void {

  unobserveChildren(this.host);

}

This could be made reusable and less verbose with a custom decorator or reactive controller.

I am not sure I understand how this would solve or work around the bug here. The underlying weakmap is still being used to store the objects that is causing the bug right?

joewoodhouse commented 2 months ago

Sorry to be that guy, but any updates on this issue?

It's fast approaching 2 years (!!) since I raised this issue. This is a real memory leak, which causes mine and probably everyone else's production apps to crash/restart.

We're currently facing a serious incident with a customer which basically boils down to the app crashing due regularly due to this memory leak. If I had an option to move off of Stencil/Ionic at the moment I would, because of this, which makes me sad to say as I've been an Ionic dev since day 1.

Any update would be much appreciated!

christian-bromann commented 2 months ago

@joewoodhouse unfortunately we don't have an update at the moment. We investigated the issue without success. We were also facing competing priorities which makes it difficult to give this issue the necessary attention it deserves. We would appreciated any support in the investigation of this problem. Thank you for understanding!

joewoodhouse commented 2 months ago

@christian-bromann thanks for your response. Though your use of the past tense is a bit worrying, it sounds like the investigation has now completed and nothing is currently being done on it. @alicewriteswrongs was I think looking at it this time last year, but I assume no longer.

I like other people in this thread work in a small software team so I totally appreciate you can have competing (and sometimes conflicting!) priorities, but this issue I would say is completely fundamental to the Stencil and Ionic ecosystem. Once people work out that the underlying Stencil runtime leaks memory just through regular every day usage, they'll think twice about using Stencil/Ionic, no matter how fantastic the product is.

In terms of support I've spent quite a considerable amount of time investigating and reporting this issue already, but I'm very happy to do whatever it takes to move it forward - let me know specifically what support you need and I'll jump on it immediately. As I've said above, this issue causes daily dev work at our company and constant issues/disappointment for our customers, so I will do anything I can to help solve it.

DaveKilbey commented 2 months ago

@christian-bromann I'm the CEO of the company that @joewoodhouse works for. As he has clearly shown this bug is a fundamental flaw in the system, which is not just an inconvenience, it is causing us to lose customers that we've worked for years to obtain - so it's something we take extremely seriously. As Joe says, despite being a small business we are very happy to support you to resolve this. However, I expect a plan and a clear structure to work with your team on this. If you would like our help with it, please let me know and we can work out the details.

AaronDDM commented 2 months ago

a possible solution without breaking the current WeakMap logic could be to have a method unregisterHost which removes the element from the hostsRef WeakMap.

https://github.com/ionic-team/stencil/blob/66ce0d330b296ad8ac1ef7ebb5732816af5a6244/src/client/client-host-ref.ts#L6

However, there must be some core callback where that function would be called.

My guess we be somewhere around here in the disconnectedCallback:

https://github.com/ionic-team/stencil/blob/66ce0d330b296ad8ac1ef7ebb5732816af5a6244/src/runtime/bootstrap-lazy.ts#L128

Has anyone attempted this solution? Is there some other expected area that components should be GC after use, but isn't? Especially since a weakmap is being kept - I assume somewhere there is some unregister event occurring.

capc0 commented 2 months ago

@AaronDDM I have attempted that solution last year for several hours myself - unfortunatly to no success. disconnectedCallback is not the correct place to release the reference from the WeakMap. As far as I can remember debugging this issue, there are really only two options:

  1. component libraries using stencil (e.g. ionic), would have to remove the references from the WeakMap when they know that the element is destroyed (and not re-used) manually.
  2. change the implemenetation logic regarding the WeakMapp alltogether
EmanueleColombo00 commented 1 month ago

Hi @christian-bromann , I am in the same position of @DaveKilbey and his company, how can we help you find out a solution to this crucial bug? Please reach out and tell us. This is a huge problem and if not solved it will force my company and I to change framework and no longer use Ionic... which we obviously do not want to because it would cause us a huge damage in terms of time and money

christian-bromann commented 1 month ago

how can we help you find out a solution to this crucial bug? Please reach out and tell us.

Any investigations on this issue is appreciated. We are happy to help review pull requests or answer on technical questions you have.

LeonardoColombo9 commented 1 month ago

Hi @christian-bromann , please read this. I’m the CEO of an Italian start-up that chose Ionic to create its social network. Our business relies entirely on our app, so we can’t afford this crash that is pushing us out of business before we’ve even started. We put in years of unpaid personal work and invested money with our partner to achieve our dream of creating a social network that could help people discover and enhance their talents without spending any money. With this issue currently unresolved, our app is completely useless and our efforts and dreams might vanish in a few months.

I’m not trying to be rude but direct, and I’m telling you this to make you truly aware of the gravity of this problem. If you are already aware, I urge you to take action and address the problem once and for all. I understand you’re a relatively small team with a heavy workload, and we are in the same position. We could go out of business because of this bug.

However, from reading the issues reported in this chat compared to those in other chats, this is undoubtedly your biggest problem right now. I can’t imagine you would disagree with me. From a business perspective, this is a significant image damage for your project, not just ours. Any moment you wait to address it, it becomes bigger because software houses start advising against your project. That’s really unfortunate because, apart from this issue, it’s a very good and innovative software. Your communication and graphics are great, your placement is good, but so many competitors are emerging (see Flutter, for example).

This is a community, right? So we can help each other. I saw a lot of people concerned about this problem (more than double compared to the second most reported bug), and I saw some offering help. We are fully prepared to support if we can, but you cannot just say “we accept a pull request on that.” Come on… you should share a plan with us and tell us what we can do to help you. Otherwise, it would be way too expensive for us with the same catastrophic result. We understand coding is not easy, especially if some of the original developers of the Stencil plug-in have left, and business is not easy either because it pushes in every direction. But with all due respect, you have to check your priorities here. If you are already working on it, remember that people need you and depend on you for this one, so let us know you are here. It’s been two years since this bug was validated, and it’s time to solve it. Otherwise, we are all in deep trouble.

I hope you will take this the right way. I’m just trying to say what everybody is thinking in a language that is not my native one, so I apologize if I conveyed a confusing message. Please reach out ASAP.

raymondboswel commented 1 month ago

@joewoodhouse I cloned your reproduction repo and changed it so that the display and dismiss of the modal are both triggered manually (no setTimeout). The memory leak then disappears. Perhaps there's the reference created in the setTimeout closure plays a role here? I tried this because I tried to create a reproduction on our company design system, and similarly managed to create some memory leaks with setTimeout, but when manually adding / removing the elements from the DOM couldn't make it leak. A real codebase usually does have some setTimeouts for various reasons, so I think in practice creating leaks like this is very easy.

joewoodhouse commented 1 month ago

@raymondboswel I'm afraid I don't see the same things here. I've just updated my reproduction repository to include a mode that just manually opens and closes the modal without any setTimeouts involved, and I still see the memory leak occur. Maybe try out my latest version and let me know what you see?

I don't think there can be any doubt there is a memory leak here, @alicewriteswrongs has confirmed my findings and that there are memory leaks in the underlying runtime and vdom implementation already https://github.com/ionic-team/stencil/issues/3607#issuecomment-1402102459

raymondboswel commented 1 month ago

@joewoodhouse I've forked your repo and pushed the changes to the remove-timeout branch. I see our implementations differ a bit, which may be part of it. Will you clone and run it on your side and let me know what you find? I'm not calling the existence of the leak into question, as I too have been burned by it. But I think having a reproduction that has the least possible moving parts and confounders is crucial. You can view my fork here: https://github.com/raymondboswel/stencil-modal-memory-investigation

joewoodhouse commented 1 month ago

@raymondboswel I've just tried your form on the remove-timeout branch and I still see the memory leak. Maybe you could explain how you're determining whether it's there or not?

I'm running in Chrome and viewing the Performance Monitor tab, seeing the number of DOM nodes and JS heap size steadily increase. Additionally I'm taking heap snapshots and comparing them and seeing large numbers of Detached elements and leaked Modal elements. Examining their retainers show's references to lazyInstance and other internal Stencil runtime/vdom mechanics.

raymondboswel commented 1 month ago

@joewoodhouse Also looking at the perf monitor tab. Will you send me an email at @gmail.com then I'll send you a screencast of what I'm seeing?

AaronDDM commented 1 month ago

You can also do a simple check on the stencil hostref variable on the window interface. You'll see that it just keeps growing.

The issue with the way WeakMap is implemented is that if any code (e.g even ion modal) holds onto the host element reference, the GC will never kick in and the hostref map will never be cleaned up.

Safest way to approach this is to detect when an element is being unmounted and becomes detached and get rid of it from this map.

capc0 commented 1 month ago

Some more thoughts:

AaronDDM commented 1 month ago

@capc0 I tried storing it as a WeakRef and it didn't solve the issue. It's likely that code outside of the stencil core is also holding onto the reference, so GC never kicks in and this is not cleaned up.

christian-bromann commented 1 month ago

Thanks for all your comments and feedback. We are totally aware of the urgency of this issue and have been taking time apart within the next weeks to take another stab at this. Due to limited resources we appreciate any investigations done on the origin of the leak which can help accelerate the process on publishing a version with a fix.

spion commented 1 week ago

Has anyone tried running the reproduction case in the original issue recently? I'm getting regular cleanups every 20 seconds or so (the pause period was because I switched to another tab which throttles timers severely in Chrome). Below is a 60s recording:

image

Chrome Version 127.0.6533.88 (Linux)

GC appears to be working now... unless I'm missing something?

context: I like ionic a lot and use it for a side-project, so I'm working on a fix that could potentially get rid of the retained element (or at least give a definite indication whether that is the problem or not). But I'm not actually able to reproduce the issue at all to try and validate the fix.