decentraland / sdk

PM repository for SDK
Apache License 2.0

[BUG] Error in transport locks up entire scene #1083

Open wacaine opened 4 months ago

wacaine commented 4 months ago

Issue Description:

I have a bug that may or may not be related to my code, BUT it feels like something that could in part be improved on the client side. I cannot consistently reproduce it, but once it happens the entire scene stops processing. I can walk around but cannot interact with anything. The pointer-down system, triggers, etc.: nothing that was working before continues to work.

I understand if my code does something to throw an error, but how is it stopping processing of the scene as a whole? I think a try/catch is missing somewhere in the client to allow it to recover. I have not yet been able to isolate it. It may be a race condition, but when this error does appear, nothing in my scene works again until I reload it.
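To illustrate the kind of recovery being asked for here, below is a minimal sketch of a per-system try/catch. It is not the explorer's or the SDK's actual loop: `runSystemsIsolated`, `NamedSystem`, and the two example systems are hypothetical names used only for illustration. The assumption is that systems are plain functions called once per frame.

```typescript
// Hypothetical stand-in for a scene runtime's system loop; not the SDK's real code.
interface NamedSystem {
  name: string
  update: (dt: number) => void
}

interface FrameResult {
  ran: string[]
  failed: { name: string; message: string }[]
}

// Runs every system once. A throwing system is recorded and skipped,
// so the systems after it still get their update for this frame.
function runSystemsIsolated(systems: NamedSystem[], dt: number): FrameResult {
  const result: FrameResult = { ran: [], failed: [] }
  for (const system of systems) {
    try {
      system.update(dt)
      result.ran.push(system.name)
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      result.failed.push({ name: system.name, message })
    }
  }
  return result
}

let pointerEventsHandled = 0

const systems: NamedSystem[] = [
  {
    name: 'spawner',
    // Simulates the error from this report.
    update: () => {
      throw new Error('[create] Component core::Transform for 131086 already exists')
    }
  },
  {
    name: 'pointerEvents',
    update: () => {
      pointerEventsHandled += 1
    }
  }
]

const frame = runSystemsIsolated(systems, 1 / 30)
```

With this shape, the failing `spawner` system ends up in `frame.failed` while `pointerEvents` still runs, which is the behaviour the report expects instead of the whole scene going unresponsive.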

"Error: Uncaught Error: [create] Component core::Transform for 131086 already exists at Worker. (https://cdn.decentraland.org/@dcl/explorer/1.0.160612-20240226145831.commit-52cf906/index.js:63296:123523) at Worker.gr (https://cdn.decentraland.org/@dcl/explorer-website/2.1.1/assets/index-lLmTeOCG.js:225:1961)

SDK:

Tool:

CLI Version:
Node Version:

Steps to reproduce:

I have yet to isolate a cause.

Expected behaviour:

Scene not to become unresponsive.

Current behaviour:

I can walk around but cannot interact with anything. The pointer-down system, triggers, etc.: nothing that was working before continues to work.

Reproduction rate:

Intermittent.

Code Snippets:

I have checked my code for poor usage of Transform.create. I have not spotted one. Maybe it's still there, but my point is that the entire scene should not become unusable due to this.

Platforms:

Browser:

Environment:

Evidence:

Screen Shot 2024-03-01 at 10 55 12 AM

Additional Notes:

Can be reproduced at random here: https://decentraland.org/play/?position=148%2C60&DEBUG_SCENE_LOG=&realm=main&island=default5k. I am not sure of the exact steps. It seems related to clicking relogin, collecting coins, etc.

gonpombo8 commented 4 months ago

I solved this in the latest release. Can you try using @dcl/sdk@latest and see if it still happens? @wacaine

wacaine commented 4 months ago

@gonpombo8 I have been unable to recreate it locally, and I thought that anything deployed automatically inherits the latest SDK. Is that not true? Or am I confusing that with the engine?

nearnshaw commented 4 months ago

@wacaine everything deployed inherits the latest version of the engine, but not of the SDK. The change we're discussing here is very much on the side of the SDK, so in order to fix this in production, it would require updating the SDK version to the latest and redeploying. I'm quite confident that the changes that Gonzalo recently introduced (last week, I believe) should prevent this kind of thing from happening. The scene is likely making some mistake (like attempting to fetch a component that doesn't yet exist, or re-adding a component that is already there), but that shouldn't have such dire consequences as completely crashing the entire scene.
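The two mistakes mentioned above (fetching a component that does not exist yet, and re-adding one that is already there) can be sketched with a small in-memory stand-in. `ComponentStore` and `FakeTransform` are hypothetical and are not the SDK's implementation. They only mirror the `create` / `createOrReplace` / `getOrNull` semantics as discussed in this thread, where `create` throws on a duplicate.

```typescript
// Hypothetical in-memory stand-in for a component store; not the SDK's real code.
type Entity = number

interface Position {
  x: number
  y: number
  z: number
}

class ComponentStore<T> {
  private values = new Map<Entity, T>()
  private name: string

  constructor(name: string) {
    this.name = name
  }

  has(entity: Entity): boolean {
    return this.values.has(entity)
  }

  // Throws if the entity already has this component (the error in this issue).
  create(entity: Entity, value: T): T {
    if (this.values.has(entity)) {
      throw new Error(`[create] Component ${this.name} for ${entity} already exists`)
    }
    this.values.set(entity, value)
    return value
  }

  // Never throws: overwrites any existing value.
  createOrReplace(entity: Entity, value: T): T {
    this.values.set(entity, value)
    return value
  }

  // Returns null instead of throwing when the component is missing.
  getOrNull(entity: Entity): T | null {
    const value = this.values.get(entity)
    return value === undefined ? null : value
  }
}

const FakeTransform = new ComponentStore<Position>('core::Transform')
const entity: Entity = 131086

FakeTransform.create(entity, { x: 0, y: 0, z: 0 })

// Mistake 1: re-adding a component that is already there.
let duplicateError = ''
try {
  FakeTransform.create(entity, { x: 1, y: 1, z: 1 })
} catch (err) {
  duplicateError = err instanceof Error ? err.message : String(err)
}

// Safe alternative: createOrReplace does not care whether it exists.
FakeTransform.createOrReplace(entity, { x: 8, y: 0, z: 8 })

// Mistake 2 avoided: check for null before using a component that may not exist yet.
const missing = FakeTransform.getOrNull(999)
const current = FakeTransform.getOrNull(entity)
```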

wacaine commented 4 months ago

@nearnshaw @gonpombo8 we deployed our scene maybe 24 hours ago (https://peer.decentraland.org/content/entities/scenes?pointer=145,60) with this in our package-lock file:

    "@dcl/js-runtime": "7.4.7",
    "@dcl/sdk": "^7.4.7"

Which version is it fixed in? I just got the error again today. The whole scene becomes unresponsive when this happens. I see this error message mentions Component core 2.2.0, versus 2.1.1 in the last screenshot. Is the issue in this referenced component core file or somewhere else?

Screen Shot 2024-03-05 at 5 20 04 PM

We are using some libraries. Do all libraries have to be updated to the latest SDK too, or can a library still be susceptible to the bug? sdk7-utils, for example?

I have audited all usages of Transform.create within the scene. The only time we use create is when the entity is created. All follow-up use is Transform.createOrReplace. For good measure, I could mass find/replace with createOrReplace EVERYWHERE. That would rule out any problem related to our immediate code. At that point, could libraries be to blame?
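One way to narrow down whether scene code or a library issues the duplicate create is to record the call stack at the moment it happens. The sketch below assumes the component is a plain object whose `create` method can be reassigned. Whether that holds for a real SDK component is an assumption to verify. `traceDuplicateCreates`, `Creatable`, and `fakeComponent` are hypothetical names for illustration only.

```typescript
// Hypothetical debugging helper; not part of the SDK.
interface Creatable<T> {
  has(entity: number): boolean
  create(entity: number, value: T): T
  createOrReplace(entity: number, value: T): T
}

interface DuplicateReport {
  entity: number
  stack: string
}

// Wraps create so a duplicate call is reported with its stack and then
// downgraded to createOrReplace instead of throwing.
function traceDuplicateCreates<T>(component: Creatable<T>, reports: DuplicateReport[]): void {
  const originalCreate = component.create.bind(component)
  component.create = (entity: number, value: T): T => {
    if (component.has(entity)) {
      // The stack shows who called create, including library code.
      const stack = new Error('duplicate create').stack
      reports.push({ entity, stack: stack === undefined ? '' : stack })
      return component.createOrReplace(entity, value)
    }
    return originalCreate(entity, value)
  }
}

// Minimal fake component used only to exercise the helper.
const store = new Map<number, string>()
const fakeComponent: Creatable<string> = {
  has: (entity: number) => store.has(entity),
  create: (entity: number, value: string) => {
    if (store.has(entity)) {
      throw new Error(`[create] Component for ${entity} already exists`)
    }
    store.set(entity, value)
    return value
  },
  createOrReplace: (entity: number, value: string) => {
    store.set(entity, value)
    return value
  }
}

const reports: DuplicateReport[] = []
traceDuplicateCreates(fakeComponent, reports)

fakeComponent.create(131086, 'first')
fakeComponent.create(131086, 'second') // reported, not thrown
```

Inspecting `reports[0].stack` after the second call would show whether the duplicate originates in scene code or in a dependency, which is less invasive than a blanket find/replace.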