microsoft / pxt-microbit

A Blocks / JavaScript code editor for the micro:bit built on Microsoft MakeCode
https://makecode.microbit.org

micro:bit hangs when connected via webUSB and running any code #5734

Open brian-forwardedu opened 4 months ago

brian-forwardedu commented 4 months ago

Describe the bug When the micro:bit is connected via WebUSB to MakeCode, any program downloaded to the micro:bit stops running at random intervals. It also disconnects from MakeCode, requiring the user to unplug and re-plug the micro:bit to get connected again and restart the code. The same behavior is not experienced when the code is running via a power-only USB plug or when not connected via WebUSB; in those cases it never resets and has no issues.

To Reproduce Steps to reproduce the behavior:

  1. Go to makecode.microbit.org
  2. Create some simple code using the display (e.g. alternating numbers)
  3. Connect your micro:bit via USB.
  4. Connect the micro:bit using the WebUSB connectivity and ensure you see the micro:bit logo
  5. Download your code to the micro:bit
  6. Wait while your code runs, still connected to MakeCode. Do not let the computer go to sleep or browse away from the MakeCode tab in Chrome.
  7. Eventually the micro:bit will stop running your code and disconnect from MakeCode. However, it is still accessible as a flash drive.
  8. Unplugging the microbit and plugging it back in will reconnect and the code will start running again.

Expected behavior WebUSB should not: A) Cause the micro:bit to hang while running the user's code B) Force MakeCode to disconnect from the micro:bit

micro:bit version (please complete the following information): 2.21

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

microbit-timeout.zip

brian-forwardedu commented 4 months ago

We've seen this with numerous micro:bits on various systems; however, they've all been V2s. We've just started testing with V1s and they don't appear to have the same issue. However, we'll continue testing to be sure.

brian-forwardedu commented 4 months ago

Attaching details from a micro:bit experiencing this issue.

DETAILS.TXT

edbye commented 4 months ago

This seems very similar, as you note, to the issue I raised (https://github.com/microsoft/pxt-microbit/issues/5710, foundation ticket https://support.microbit.org/helpdesk/tickets/77229).

If you observe carefully, you will find that the V1 boards reset, whereas the V2s crash.

Ed

brian-forwardedu commented 4 months ago

Interesting that you say V1 makes a reset, because the V1 seems rock solid from what we can see, though I haven't been watching any serial console. V1 appears to reconnect very gracefully via WebUSB to MakeCode as if nothing happened, even after the computer wakes from sleep.

edbye commented 4 months ago

My test code was to implement a sequential counter, starting from 1 and incrementing/displaying every loop iteration.

If the program had kept running, the counter should be large when the issue arises; I found it was close to one, which means the device had reset.
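The counter reasoning can be sketched on the host side. This is a minimal sketch only, assuming the counter value is sampled at regular intervals (for example, read off the display by an observer); the `classify` function and the sample values are hypothetical and not part of MakeCode.

```typescript
// Hypothetical helper: classify what happened to a micro:bit running a
// sequential counter, given counter values sampled at regular intervals.
type Verdict = "running" | "reset" | "hang";

function classify(samples: number[]): Verdict {
    for (let i = 1; i < samples.length; i++) {
        // The counter only ever increments, so a smaller value means the
        // program restarted from 1: the device was reset.
        if (samples[i] < samples[i - 1]) return "reset";
    }
    const n = samples.length;
    // An unchanged value between the last two samples means the loop has
    // stopped advancing: the device is hung (or the program has stopped).
    if (n >= 2 && samples[n - 1] === samples[n - 2]) return "hang";
    return "running";
}

console.log(classify([10, 250, 480])); // counter keeps growing
console.log(classify([10, 250, 3]));   // counter dropped back near 1
console.log(classify([10, 250, 250])); // counter stopped advancing
```

A value close to one after a long run falls into the second case, which is how the test distinguishes a reset from a hang.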

brian-forwardedu commented 4 months ago

Good catch!

abchatra commented 4 months ago

Thanks for the issue.

brian-forwardedu commented 4 months ago

@abchatra is there any workaround for this issue or an expected timeline for a fix?

abchatra commented 4 months ago

We are planning to look at all webusb issues in next month or so. Hopefully we can reproduce and produce a fix.

brian-forwardedu commented 4 months ago

Thanks Abhijith. This is currently impacting customers who use our micro:bit accessory product, so any support is appreciated.

microbit-carlos commented 3 months ago

I've tried to replicate this issue but I wasn't able to get the same result.

These are the steps I took:

I've checked the micro:bit every 10-15min for 2h and the programme was still running. I assume if this issue presented itself the programme would not be running anymore? How does that manifest? Is the display showing nothing, or maybe random LEDs light up?

edbye commented 3 months ago

I tried the @microbit-carlos test code under Win11 running MS Edge, all other details the same. The V2.21 crashed within the hour: the LED matrix goes blank.

martinwork commented 3 months ago

Here's my test code https://makecode.microbit.org/_0HgCmP7Ex2CV. Press A before leaving and observe skull for reset, blank display for hang. I cannot repeat the unattended hangs, but have noticed that reloading the MakeCode page causes V2 to hang with a blank screen around 30-50% of the time. Perhaps some computers are triggering a WebUSB glitch, which leads to a reset or hang.

edbye commented 3 months ago

There appear to be two issues here:

  1. On some computers/operating systems, WebUSB sends an illegal command to the attached micro:bit.
  2. The attached micro:bit responds inappropriately to illegal commands.

martinwork commented 3 months ago

It seems launching https://makecode.microbit.org/#editor or switching tabs causes a reset, but never a hang. I can only produce a hang by the "Reload this page" button.

brian-forwardedu commented 3 months ago

It appears that our extension exacerbates the issue, but it definitely occurs without the extension as well. Since our extension relies on "Connected" to the micro:bit, it can be observed quicker, but within minutes we see a disconnection from the micro:bit without having changed tabs, browsers, or any other input to the machine.

It's worth noting that we don't see the same troubles on Mac OS X machines, at least not to the same extent. This seems to primarily affect Windows.

FWIW, running @martinwork code we get a restart in <3 minutes.

brian-forwardedu commented 3 months ago

Another condition that's interesting is the behaviour when having two makecode tabs open at the same time causing the microbit to reset. Try the following:

  1. Open a browser window
  2. Navigate to makecode
  3. Physically connect the micro:bit
  4. Connect microbit via makecode
  5. Download code (like @martinwork's link)
  6. Open a new tab and navigate to makecode again

Notice that the USB Connection switches from the first tab opened to the second one. When it does this it resets the microbit. You can switch between tabs and the connection switches between them, and every time there is a reset.

martinwork commented 3 months ago

@brian-forwardedu It's interesting that you have seen a spontaneous reset, rather than hang, but for a task that needs to stay connected, a reset is almost as bad as a hang. Switching tabs is known to cause a reset.

When I tested, I did not see spontaneous resets in Windows 10 or 11, so presumably there must be some difference between our Windows computers, and maybe the questions are:

microbit-carlos commented 3 months ago

It seems launching https://makecode.microbit.org/#editor or switching tabs causes a reset, but never a hang. I can only produce a hang by the "Reload this page" button.

I had a quick look at this one and can replicate as well that refreshing the MakeCode window causes the LED display to clear and hang.

This is because of https://github.com/microsoft/pxt-microbit/issues/5530. Essentially, if we refresh the window we can interrupt a MakeCode serial read command before DAPLink sends a response, and DAPLink gets into this "off-by-one response" state. Then, when the new MakeCode session tries to reconnect, the first command it sends is this halt, and at that point we get the error as DAPLink sends the wrong response, in this case for the serial read command 131: https://github.com/microsoft/pxt-microbit/blob/974c9fad9f63555ca4f0fcaf0c829c3a69faf8f6/editor/flash.ts#L313-L317

editor.js:2391 Uncaught (in promise) Error: Bad response for 5 -> 131
    at CMSISDAP.<anonymous> (editor.js:2391:35)
    at step (editor.js:2317:23)
    at Object.next (editor.js:2298:53)
    at fulfilled (editor.js:2289:58)
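The "off-by-one response" state can be illustrated with a toy model. This is a sketch under stated assumptions, not DAPLink's or MakeCode's actual code: it assumes the probe answers commands strictly in order and that each response echoes the ID of the command it answers. `FakeProbe` and `transact` are hypothetical names; the command numbers 5 and 131 are taken from the error message.

```typescript
// Simplified model (an assumption, not DAPLink's real implementation):
// the device answers commands strictly in order, and each response starts
// with the ID of the command it answers.
class FakeProbe {
    private pending: number[] = [];

    // Host writes a command; the device queues a response echoing the ID.
    send(cmd: number): void {
        this.pending.push(cmd);
    }

    // Host reads the oldest response that has not been collected yet.
    receive(): number {
        const resp = this.pending.shift();
        if (resp === undefined) throw new Error("no response pending");
        return resp;
    }
}

// Send a command and check that the reply echoes the same command ID.
function transact(probe: FakeProbe, cmd: number): number {
    probe.send(cmd);
    const resp = probe.receive();
    if (resp !== cmd) throw new Error(`Bad response for ${cmd} -> ${resp}`);
    return resp;
}

const probe = new FakeProbe();

// Session 1: a serial read (command 131) is sent, but the page is refreshed
// before its response is collected.
probe.send(131);

// Session 2: the first command after reconnecting (command 5 in the error
// above) receives the stale response left over from session 1.
let message = "";
try {
    transact(probe, 5);
} catch (e) {
    message = (e as Error).message;
}
console.log(message);
```

In this model every later command also receives the response meant for its predecessor, which is why the state persists until the queue is drained or the probe is power-cycled.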

microbit-carlos commented 3 months ago

Thanks everyone for continuing to look into this and helping with all the debugging! A few things to cover:

I tried the https://github.com/microbit-carlos test code under Win11 running MS Edge. all other details same. The V2.21 crashed within the hour = LED matrix goes blank.

@edbye If you run a programme like the one Martin posted, where it is easy to check if the board has been reset (pressing A changes the icon): https://makecode.microbit.org/_0HgCmP7Ex2CV Does your setup sometimes show the skull after a while?

I've covered this in my previous comment, but essentially the "LED matrix goes blank" issue is likely caused by https://github.com/microsoft/pxt-microbit/issues/5530 (it doesn't need a battery pack; if MakeCode stops and restarts a connection, for example if something refreshes the tab or puts it to sleep, it can get into this state).

So, my current theory is that maybe in your environment the board gets reset often enough to trigger it.

The open question would be why it restarts the board, and does that also cause anything else weird to happen with MakeCode.

It appears that our extension exacerbates the issue, but it definitely occurs without the extension as well. Since our extension relies on "Connected" to the micro:bit, it can be observed quicker, but within minutes we see a disconnection from the micro:bit without having changed tabs, browsers, or any other input to the machine.

@brian-forwardedu to make sure I get the details right, could you specify what would be a "disconnection from the micro:bit"? Does the micro:bit reset and continue running the programme from the beginning? Does the display go blank and no programme runs? Something else?

FWIW, running @martinwork code we get a restart in <3 minutes.

@brian-forwardedu To double check, this means that Martin's code shows that leaving the micro:bit connected to a MakeCode window, without touching anything, always with the tab active, and the micro:bit is reset within 3 minutes? If that's the case, could you open the chrome JS console (f12), wait for the reset, and copy all the output from the console here? That would be really useful to understand what might be happening.

Another condition that's interesting is the behaviour when having two makecode tabs open at the same time causing the microbit to reset. Try the following:

  1. Open a browser window
  2. Navigate to makecode
  3. Physically connect the micro:bit
  4. Connect microbit via makecode
  5. Download code (like @martinwork's link)
  6. Open a new tab and navigate to makecode again

Notice that the USB Connection switches from the first tab opened to the second one. When it does this it resets the microbit. You can switch between tabs and the connection switches between them, and every time there is a reset.

@brian-forwardedu thanks for looking into this example as well. To make sure things don't get too mixed up, as I think this probably counts as a different thing, could you open this in a new GitHub issue? We can continue the conversation about this specific case over there. Thanks!

brian-forwardedu commented 3 months ago

Hi @microbit-carlos , thanks for responding.

To be clear, we do not see hangs at all with

This is only occurring when connected to a V2 micro:bit connected via USB and connected to makecode.

Does the micro:bit reset and continue running the programme from the beginning?

No it does not.

Does the display go blank and no programme runs?

Yes, that's correct.

To double check, this means that Martin's code shows that leaving the micro:bit connected to a MakeCode window, without touching anything, always with the tab active, and the micro:bit is reset within 3 minutes? If that's the case, could you open the chrome JS console (f12), wait for the reset, and copy all the output from the console here? That would be really useful to understand what might be happening.

Yes, no problem. See attached. This all occurred within <3 minutes:

(index):139 An iframe which has both allow-scripts and allow-same-origin for its sandbox attribute can escape its sandboxing.
l @ (index):139
fe98398a199e.js:1 Failed to load resource: net::ERR_BLOCKED_BY_CLIENT
pxtapp.js:1 Browser: chrome 127.0.0.0 on windows
pxtapp.js:1 workspace: browser
pxtapp.js:1 PouchDB adapter: idb
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
pxtsim.js:1 Simulator ServiceWorker registration successful with scope:  https://trg-microbit.userpxt.io/
pxtapp.js:1 Uncaught (in promise) TypeError: Cannot read properties of null (reading 'match')
    at e.initials (pxtapp.js:1:54364)
    at main.js:1:301299
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
pxtapp.js:1 packetio: mk wrapper dap wrapper
pxtapp.js:1 webusb: get devices
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
pxtapp.js:1 webusb: get devices
editor.js:2415 Connecting...
editor.js:2446 Connected
editor.js:2415 Connecting...
editor.js:2446 Connected
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
pxtapp.js:1 webusb: get devices
editor.js:2415 Connecting...
editor.js:2446 Connected
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
editor.js:3317 DOMException: Failed to execute 'transferOut' on 'USBDevice': A transfer error has occurred.
readSerialLoop @ editor.js:3317
pxtapp.js:1 webusb: get devices
editor.js:2415 Connecting...
editor.js:2446 Connected
(7) The FetchEvent for "<URL>" resulted in a network error response: the promise was rejected.
---serviceworker:1 Uncaught (in promise) TypeError: Failed to fetch
    at ---serviceworker:1:10224
dc.services.visualstudio.com/v2/track:1 Failed to load resource: net::ERR_FAILED
editor.js:3317 DOMException: Failed to execute 'transferIn' on 'USBDevice': A transfer error has occurred.
readSerialLoop @ editor.js:3317
pxtapp.js:1 webusb: get devices

2024-08-21 08_01_04- javascript_console_githubissue_5734.txt

@microbit-carlos do you mean that you would like a separate github issue for the tab-switching-reset issue? The above output is for a hang/crash of the micro:bit itself.

microbit-carlos commented 3 months ago

That's really interesting @brian-forwardedu, and thanks for attaching a screenshot as well! The errors show when the problem occurs, but most interesting is that the MakeCode editor is constantly re-connecting. How long do you think passed between the moment the MakeCode session started and the moment the screenshot was taken? It had to reconnect 4 times in that screenshot.

Do you know why it loses the WebUSB connection? Do you constantly see the micro:bit logo blinking in the "Download" button? (It probably does that every time it disconnects and reconnects.)

If this all happens in a short period of time (3min or so), would you be able to do a screen recording of that Chrome window with the console opened? It'd be interesting to see it happening with all the connects/disconnects.
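Counting the connect/disconnect events in a pasted console log can be done mechanically. The following is a rough sketch, assuming that a line ending in `Connected` is printed once per (re)connection and that `transferIn`/`transferOut` failures mark a lost transfer, as the console output posted earlier suggests; `summarise` and `LogSummary` are hypothetical names, not MakeCode APIs.

```typescript
// Hypothetical helper: summarise a pasted console log by counting connection
// events and WebUSB transfer errors.
interface LogSummary {
    connected: number;
    transferErrors: number;
}

function summarise(log: string): LogSummary {
    const summary: LogSummary = { connected: 0, transferErrors: 0 };
    for (const line of log.split("\n")) {
        // Assumed: one "Connected" line per successful (re)connection.
        if (/\bConnected\s*$/.test(line)) summary.connected++;
        // transferIn/transferOut failures indicate a failed WebUSB transfer.
        if (/Failed to execute 'transfer(In|Out)' on 'USBDevice'/.test(line)) {
            summary.transferErrors++;
        }
    }
    return summary;
}

// A shortened excerpt in the same shape as the console output in this thread.
const excerpt = [
    "editor.js:2415 Connecting...",
    "editor.js:2446 Connected",
    "editor.js:2415 Connecting...",
    "editor.js:2446 Connected",
    "editor.js:3317 DOMException: Failed to execute 'transferOut' on 'USBDevice': A transfer error has occurred.",
    "editor.js:2415 Connecting...",
    "editor.js:2446 Connected",
].join("\n");

console.log(summarise(excerpt));
```

Running the same function over a full console capture with timestamps enabled would show how quickly the reconnections follow each other.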

I don't think ublock would cause this issue, but if you could disable the extension that would be great as well to be able to discard it as the cause.

@microbit-carlos do you mean that you would like a separate github issue for the tab-switching-reset issue? The above output is for a hang/crash of the micro:bit itself.

Yes, exactly, as the tab switching reset is essentially different, it's better to keep that discussion separated. Thanks!

brian-forwardedu commented 3 months ago

@microbit-carlos I had trouble replicating what I sent earlier in the last hour or so with the window staying in focus. This definitely occurred with it staying in focus:

This error definitely continues to happen with the window in focus:
DOMException: Failed to execute 'transferOut' on 'USBDevice': A transfer error has occurred. 2024-08-21 13_45_58-

The microbit stays connected but resets.

Shortly after this occurred when I moved away from the screen: editor.js:3317 DOMException: Failed to execute 'transferIn' on 'USBDevice': A transfer error has occurred. 2024-08-21 13_48_02-

The microbit disconnects and I get the blinking "Download" button.

I was able to reconnect and this is what was reported on the screen: 2024-08-21 13_52_32-

I tried disabling uBlock Origin and it doesn't seem to have an effect.

I'll continue to try to capture a screen recording.

brian-forwardedu commented 3 months ago

From what I can tell the only message that occurs when the crash and flashing connection is happening in makecode is the following log line:

image

edbye commented 3 months ago

Hi @microbit-carlos,

My original report was of an issue very similar, at least in part, to the one being considered here; see "micro:bit V2 may hang when WebUSB connected to MakeCode · Issue #5710 · microsoft/pxt-microbit" (github.com). I was primarily concerned with crashing whilst monitoring and charting on the MakeCode console using serial write from an attached device, with no battery power. Hence, to date, my testing has been performed without a battery, just USB.

I ran Martin's test code, https://makecode.microbit.org/_0HgCmP7Ex2CV, as requested, both when powered by USB only and by USB with battery power, a few times for each option. My results are that, in the main, the attached micro:bit crashes (hangs) after a while, irrespective of whether it is powered by USB only or with a battery. On occasion, under both power options, a reset occurs (ghost icon) for a second or so before the crash. With battery power, the reset condition (ghost icon) remains.

I read somewhere that there is a belief that the "hang" is a "sleep" mode. I doubt this, as my test program using "play tone" (see https://makecode.microbit.org/_E3C7pJPmDCXJ) on occasion produces a continuous note, depending on whether the condition occurs while sound is playing or not. Some repeat testing is probably required to reproduce this; be patient with tries.

Hope that helps,
Ed

edbye commented 3 months ago

A while back, I experimented with monitoring the USB interface; file attached: DMS USB monitor program output.docx. I didn't really know what to do with or make of the information I found, but I hope it might be of help to someone in pursuing this issue. Regards, Ed

abchatra commented 3 months ago

@edbye @brian-forwardedu please test BETA -> makecode.microbit.org/beta as well, as we have fixed a few issues and would love to see whether this is fixed or not.

jwunderl commented 3 months ago

I was able to reproduce the hang by refreshing the page while putting my device under load with some heavier programs, but not the "breaking on its own after a few minutes" type of behavior so far -- 7.0.21 has a small fix that has gotten rid of the errors when refreshing the tab (on my device) as a starting point, https://makecode.microbit.org/beta will have that up soon for testing, continuing to look into this tomorrow.

edbye commented 3 months ago

Hi, I tested the Beta version as requested. Here is what I did and the results:

  1. Using MS Edge, launched https://makecode.microbit.org/beta in a new tab (the current release of MakeCode remained loaded in another tab of Edge).
  2. Imported from a local folder a copy of my test program, https://makecode.microbit.org/_E3C7pJPmDCXJ, into "Beta".
  3. Downloaded the code to the V2 micro:bit via WebUSB (pairing already known, did not need to re-pair).
  4. Ran the program on the device (no battery) and monitored it in the MakeCode "show data device" option (simulation stopped).

I noted two different outcomes.

In the first to occur, the MakeCode console scroll chart and data values ceased to update after a short while. The attached device's LED matrix and sound performed as expected; however, there were no serial comms: the yellow LED on the rear did not flicker on each loop iteration after the issue occurred. The device continued to function, without serial USB, for over 3 hours, until the test was terminated. After the test, MakeCode pairing was non-existent, but could be re-established by removing and reinserting the USB cable, so no change in that respect.

The second outcome observed was that when the issue occurred the attached device reset (ghost + melody). MakeCode pairing was non-existent, but could be re-established by removing and reinserting the USB cable.

In case there were influences from the current MakeCode tab in Edge, it was closed, the device was unpaired from USB, and the test program was downloaded again after pairing.

Summarising, with the changes made in "Beta" to date, the attached micro:bit did not hang or crash fully, but ceased to work in part. There appears to be no change in the response of MakeCode/WebUSB on the host.

jwunderl commented 3 months ago

Okay, I tested using that project and left it running for around 6 hours without reproducing it dropping the serial connection; I did see https://github.com/microsoft/pxt-microbit/issues/5832 reproduce in that, though, so it could be related / investigating in that direction. Thank you for checking on this!

martinwork commented 3 months ago

@edbye Are you still seeing the glitches occur on clock time 15 minute boundaries, as in https://github.com/microsoft/pxt-microbit/issues/5710?

edbye commented 3 months ago

Yes, currently the malfunction occurs on the hour and at half past, on the dot. The timing does seem to shift by 15-minute intervals, but as just said, for the last few days it has occurred precisely on the hour and/or at half past!
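Whether glitches line up with wall-clock boundaries can be checked from logged timestamps. This is a small sketch with made-up timestamps; `secondsFromQuarterHour` and `onQuarterHour` are hypothetical helpers. It uses UTC minutes, on the assumption that local UTC offsets are multiples of 15 minutes, so quarter-hour boundaries coincide.

```typescript
// Hypothetical helper: distance in seconds from a timestamp to the nearest
// 15-minute wall-clock boundary (:00, :15, :30, :45).
function secondsFromQuarterHour(t: Date): number {
    const secondsIntoHour = t.getUTCMinutes() * 60 + t.getUTCSeconds();
    const offset = secondsIntoHour % 900; // 900 s = 15 min
    return Math.min(offset, 900 - offset);
}

// True when the timestamp is within the tolerance of a quarter-hour boundary.
function onQuarterHour(t: Date, toleranceSeconds: number = 10): boolean {
    return secondsFromQuarterHour(t) <= toleranceSeconds;
}

// Made-up example timestamps, for illustration only.
const glitches = [
    new Date("2024-08-27T10:00:03Z"),
    new Date("2024-08-27T10:30:01Z"),
    new Date("2024-08-27T11:07:42Z"),
];
for (const g of glitches) {
    console.log(g.toISOString(), onQuarterHour(g));
}
```

If nearly all recorded glitches pass this check, a scheduled task on the host becomes a more likely trigger than anything running on the micro:bit itself.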

jwunderl commented 3 months ago

I have not been able to reproduce any sort of consistent / timed resets or malfunctions after testing on windows 11 / mac either. After poking around a bit I did see a few references to windows 10 users having similar experiences with keyboards / other usb devices being 'partially connected' that pointed at https://pocketables.com/2016/04/fix-windows-10-keyboard-disconnecting.html as a possible solution (hadn't seen the article before / don't immediately have a windows 10 device available to test). If you are able to test that except selecting micro:bit hardware, would be helpful if we can try and narrow down the root cause as it being consistent / timed like that is very suspicious.

brian-forwardedu commented 3 months ago

With /beta I have been unable to get the micro:bit to hang; however, it does still disconnect if the window isn't active. Before, this could cause the micro:bit to hang, but it doesn't appear to any longer. More testing is needed.

Here are the messages when the microbit disconnects from makecode (the microbit logo flashes as well)

image

edbye commented 3 months ago

Thanks for comments and suggestions, jwunderl

Power and power management was one of the first things I checked; voltage is solid at all times via USB, as measured on the micro:bit power pads. Plus, my specific issue also occurs with battery power.

In Win11 the USB power management option is available via settings – “Bluetooth & devices” > USB = “USB battery saver”, now turned off.

In Device Manager, there are also power management options for each USB controller and hub; all of these are now turned off. In the Beta version, I ran my scroll-chart test program for a period of 3 hours. It played up 5 times: 4 times the micro:bit reset, yet MakeCode (WebUSB) lost pairing. On the other occasion the micro:bit serial data failed (no yellow LED flicker), and of course there was no update to the MakeCode data/graph.

Timing between issues alternated between 30 mins and 45 mins.

In each case the Edge console reported a transfer-out "failed to execute" network error on line 3327, which is in the function “startReadSerial(connectionId) { … }”. Note that this is the same line number as specified by brian-forwardedu but with a different cause.