google / aiyprojects-raspbian

API libraries, samples, and system images for AIY Projects (Voice Kit and Vision Kit)
https://aiyprojects.withgoogle.com/
Apache License 2.0

Vision kit JOY sample stops working #346

Open saket424 opened 6 years ago

saket424 commented 6 years ago

I have a new Vision Kit from Target that I assembled. After working for a few minutes, the Vision Kit JOY sample stops working with the following output in the dmesg log. The unit is powered by a 5 V, 2.5 A power supply. Any suggestions?

[ 252.549751] Unregistered device pwm22
[ 317.266061] ------------[ cut here ]------------
[ 317.266115] kernel BUG at mm/slub.c:3873!
[ 317.266126] Internal error: Oops - BUG: 0 [#1] ARM
[ 317.266138] Modules linked in: fuse aiy_adc(O) industrialio gpio_aiy_io(O) pwm_aiy_io(O) cmac rfcomm bnep aiy_io_i2c(O) leds_ktd202x(O) hci_uart btbcm bluetooth aiy_vision(O) spidev usb_f_rndis u_ether usb_f_acm u_serial brcmfmac brcmutil snd_soc_bcm2835_i2s(O) regmap_mmio cfg80211 snd_soc_core rfkill snd_compress snd_pcm_dmaengine snd_pcm snd_timer i2c_bcm2835 snd spi_bcm2835 bcm2835_gpiomem uio_pdrv_genirq uio fixed pwm_soft(O) i2c_dev libcomposite dwc2 udc_core ip_tables x_tables ipv6
[ 317.266304] CPU: 0 PID: 259 Comm: python3 Tainted: G O 4.9.59+ #1047
[ 317.266315] Hardware name: BCM2835
[ 317.266328] task: d6806d00 task.stack: cccfa000
[ 317.266368] PC is at kfree+0x144/0x18c
[ 317.266406] LR is at kvfree+0x54/0x5c
[ 317.266419] pc : [] lr : [] psr: 40000013 sp : cccfbe18 ip : cccfbe40 fp : cccfbe3c
[ 317.266432] r10: 0000084b r9 : d8826000 r8 : c00eddd0
[ 317.266442] r7 : c00ee078 r6 : c01071b4 r5 : cccfbe50 r4 : cce13f9d
[ 317.266452] r3 : d7e58c4c r2 : 00000100 r1 : 00000100 r0 : cccfbe50
[ 317.266468] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 317.266479] Control: 00c5387d Table: 0cde4008 DAC: 00000055
[ 317.266490] Process python3 (pid: 259, stack limit = 0xcccfa188)
[ 317.266500] Stack: (0xcccfbe18 to 0xcccfc000)
[ 317.266513] be00: 00000001 c0914e10
[ 317.266529] be20: cce13f80 cce13f9d cccfbe34 cccfbe38 cccfbe4c cccfbe40 c01071b4 c0134e58
[ 317.266545] be40: cccfbe6c cccfbe50 c012514c c010716c cce13f80 00000000 000005d3 ae214018
[ 317.266561] be60: cccfbe7c cccfbe70 c0125240 c01250c8 cccfbe94 cccfbe80 bf413c20 c01251f8
[ 317.266576] be80: d509f400 00000000 cccfbefc cccfbe98 bf4154cc bf413bb0 00000000 d509f470
[ 317.266591] bea0: d509f480 ae214008 00000001 00000018 00003a98 00c00010 0000007e 00000000
[ 317.266608] bec0: d6806d00 c004ff58 cccfbec8 cccfbec8 00000000 ae214008 d6857190 d2ee4960
[ 317.266623] bee0: 00000005 00000005 cccfa000 00000000 cccfbf7c cccfbf00 c01575f0 bf414fdc
[ 317.266638] bf00: cccfbf24 c00703cc cccfbf4c cccfbf18 c00703cc c049d27c 80000000 00000009
[ 317.266653] bf20: 00000189 00000000 00000001 00000081 003eb198 c016211c cd3d5480 00c00010
[ 317.266671] bf40: ae214008 be9cdbd4 c0108903 00000005 cccfbf6c d2ee4961 ae214008 d2ee4960
[ 317.266686] bf60: c0108903 00000005 cccfa000 00000000 cccfbfa4 cccfbf80 c0157d6c c0157560
[ 317.266701] bf80: be9cdbd4 00c00010 00378000 be9cdbd4 00000036 c000ffc4 00000000 cccfbfa8
[ 317.266717] bfa0: c000fe40 c0157d34 00c00010 00378000 00000005 c0108903 ae214008 be9cdbd4
[ 317.266734] bfc0: 00c00010 00378000 be9cdbd4 00000036 00000005 c0108903 ae214008 be9cdbd8
[ 317.266749] bfe0: 003782fc be9cdbc4 000362d4 b6d9f8ac 80000010 00000005 17ffa861 17ffac61
[ 317.266789] [] (kfree) from [] (kvfree+0x54/0x5c)
[ 317.266827] [] (kvfree) from [] (__vunmap+0x90/0xe0)
[ 317.266850] [] (__vunmap) from [] (vfree+0x54/0x94)
[ 317.266897] [] (vfree) from [] (transaction_unref+0x7c/0xa4 [aiy_vision])
[ 317.266956] [] (transaction_unref [aiy_vision]) from [] (visionbonnet_ioctl+0x4fc/0x6a8 [aiy_vision])
[ 317.266994] [] (visionbonnet_ioctl [aiy_vision]) from [] (do_vfs_ioctl+0x9c/0x7d4)
[ 317.267014] [] (do_vfs_ioctl) from [] (SyS_ioctl+0x44/0x6c)
[ 317.267051] [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x1c)
[ 317.267069] Code: 1a000003 e5932014 e3120001 1a000000 (e7f001f2)
[ 317.267085] ---[ end trace 8b8f08149c229173 ]---

bryantqo commented 6 years ago

I had the same issue. I ran sudo apt-get dist-upgrade and am testing again now.

The kernel oops continues even after the upgrade.

saket424 commented 6 years ago

More logs from when I run the face detection demo. The errors happen even with the 20180418 image.

[ 82.575807] Stopping timer.
[ 82.628704] Unregistered device pwm22
[ 130.673472] aiy-vision spi0.0: Transaction interrupted tid=1
[ 186.537079] input: raspidisp-keyboard as /devices/virtual/input/input0
[ 243.906471] ------------[ cut here ]------------
[ 243.906528] WARNING: CPU: 0 PID: 1106 at drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.c:947 queue_message+0x464/0x998
[ 243.906537] Modules linked in: evdev cmac rfcomm bnep aiy_adc(O) industrialio gpio_aiy_io(O) pwm_aiy_io(O) spidev leds_ktd202x(O) aiy_io_i2c(O) aiy_vision(O) hci_uart btbcm bluetooth usb_f_acm u_serial brcmfmac brcmutil cfg80211 rfkill i2c_bcm2835 spi_bcm2835 bcm2835_gpiomem fixed uio_pdrv_genirq uio uinput cuse fuse pwm_soft(O) i2c_dev usb_f_ecm g_ether usb_f_rndis u_ether libcomposite dwc2 udc_core ip_tables x_tables ipv6
[ 243.906684] CPU: 0 PID: 1106 Comm: uv4l Tainted: G O 4.9.80+ #1098
[ 243.906689] Hardware name: BCM2835
[ 243.906762] [] (unwind_backtrace) from [] (show_stack+0x20/0x24)
[ 243.906788] [] (show_stack) from [] (dump_stack+0x20/0x28)
[ 243.906820] [] (dump_stack) from [] (__warn+0xe4/0x10c)
[ 243.906840] [] (__warn) from [] (warn_slowpath_null+0x30/0x38)
[ 243.906860] [] (warn_slowpath_null) from [] (queue_message+0x464/0x998)
[ 243.906882] [] (queue_message) from [] (vchiq_queue_message+0x108/0x140)
[ 243.906902] [] (vchiq_queue_message) from [] (vchiq_ioctl+0x434/0x188c)
[ 243.906933] [] (vchiq_ioctl) from [] (do_vfs_ioctl+0x9c/0x7d4)
[ 243.906951] [] (do_vfs_ioctl) from [] (SyS_ioctl+0x44/0x6c)
[ 243.906973] [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x1c)
[ 243.906982] ---[ end trace aa2dcbe0c547b42f ]---
[ 243.907041] Unable to handle kernel NULL pointer dereference at virtual address 00000015
[ 243.907071] pgd = d40a8000
[ 243.907080] [00000015] *pgd=14002831, *pte=00000000, *ppte=00000000
[ 243.907112] Internal error: Oops: 817 [#1] ARM
[ 243.907128] Modules linked in: evdev cmac rfcomm bnep aiy_adc(O) industrialio gpio_aiy_io(O) pwm_aiy_io(O) spidev leds_ktd202x(O) aiy_io_i2c(O) aiy_vision(O) hci_uart btbcm bluetooth usb_f_acm u_serial brcmfmac brcmutil cfg80211 rfkill i2c_bcm2835 spi_bcm2835 bcm2835_gpiomem fixed uio_pdrv_genirq uio uinput cuse fuse pwm_soft(O) i2c_dev usb_f_ecm g_ether usb_f_rndis u_ether libcomposite dwc2 udc_core ip_tables x_tables ipv6
[ 243.907258] CPU: 0 PID: 1106 Comm: uv4l Tainted: G W O 4.9.80+ #1098
[ 243.907272] Hardware name: BCM2835
[ 243.907286] task: d5384420 task.stack: ce052000
[ 243.907309] PC is at remote_event_signal+0x20/0x4c
[ 243.907325] LR is at queue_message+0x190/0x998
[ 243.907339] pc : [] lr : [] psr: 20000013 sp : ce053d78 ip : ce053d88 fp : ce053d84
[ 243.907353] r10: 00000010 r9 : c09c7b68 r8 : c0921524
[ 243.907365] r7 : 00000002 r6 : d75bee00 r5 : ce053ea0 r4 : d5333e00
[ 243.907375] r3 : 00000000 r2 : 00000001 r1 : 00000000 r0 : 00000011
[ 243.907390] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 243.907403] Control: 00c5387d Table: 140a8008 DAC: 00000055
[ 243.907414] Process uv4l (pid: 1106, stack limit = 0xce052188)
[ 243.907428] Stack: (0xce053d78 to 0xce054000)
[ 243.907444] 3d60: ce053e04 ce053d88
[ 243.907461] 3d80: c04cbebc c04d5a80 00000310 ffff0a11 00000000 00000200 00000200 c04d4a74
[ 243.907478] 3da0: 00000006 00000003 00000000 d7580194 0500300a ce053e90 c0921524 00000005
[ 243.907497] 3dc0: d5333e00 c09cbe38 fff9fee9 ffffffff 00005007 00000000 ce053e3c 0000000c
[ 243.907514] 3de0: 00000002 00000010 d5333e00 ce053e90 d5333e00 00000000 ce053e3c ce053e08
[ 243.907529] 3e00: c04ce89c c04cbd38 00000002 00000010 00000001 00000010 ce053e90 400cc404
[ 243.907548] 3e20: c0921530 d6821000 00000010 00000000 ce053efc ce053e40 c04d2d58 c04ce7a0
[ 243.907564] 3e40: ae4e5d1c 00000001 d401e828 ce053ea0 0000a000 00000000 0001d000 0001d000
[ 243.907582] 3e60: 00000000 00000002 00000018 d401e828 00000000 d6bbe190 00000000 00000000
[ 243.907599] 3e80: ce053f3c 00004003 00000002 ae4e5cb8 ae4e5cac 00000004 ae4e5ce4 0000000c
[ 243.907616] 3ea0: 00000001 00000000 00000000 c004dab4 000660d0 00000000 d5384420 c004db28
[ 243.907633] 3ec0: c9b70b68 c004dd94 ce053f0c ce053ed8 00000000 ae4e5c84 d7325cd0 d401ed20
[ 243.907652] 3ee0: 00000006 00000006 ce052000 00000000 ce053f7c ce053f00 c0157a58 c04d2930
[ 243.907670] 3f00: ce053f2c c00705d4 ce053f4c ce053f18 c00705d4 c04a9e7c 567b9fee 00076d83
[ 243.907687] 3f20: ffffffff ce053f88 ae4e5cc8 00000000 0000004e c0162584 d514f480 000000b8
[ 243.907704] 3f40: ae4e5c84 ae4e5c84 400cc404 00000006 ce053f6c d401ed21 ae4e5c84 d401ed20
[ 243.907721] 3f60: 400cc404 00000006 ce052000 00000000 ce053fa4 ce053f80 c01581d4 c01579c8
[ 243.907737] 3f80: 00004003 000000b8 b6a65334 ae4e5c84 00000036 c000ffc4 00000000 ce053fa8
[ 243.907754] 3fa0: c000fe40 c015819c 000000b8 b6a65334 00000006 400cc404 ae4e5c84 00004003
[ 243.907771] 3fc0: 000000b8 b6a65334 ae4e5c84 00000036 b5382a98 becf9588 b4a02bb0 0018316c
[ 243.907790] 3fe0: b6a65240 ae4e5c7c b6a5403c b6bad80c 60000010 00000006 400636e6 13000000
[ 243.907828] [] (remote_event_signal) from [] (queue_message+0x190/0x998)
[ 243.907850] [] (queue_message) from [] (vchiq_queue_message+0x108/0x140)
[ 243.907874] [] (vchiq_queue_message) from [] (vchiq_ioctl+0x434/0x188c)
[ 243.907904] [] (vchiq_ioctl) from [] (do_vfs_ioctl+0x9c/0x7d4)
[ 243.907927] [] (do_vfs_ioctl) from [] (SyS_ioctl+0x44/0x6c)
[ 243.907957] [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x1c)
[ 243.907982] Code: e8bd4000 e3a03000 ee073f9a e3a02001 (e5802004)
[ 243.908082] ---[ end trace aa2dcbe0c547b430 ]---

cmrigney commented 6 years ago

I have the same issue but haven’t had the chance to check dmesg. I just bought it from Target today, and it seems to freeze randomly, although it usually works for half a minute or so. I'm using a 2.1 A supply. I hope this gets resolved soon. Stinks that it’s not working right after I bought it.

cmrigney commented 6 years ago

It just now froze in the midst of starting and pulsing the piezo, so the piezo beeped until I pulled the plug. Seems random.

cmrigney commented 6 years ago

I’m starting to think it’s a hardware issue, because the longer I leave it off before starting it back up, the longer it works without freezing. I’m going to exchange it and see if that fixes it.

bryantqo commented 6 years ago

I am seeing the same thing. Also, leaving the back open after it has cooled off makes it run longer. I am going to pull mine out of the box and put a CPU fan on it.

cmrigney commented 6 years ago

Status update. I exchanged it at Target and the new one works flawlessly! It’s as if there are some that have an overheating issue. But the new one is fully enclosed and all, no mods, yet works fine.

Shows me for always skipping the first item on the shelf for fear of buying a manhandled product :D

saket424 commented 6 years ago

I bought three of them. Two were faulty and I had to return them; the third one was fine. So it is definitely some marginal Movidius hardware causing us grief.

cmrigney commented 6 years ago

Hmm, that’s insightful. I wonder who’s in charge of production quality? Anyone from Google have any input?

PeterMalkin commented 6 years ago

We are investigating the issue. But in re: production, the manufacturing line producing the units is the same for all kits. The chipsets from Intel Movidius have to pass QA when they come off their manufacturing line, so I doubt Intel wouldn't catch a problem if there was one. On our production line we have manufacturing tests that cover all aspects of the hardware. We test the boards before and after component soldering, and have a different set of tests for every part of the device functionality.

cmrigney commented 6 years ago

Thanks for the insight @PeterMalkin. I'm guessing you guys test the SD cards as well? I know we've had an issue with faulty SD cards at a company I worked at previously. We also had issues where our product worked well in production testing, but then it would fail when the customer got it (it turned out to be a voltage spike randomly frying the board upon plugging it in). Not suggesting anything like that here. All that was to say, I have empathy for the grief this is probably causing you guys.

PeterMalkin commented 6 years ago

@cmrigney, since we rely on the Raspberry Pi to provide power to the bonnet, I doubt we would have power spike issues. ESD is likely, however: if a user accumulates a static charge during unboxing and zaps the board, that could cause damage. As for the SD cards, we have produced a verified good image for the factory, and they use a machine that writes all SD cards in parallel and checksums the written bits. So no, we do not test SD cards by booting from them, but we do verify that they are all exactly the same. However, please keep in mind that all the software development happening for the kits now will be released through updates, so please keep an eye out for new SD card images at aiyprojects.withgoogle.com. And thank you for being patient.

PeterMalkin commented 6 years ago

Hey,

We are having a hard time reproducing this reliably on our end. If any of you happen to see this again, would you be willing to mail the kit over? I will mail back a new kit and a $20 gift card to compensate for shipping. That would really help us get to the bottom of this.

Thanks!!

lleonid commented 6 years ago

@saket424, @bryantqo what's the exact setup when you're running into those issues? In particular: Is a display connected to the mini HDMI port on the Pi? Is the Pi connected to Wi-Fi and/or via the micro USB data port? What does your camera face, i.e. your face, a monitor, a wall? Did you attempt to disassemble the kit? It's quite easy to damage the Vision Bonnet when removing the white standoffs from it. If this happens again, could you run 'vcgencmd measure_temp' on the Pi and report back the output, please?

@bryantqo, do you also have the Pi connected to a 2.1+ A USB power supply?

bryantqo commented 6 years ago

Yes, the power supply is sufficient. I will attempt to run the command; however, I think the last time I ran it I ended up with a kernel panic which flatlined the system. If I ssh into the board and stop the joy demo before the crash, the OS keeps running.

HDMI: No
WiFi: Yes
Power: power micro USB
Disassembled: No
Camera: Facing the desk

ScottBriening commented 6 years ago

I am experiencing the same thing. @PeterMalkin, I am willing to mail you my Vision Kit rather than exchange it at Target. (For Science. And for the community.) Overheating seems to be a likely culprit. I was debating slapping a heat sink on the Zero to see if that would fix the problem, but will hold off until I hear back from you. In the meantime I'll work with the Voice Kit. Would love to try the Android Things setup. And Dialogflow and Firebase, and still hafta try gVisor... So much to do, so little time!

PeterMalkin commented 6 years ago

@ScottBriening It would be extremely kind of you to mail your kit here. I will mail you a new one back. Would you be so kind as to shoot me an email at petermalkin@google.com so we can coordinate the exchange? Thank you!

dtreskunov commented 6 years ago

I'm also having an issue with my Vision Kit where very shortly after starting the joy detection demo my ssh session becomes unresponsive. It seems related to the board getting too hot (which board?):

pi@raspberrypi:~ $ while true; do vcgencmd measure_temp;  sleep 5; done;
temp=43.3'C
temp=43.3'C
temp=44.4'C
temp=43.9'C
temp=43.3'C
temp=43.3'C
temp=44.9'C <- start joy detector demo
temp=46.5'C
temp=49.2'C
temp=49.8'C
temp=48.7'C
temp=49.2'C
temp=50.3'C
temp=51.4'C
temp=50.8'C <- chime sounds
temp=49.8'C
temp=51.4'C
temp=50.8'C
temp=53.0'C <- last reading printed to the console before I have to cut the power
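
The last reading above is only whatever reached the SSH session before the hang. A minimal sketch of the same loop in Python that also appends each reading to a file and syncs it to disk, so the final values survive a power cut. The log path and the helper names are hypothetical, and the parser assumes `vcgencmd measure_temp` prints `temp=NN.N'C` exactly as in the transcript above.

```python
import os
import re
import subprocess
import time

# Matches the format seen in the transcript, e.g. "temp=43.3'C".
TEMP_RE = re.compile(r"temp=([0-9.]+)'C")


def parse_temp(line):
    """Return the temperature in Celsius from a line like "temp=43.3'C"."""
    match = TEMP_RE.search(line)
    if match is None:
        raise ValueError("unexpected vcgencmd output: %r" % line)
    return float(match.group(1))


def read_temp():
    """Query the firmware for the SoC temperature (only works on a Raspberry Pi)."""
    out = subprocess.check_output(["vcgencmd", "measure_temp"]).decode()
    return parse_temp(out)


def log_temps(path="temp.log", interval=5.0, count=None):
    """Append "<unix time> <temp>" lines to path; count=None means run until stopped."""
    done = 0
    with open(path, "a") as log:
        while count is None or done < count:
            log.write("%.1f %.1f\n" % (time.time(), read_temp()))
            log.flush()
            os.fsync(log.fileno())  # force the line to the SD card before a possible hang
            time.sleep(interval)
            done += 1
```

On the Pi, calling `log_temps()` before starting the demo leaves a file whose last line is the final reading taken before the freeze.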

I should note that I was seeing oops errors in dmesg output (and the demo stopped working but the OS seemed kind of OK-ish) before I switched the power supply from 1.5A to 2.1A. After this change, I can't even check dmesg output because nothing works anymore.

dmitriykovalev commented 6 years ago

@dtreskunov What kind of oops have you seen? Could you attach an HDMI monitor to get the exact error message?

Your max temperature (53.0 C) is actually pretty low. In my experiments the Vision Kit was working fine when the Pi was at 90.0 C. Everything below 85.0 C is considered normal, I guess.

dtreskunov commented 6 years ago

@dmitriykovalev Here's the dmesg output after joy_detection_demo.py stops working (shortly after being started):

[  296.624859] Starting timer (pulse).
[  296.625076] Starting timer (period).
[  296.625291] Starting timer (pulse).
[  296.625523] Starting timer (period).
[  296.701112] Stopping timer.
[  296.719119] Unregistered device pwm22
[  296.804532] Unable to handle kernel NULL pointer dereference at virtual address 00000072
[  296.804578] pgd = d0100000
[  296.804589] [00000072] *pgd=101cc831, *pte=00000000, *ppte=00000000
[  296.804624] Internal error: Oops: 17 [#1] ARM
[  296.804638] Modules linked in: aiy_adc(O) pwm_aiy_io(O) gpio_aiy_io(O) industrialio cmac rfcomm bnep leds_ktd202x(O) aiy_io_i2c(O) hci_uart btbcm serdev bluetooth ecdh_generic aiy_vision(O) spidev usb_f_rndis u_ether usb_f_acm u_serial brcmfmac brcmutil cfg80211 rfkill snd_soc_bcm2835_i2s regmap_mmio i2c_bcm2835 snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer spi_bcm2835 snd uio_pdrv_genirq fixed uio uinput cuse fuse pwm_soft(O) i2c_dev libcomposite dwc2 udc_core ip_tables x_tables ipv6
[  296.804823] CPU: 0 PID: 1040 Comm: python3 Tainted: G           O    4.14.34+ #1110
[  296.804833] Hardware name: BCM2835
[  296.804845] task: cf72e120 task.stack: d01c6000
[  296.804894] PC is at free_pcppages_bulk+0x274/0x498
[  296.804910] LR is at 0xd7ca70b4
[  296.804919] pc : [<c00fb068>]    lr : [<d7ca70b4>]    psr: 60000093
[  296.804930] sp : d01c7db8  ip : d7ca70a0  fp : d01c7e04
[  296.804938] r10: 00000000  r9 : d7ca710c  r8 : 00000001
[  296.804950] r7 : ffffffff  r6 : 00000ce8  r5 : c09be3d0  r4 : c09be368
[  296.804962] r3 : 00000034  r2 : c09be368  r1 : 00000002  r0 : d7cb7494
[  296.804978] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  296.804988] Control: 00c5387d  Table: 10100008  DAC: 00000055
[  296.805000] Process python3 (pid: 1040, stack limit = 0xd01c6188)
[  296.805011] Stack: (0xd01c7db8 to 0xd01c8000)
[  296.805024] 7da0:                                                       00000000 00000001
[  296.805039] 7dc0: d7ca710c c09be368 d7c72af8 d7c72aec 0000001c 0000001c d881f000 60000013
[  296.805054] 7de0: d7c72aec 0000001f c09be368 00000000 00000000 d7cad2a4 d01c7e44 d01c7e08
[  296.805070] 7e00: c00fd0ac c00fae00 d01c7e2c d01c7e18 c0136388 c09be368 00000002 00000000
[  296.805085] 7e20: d514e680 dbf89000 00000001 00000018 d010a4c0 d010a4dc d01c7e5c d01c7e48
[  296.805100] 7e40: c00fd13c c00fce94 d881f000 00000565 d01c7e7c d01c7e60 c0137bdc c00fd0f8
[  296.805116] 7e60: d010a4c0 00000000 000001eb ae214018 d01c7e8c d01c7e80 c0137cfc c0137b84
[  296.805131] 7e80: d01c7ea4 d01c7e90 bf433ba8 c0137cb4 d6ae6b00 00000000 d01c7f0c d01c7ea8
[  296.805147] 7ea0: bf4354cc bf433b38 d01d8de0 d6ae6b70 d6ae6b80 ae214008 c004c360 00000018
[  296.805163] 7ec0: 00001388 00c00010 000000a3 00000000 cf72e120 c00531d4 d01c7ed8 d01c7ed8
[  296.805180] 7ee0: c0046aec ae214008 d39a5c10 d01f55a0 00000006 00000006 d01c6000 00000000
[  296.805196] 7f00: d01c7f7c d01c7f10 c016c858 bf434fdc d01c7f74 d01c7f20 c06455ac c0046a9c
[  296.805210] 7f20: 00000001 00400000 fffffeb8 0000000a 1554f240 c01774c4 d0187600 00c00010
[  296.805226] 7f40: ae214008 bebdce54 c0108903 00000006 d01c7f6c d01f55a1 ae214008 d01f55a0
[  296.805241] 7f60: c0108903 00000006 d01c6000 00000000 d01c7fa4 d01c7f80 c016cf54 c016c7c8
[  296.805256] 7f80: bebdce54 00c00010 00378000 bebdce54 00000036 c000ff64 00000000 d01c7fa8
[  296.805273] 7fa0: c000fdc0 c016cf1c 00c00010 00378000 00000006 c0108903 ae214008 bebdce54
[  296.805289] 7fc0: 00c00010 00378000 bebdce54 00000036 00000006 c0108903 ae214008 bebdce58
[  296.805305] 7fe0: 003782fc bebdce44 000362d4 b6df080c 80000010 00000006 00000000 00000000
[  296.805349] [<c00fb068>] (free_pcppages_bulk) from [<c00fd0ac>] (free_hot_cold_page+0x224/0x264)
[  296.805371] [<c00fd0ac>] (free_hot_cold_page) from [<c00fd13c>] (__free_pages+0x50/0x54)
[  296.805402] [<c00fd13c>] (__free_pages) from [<c0137bdc>] (__vunmap+0x64/0xe0)
[  296.805422] [<c0137bdc>] (__vunmap) from [<c0137cfc>] (vfree+0x54/0x94)
[  296.805468] [<c0137cfc>] (vfree) from [<bf433ba8>] (transaction_unref+0x7c/0xa4 [aiy_vision])
[  296.805528] [<bf433ba8>] (transaction_unref [aiy_vision]) from [<bf4354cc>] (visionbonnet_ioctl+0x4fc/0x6a8 [aiy_vision])
[  296.805559] [<bf4354cc>] (visionbonnet_ioctl [aiy_vision]) from [<c016c858>] (do_vfs_ioctl+0x9c/0x754)
[  296.805580] [<c016c858>] (do_vfs_ioctl) from [<c016cf54>] (SyS_ioctl+0x44/0x6c)
[  296.805611] [<c016cf54>] (SyS_ioctl) from [<c000fdc0>] (ret_fast_syscall+0x0/0x28)
[  296.805632] Code: e58c0014 e7842003 e3a03034 e51b2040 (e0212193)
[  296.805650] ---[ end trace ee6907230b405e54 ]---
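
Both python3 traces in this thread pass through transaction_unref and visionbonnet_ioctl in the aiy_vision module before reaching the kernel's memory-freeing code. A small sketch for pulling that call chain out of saved dmesg text so traces from different kits can be compared; the function names are hypothetical, and a module appearing in the chain only shows where the call came from, not that the module is at fault.

```python
import re

# One symbol in a backtrace line: "(vfree+0x54/0x94)" or
# "(transaction_unref [aiy_vision])"; the offset and module are optional.
SYMBOL_RE = re.compile(r"\((\w+)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?(?: \[(\w+)\])?\)")


def call_chain(dmesg_text):
    """Return [(function, module_or_None), ...] from innermost to outermost frame."""
    chain = []
    for line in dmesg_text.splitlines():
        if ") from [" not in line:
            continue  # not a "(callee) from [addr] (caller)" backtrace line
        frames = SYMBOL_RE.findall(line)
        if len(frames) != 2:
            continue
        callee, caller = frames
        if not chain:
            chain.append(callee)  # the innermost frame appears only as a callee
        chain.append(caller)
    return [(name, module or None) for name, module in chain]


def modules_in_chain(chain):
    """List the out-of-tree or loadable modules named anywhere in the chain."""
    return sorted({module for _, module in chain if module})
```

Run against the trace above, `modules_in_chain(call_chain(text))` names only aiy_vision, since the remaining frames are in the core kernel.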

dtreskunov commented 6 years ago

Running a stress test on my Raspberry Pi doesn't result in any sort of badness. I think it's a hardware problem with the Vision Bonnet. I'm planning to return my AIY Vision Kit for a replacement.

pi@raspberrypi:~ $ stress --help
`stress' imposes certain types of compute stress on your system

Usage: stress [OPTION [ARG]] ...
 -?, --help         show this help statement
     --version      show version statement
 -v, --verbose      be verbose
 -q, --quiet        be quiet
 -n, --dry-run      show what would have been done
 -t, --timeout N    timeout after N seconds
     --backoff N    wait factor of N microseconds before work starts
 -c, --cpu N        spawn N workers spinning on sqrt()
 -i, --io N         spawn N workers spinning on sync()
 -m, --vm N         spawn N workers spinning on malloc()/free()
     --vm-bytes B   malloc B bytes per vm worker (default is 256MB)
     --vm-stride B  touch a byte every B bytes (default is 4096)
     --vm-hang N    sleep N secs before free (default none, 0 is inf)
     --vm-keep      redirty memory instead of freeing and reallocating
 -d, --hdd N        spawn N workers spinning on write()/unlink()
     --hdd-bytes B  write B bytes per hdd worker (default is 1GB)

Example: stress --cpu 8 --io 4 --vm 2 --vm-bytes 128M --timeout 10s

Note: Numbers may be suffixed with s,m,h,d,y (time) or B,K,M,G (size).
pi@raspberrypi:~ $ stress --timeout 60s -c 2 -i 2 -m 1 -d 1
stress: info: [1294] dispatching hogs: 2 cpu, 2 io, 1 vm, 1 hdd
stress: info: [1294] successful run completed in 61s

pi@raspberrypi:~ $ stress --timeout 120s --cpu 2 --io 2 --vm 2 --vm-bytes 128M --hdd 2
stress: info: [1630] dispatching hogs: 2 cpu, 2 io, 2 vm, 2 hdd
stress: info: [1630] successful run completed in 123s

dmitriykovalev commented 6 years ago

@dtreskunov Could you try to add

over_voltage=4
over_voltage_min=4

at the beginning of /boot/config.txt? I'm still checking, but it looks like it helps to eliminate a similar problem. More context is here: https://www.raspberrypi.org/forums/viewtopic.php?f=43&t=212777
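
A minimal sketch of applying that change from Python so it can be repeated without piling up duplicate lines. The helper name is hypothetical; it works on the file's text only, so reading and writing /boot/config.txt (which needs root) and rebooting are left to the caller.

```python
def with_settings(config_text, settings):
    """Return config_text with the given key=value settings placed at the top.

    Any existing uncommented assignment of the same key is dropped, so applying
    the function twice gives the same result as applying it once.
    """
    kept = []
    for line in config_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in settings:
            continue  # an earlier assignment of a key we are about to set
        kept.append(line)
    header = ["%s=%s" % (key, value) for key, value in settings.items()]
    return "\n".join(header + kept) + "\n"


# The two values suggested in the comment above.
SUGGESTED = {"over_voltage": 4, "over_voltage_min": 4}
```

For example, `with_settings(open("/boot/config.txt").read(), SUGGESTED)` returns the new file contents; commented-out lines such as `#over_voltage=2` are left untouched.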

dtreskunov commented 6 years ago

echo over_voltage=4 | sudo tee -a /boot/config.txt && sudo shutdown -r now

@dmitriykovalev Thanks so much! You just saved me a trip to the store. The problem seems to be gone after adding over_voltage=4 to /boot/config.txt.

At idle, the CPU clock is 700 MHz and the voltage is 1.2 V. While joy_detection_demo.py is running, the clock goes up to 1 GHz and the voltage to 1.4 V. Before the /boot/config.txt change, the voltage would not increase when the CPU was throttled up. For the record, this Raspberry Pi is reporting temps around 65'C while the demo is running.

For my own reference, here's some documentation on how to take these measurements using the vcgencmd utility.
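
A sketch of taking the clock and voltage readings mentioned above from Python. It assumes `vcgencmd measure_clock arm` prints a line like `frequency(45)=700000000` and `vcgencmd measure_volts core` prints a line like `volt=1.2000V`; those output formats are not shown in this thread, so treat them as assumptions, and the helper names are hypothetical.

```python
import re
import subprocess


def parse_clock_hz(line):
    """Parse e.g. "frequency(45)=700000000" into an integer number of Hz."""
    match = re.search(r"frequency\(\d+\)=(\d+)", line)
    if match is None:
        raise ValueError("unexpected measure_clock output: %r" % line)
    return int(match.group(1))


def parse_volts(line):
    """Parse e.g. "volt=1.2000V" into a float number of volts."""
    match = re.search(r"volt=([0-9.]+)V", line)
    if match is None:
        raise ValueError("unexpected measure_volts output: %r" % line)
    return float(match.group(1))


def vcgencmd(*args):
    """Run vcgencmd with the given arguments (only works on a Raspberry Pi)."""
    return subprocess.check_output(("vcgencmd",) + args).decode().strip()


def snapshot():
    """Return the current ARM clock and core voltage as a dict."""
    return {
        "arm_hz": parse_clock_hz(vcgencmd("measure_clock", "arm")),
        "core_volts": parse_volts(vcgencmd("measure_volts", "core")),
    }
```

Calling `snapshot()` once at idle and once while the demo runs gives the two pairs of numbers to compare before and after the config change.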

dmitriykovalev commented 6 years ago

@dtreskunov Great, I'm glad this solution helped! @ScottBriening, thank you very much for sending your board to us for experiments, it has the same issue and the same fix removed the problem. We still need time to understand why :)

It's interesting that you didn't use over_voltage_min which affects dynamic frequency clocking, https://www.raspberrypi.org/documentation/configuration/config-txt/overclocking.md

ScottBriening commented 6 years ago

@dmitriykovalev @PeterMalkin YAY!!! I just sent you an email suggesting a similar solution, but it looks like you beat me to the punch. Nice work guys. Now let's make some mAgIc!

ScottBriening commented 6 years ago

@dmitriykovalev This is interesting indeed. So it's just the camera with pi0 combo that is the problem. I'm surprised this hasn't popped up more often or been addressed by the folks at Pi.

From the raspberrypi.org docs, the default over_voltage setting for the pi zero is 6 (1.35V) and over_voltage_min is 0. By setting over_voltage to 4 (1.3V) it should decrease the voltage supplied by 0.05V. Setting over_voltage_min to 4 should ensure the voltage supplied goes no lower than 1.3 when the system is idle. It's great that these settings allow cameras to work with pi0, but... Why? (As it says in the link you provided "I don't know what to believe".)
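
The two figures quoted above (6 for 1.35 V, 4 for 1.3 V) fit a linear mapping of 0.025 V per step from a 1.2 V base. A tiny sketch of that arithmetic; the base and step are assumptions inferred from those two values rather than something stated in this thread, and the function name is hypothetical.

```python
def core_voltage(over_voltage, base=1.2, step=0.025):
    """Estimate the core voltage for an over_voltage setting.

    base and step are assumed values chosen to match the two data points
    quoted in the comment above (6 -> 1.35 V, 4 -> 1.3 V).
    """
    return round(base + step * over_voltage, 3)


# Difference between the quoted default (6) and the suggested setting (4).
DROP = round(core_voltage(6) - core_voltage(4), 3)
```

Under these assumptions `DROP` comes out to 0.05, matching the 0.05 V decrease described above.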

Would love to tinker. Frying a couple pi0s isn't a big deal, but I certainly don't wanna lose the vision hat! Oh and you're welcome, of course. Happy to help! Got the new kit put together and so far so good. Oh, save the buzzer. Either I don't have enough joy or it's a dud. I'll get around to replacing it sometime.

ajaymdesai commented 6 years ago

I had this problem but it's fixed based on the over_voltage fix. Thanks.

phillipeloher commented 6 years ago

Same issue here but not fixed with over_voltage. Returning soon.

dmitriykovalev commented 6 years ago

@phillipeloher If possible, can you provide any additional details about your crash? e.g. dmesg output

joshkapple commented 6 years ago

I'm getting the same issue if I allow the joy detection demo to run after bootup. I have HDMI connected to a monitor and a mini unpowered USB hub to connect my Logitech wireless keyboard/trackpad (1 USB dongle connected).

Same issue after applying the voltage fix to /boot/config.txt.

The dmesg output, with the kernel error right before it locked up, is below. I bought this from Target about a month ago and just got around to playing with it today.

Running apt-get upgrade to see if that helps.

pi@raspberrypi:~ $ dmesg
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.9.59+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1047 Sun Oct 29 11:47:10 GMT 2017
[ 0.000000] CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5387d
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[ 0.000000] OF: fdt:Machine model: Raspberry Pi Zero W Rev 1.1
[ 0.000000] cma: Reserved 8 MiB at 0x17400000
[ 0.000000] Memory policy: Data cache writeback
[ 0.000000] On node 0 totalpages: 98304
[ 0.000000] free_area_init_node: node 0, pgdat c0914e10, node_mem_map d7c8b900
[ 0.000000] Normal zone: 864 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 98304 pages, LIFO batch:31
[ 0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 97440
[ 0.000000] Kernel command line: 8250.nr_uarts=0 bcm2708_fb.fbwidth=1824 bcm2708_fb.fbheight=984 bcm2708_fb.fbswap=1 smsc95xx.macaddr=B8:27:EB:E3:F2:6A vc_mem.mem_base=0x1ec00000 vc_mem.mem_size=0x20000000 dwc_otg.lpm_enable=0 console=ttyS0,115200 console=tty1 root=PARTUUID=86b6fce0-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait quiet splash plymouth.ignore-serial-consoles
[ 0.000000] PID hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Memory: 370980K/393216K available (5950K kernel code, 491K rwdata, 1948K rodata, 396K init, 725K bss, 14044K reserved, 8192K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
    vector : 0xffff0000 - 0xffff1000 ( 4 kB)
    fixmap : 0xffc00000 - 0xfff00000 (3072 kB)
    vmalloc : 0xd8800000 - 0xff800000 ( 624 MB)
    lowmem : 0xc0000000 - 0xd8000000 ( 384 MB)
    modules : 0xbf000000 - 0xc0000000 ( 16 MB)
    .text : 0xc0008000 - 0xc05d7a48 (5951 kB)
    .init : 0xc0841000 - 0xc08a4000 ( 396 kB)
    .data : 0xc08a4000 - 0xc091ef48 ( 492 kB)
    .bss : 0xc091ef48 - 0xc09d4648 ( 726 kB)
[ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS:16 nr_irqs:16 16
[ 0.000031] sched_clock: 32 bits at 1000kHz, resolution 1000ns, wraps every 2147483647500ns
[ 0.000063] clocksource: timer: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275 ns
[ 0.000155] bcm2835: system timer (irq = 27)
[ 0.000631] Console: colour dummy device 80x30
[ 0.000656] console [tty1] enabled
[ 0.000681] Calibrating delay loop... 697.95 BogoMIPS (lpj=3489792)
[ 0.060313] pid_max: default: 32768 minimum: 301
[ 0.060740] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.060756] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.062002] Disabling memory control group subsystem
[ 0.062138] CPU: Testing write buffer coherency: ok
[ 0.062195] ftrace: allocating 21715 entries in 64 pages
[ 0.180401] Setting up static identity map for 0x8200 - 0x8238
[ 0.182376] devtmpfs: initialized
[ 0.192676] VFP support v0.3: implementor 41 architecture 1 part 20 variant b rev 5
[ 0.193104] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.193131] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.194409] pinctrl core: initialized pinctrl subsystem
[ 0.195840] NET: Registered protocol family 16
[ 0.198387] DMA: preallocated 1024 KiB pool for atomic coherent allocations
[ 0.208081] hw-breakpoint: found 6 breakpoint and 1 watchpoint registers.
[ 0.208098] hw-breakpoint: maximum watchpoint size is 4 bytes.
[ 0.208200] Serial: AMBA PL011 UART driver
[ 0.211124] bcm2835-mbox 2000b880.mailbox: mailbox enabled
[ 0.211851] uart-pl011 20201000.serial: could not find pctldev for node /soc/gpio@7e200000/uart0_pins, deferring probe
[ 0.262757] bcm2835-dma 20007000.dma: DMA legacy API manager at d880d000, dmachans=0x1
[ 0.265428] SCSI subsystem initialized
[ 0.265674] usbcore: registered new interface driver usbfs
[ 0.265785] usbcore: registered new interface driver hub
[ 0.265982] usbcore: registered new device driver usb
[ 0.270451] raspberrypi-firmware soc:firmware: Attached to firmware from 2017-08-08 12:05
[ 0.272469] clocksource: Switched to clocksource timer
[ 0.326617] VFS: Disk quotas dquot_6.6.0
[ 0.326736] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[ 0.327059] FS-Cache: Loaded
[ 0.327405] CacheFiles: Loaded
[ 0.346835] NET: Registered protocol family 2
[ 0.348208] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.348298] TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.348392] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.348489] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.348516] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.348814] NET: Registered protocol family 1
[ 0.349558] RPC: Registered named UNIX socket transport module.
[ 0.349570] RPC: Registered udp transport module.
[ 0.349575] RPC: Registered tcp transport module.
[ 0.349580] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 0.350847] hw perfevents: enabled with armv6_1176 PMU driver, 3 counters available
[ 0.353419] workingset: timestamp_bits=14 max_order=17 bucket_order=3
[ 0.374118] FS-Cache: Netfs 'nfs' registered for caching
[ 0.375890] NFS: Registering the id_resolver key type
[ 0.375937] Key type id_resolver registered
[ 0.375943] Key type id_legacy registered
[ 0.380404] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[ 0.380768] io scheduler noop registered
[ 0.380781] io scheduler deadline registered (default)
[ 0.381256] io scheduler cfq registered
[ 0.387195] BCM2708FB: allocated DMA memory 57500000
[ 0.387256] BCM2708FB: allocated DMA channel 0 @ d880d000
[ 0.432466] Console: switching to colour frame buffer device 228x61
[ 0.469160] bcm2835-rng 20104000.rng: hwrng registered
[ 0.469340] vc-mem: phys_addr:0x00000000 mem_base=0x1ec00000 mem_size:0x20000000(512 MiB)
[ 0.470382] vc-sm: Videocore shared memory driver
[ 0.498331] brd: module loaded
[ 0.512140] loop: module loaded
[ 0.512164] Loading iSCSI transport class v2.0-870.
[ 0.513313] usbcore: registered new interface driver smsc95xx [ 0.513347] dwc_otg: version 3.00a 10-AUG-2012 (platform bus) [ 0.513557] dwc_otg: FIQ enabled [ 0.513567] dwc_otg: NAK holdoff enabled [ 0.513574] dwc_otg: FIQ split-transaction FSM enabled [ 0.513594] Module dwc_common_port init [ 0.514072] usbcore: registered new interface driver usb-storage [ 0.514601] mousedev: PS/2 mouse device common for all mice [ 0.516425] bcm2835-wdt 20100000.watchdog: Broadcom BCM2835 watchdog timer [ 0.517059] bcm2835-cpufreq: min=700000 max=1000000 [ 0.517759] sdhci: Secure Digital Host Controller Interface driver [ 0.517768] sdhci: Copyright(c) Pierre Ossman [ 0.518276] sdhost-bcm2835 20202000.sdhost: could not get clk, deferring probe [ 0.520742] mmc-bcm2835 20300000.mmc: could not get clk, deferring probe [ 0.520932] sdhci-pltfm: SDHCI platform and OF driver helper [ 0.521542] ledtrig-cpu: registered to indicate activity on CPUs [ 0.521730] hidraw: raw HID events driver (C) Jiri Kosina [ 0.522027] usbcore: registered new interface driver usbhid [ 0.522037] usbhid: USB HID core driver [ 0.523542] vchiq: vchiq_init_state: slot_zero = 0xd7580000, is_master = 0

[ 0.536459] [vc_sm_connected_init]: end - returning 0 [ 0.537022] Initializing XFRM netlink socket [ 0.537072] NET: Registered protocol family 17 [ 0.537245] Key type dns_resolver registered [ 0.539339] registered taskstats version 1 [ 0.549255] uart-pl011 20201000.serial: cts_event_workaround enabled [ 0.549404] 20201000.serial: ttyAMA0 at MMIO 0x20201000 (irq = 81, base_baud = 0) is a PL011 rev2 [ 0.552082] sdhost: log_buf @ d7510000 (57510000) [ 0.632574] mmc0: sdhost-bcm2835 loaded - DMA enabled (>1) [ 0.635186] mmc-bcm2835 20300000.mmc: mmc_debug:0 mmc_debug2:0 [ 0.635202] mmc-bcm2835 20300000.mmc: DMA channel allocated [ 0.671779] random: fast init done [ 0.710087] mmc0: host does not support reading read-only switch, assuming write-enable [ 0.712230] mmc0: new high speed SDHC card at address 59b4 [ 0.712703] of_cfs_init [ 0.712843] of_cfs_init: OK [ 0.713845] Waiting for root device PARTUUID=86b6fce0-02... [ 0.715399] mmcblk0: mmc0:59b4 SD 7.36 GiB [ 0.721445] mmcblk0: p1 p2 [ 0.732810] mmc1: queuing unknown CIS tuple 0x80 (2 bytes) [ 0.734565] mmc1: queuing unknown CIS tuple 0x80 (3 bytes) [ 0.736272] mmc1: queuing unknown CIS tuple 0x80 (3 bytes) [ 0.739406] mmc1: queuing unknown CIS tuple 0x80 (7 bytes) [ 0.835608] EXT4-fs (mmcblk0p2): INFO: recovery required on readonly filesystem [ 0.835623] EXT4-fs (mmcblk0p2): write access will be enabled during recovery [ 0.843978] mmc1: new high speed SDIO card at address 0001 [ 2.638818] EXT4-fs (mmcblk0p2): orphan cleanup on readonly fs [ 2.639434] EXT4-fs (mmcblk0p2): 1 orphan inode deleted [ 2.639450] EXT4-fs (mmcblk0p2): recovery complete [ 2.906249] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null) [ 2.906361] VFS: Mounted root (ext4 filesystem) readonly on device 179:2. [ 2.907631] devtmpfs: mounted [ 2.909057] Freeing unused kernel memory: 396K [ 2.909067] This architecture does not have kernel memory protection. 
[ 3.409164] systemd[1]: System time before build time, advancing clock. [ 3.575080] NET: Registered protocol family 10 [ 3.590495] ip_tables: (C) 2000-2006 Netfilter Core Team [ 3.646370] systemd[1]: systemd 232 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN) [ 3.647561] systemd[1]: Detected architecture arm. [ 3.649295] systemd[1]: Set hostname to . [ 4.744148] systemd[1]: Configuration file /opt/aiy/io-mcu-firmware/aiy_io_permission.service is marked executable. Please remove executable permission bits. Proceeding anyway. [ 4.845581] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point. [ 4.846895] systemd[1]: Started Forward Password Requests to Wall Directory Watch. [ 4.847832] systemd[1]: Listening on Syslog Socket. [ 4.848788] systemd[1]: Listening on udev Control Socket. [ 4.849785] systemd[1]: Listening on Journal Socket. [ 4.850759] systemd[1]: Listening on /dev/initctl Compatibility Named Pipe. [ 5.166453] 20980000.usb supply vusb_d not found, using dummy regulator [ 5.166640] 20980000.usb supply vusb_a not found, using dummy regulator [ 5.622760] dwc2 20980000.usb: EPs: 8, dedicated fifos, 4080 entries in SPRAM [ 5.623847] dwc2 20980000.usb: DWC OTG Controller [ 5.623936] dwc2 20980000.usb: new USB bus registered, assigned bus number 1 [ 5.624044] dwc2 20980000.usb: irq 33, io mem 0x00000000 [ 5.624561] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002 [ 5.624584] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1 [ 5.624595] usb usb1: Product: DWC OTG Controller [ 5.624605] usb usb1: Manufacturer: Linux 4.9.59+ dwc2_hsotg [ 5.624616] usb usb1: SerialNumber: 20980000.usb [ 5.626072] hub 1-0:1.0: USB hub found [ 5.626186] hub 1-0:1.0: 1 port detected [ 5.689818] i2c /dev entries driver [ 5.714881] pwm_soft: loading out-of-tree module taints kernel. 
[ 5.716113] SoftPWM v0.1 initializing. [ 5.716130] Clock resolution is 1ns [ 5.716233] SoftPWM initialized. [ 6.062697] usb 1-1: new high-speed USB device number 2 using dwc2 [ 6.303230] usb 1-1: New USB device found, idVendor=1a40, idProduct=0101 [ 6.303259] usb 1-1: New USB device strings: Mfr=0, Product=1, SerialNumber=0 [ 6.303272] usb 1-1: Product: USB 2.0 Hub [ 6.304660] hub 1-1:1.0: USB hub found [ 6.304844] hub 1-1:1.0: 4 ports detected [ 6.632820] usb 1-1.3: new full-speed USB device number 3 using dwc2 [ 6.766519] usb 1-1.3: New USB device found, idVendor=046d, idProduct=c52b [ 6.766565] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0 [ 6.766578] usb 1-1.3: Product: USB Receiver [ 6.766589] usb 1-1.3: Manufacturer: Logitech [ 7.527500] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null) [ 7.840505] random: crng init done [ 8.286784] systemd-journald[93]: Received request to flush runtime journal from PID 1 [ 11.123374] logitech-djreceiver 0003:046D:C52B.0003: hiddev0,hidraw0: USB HID v1.11 Device [Logitech USB Receiver] on usb-20980000.usb-1.3/input2 [ 11.836986] gpiomem-bcm2835 20200000.gpiomem: Initialised: Registers at 0x20200000 [ 13.293007] brcmfmac: F1 signature read @0x18000000=0x1541a9a6 [ 13.343613] usbcore: registered new interface driver brcmfmac [ 13.755211] brcmfmac: Firmware version = wl0: Aug 7 2017 00:46:29 version 7.45.41.46 (r666254 CY) FWID 01-f8a78378 [ 13.771366] brcmfmac: brcmf_c_preinit_dcmds: CLM version = API: 12.2 Data: 7.11.15 Compiler: 1.24.2 ClmImport: 1.24.1 Creation: 2014-05-26 10:53:55 Inc Data: 9.10.41 Inc Compiler: 1.29.4 Inc ClmImport: 1.36.3 Creation: 2017-08-07 00:37:47 [ 19.545655] uart-pl011 20201000.serial: no DMA platform data [ 23.066133] using random self ethernet address [ 23.066157] using random host ethernet address [ 23.206184] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready [ 23.206229] brcmfmac: power management disabled [ 24.431912] Bluetooth: Core ver 2.22 [ 24.432115] NET: Registered 
protocol family 31 [ 24.432128] Bluetooth: HCI device and connection manager initialized [ 24.432163] Bluetooth: HCI socket layer initialized [ 24.432187] Bluetooth: L2CAP socket layer initialized [ 24.432267] Bluetooth: SCO socket layer initialized [ 24.566516] Bluetooth: HCI UART driver ver 2.3 [ 24.566542] Bluetooth: HCI UART protocol H4 registered [ 24.566550] Bluetooth: HCI UART protocol Three-wire (H5) registered [ 24.566841] Bluetooth: HCI UART protocol Broadcom registered [ 27.196455] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready [ 27.655333] Bluetooth: BNEP (Ethernet Emulation) ver 1.3 [ 27.655380] Bluetooth: BNEP filters: protocol multicast [ 27.655426] Bluetooth: BNEP socket layer initialized [ 28.185318] Bluetooth: RFCOMM TTY layer initialized [ 28.185389] Bluetooth: RFCOMM socket layer initialized [ 28.185435] Bluetooth: RFCOMM ver 1.11 [ 29.200689] usb0: HOST MAC 66:1e:0a:fd:6e:17 [ 29.202115] usb0: MAC 72:7a:d3:bc:8a:3d [ 29.202426] dwc2 20980000.usb: bound driver configfs-gadget [ 30.604354] input: Logitech K400 as /devices/platform/soc/20980000.usb/usb1/1-1/1-1.3/1-1.3:1.2/0003:046D:C52B.0003/0003:046D:4024.0004/input/input0 [ 31.147699] logitech-hidpp-device 0003:046D:4024.0004: input,hidraw1: USB HID v1.11 Keyboard [Logitech K400] on usb-20980000.usb-1.3:1 [ 31.243924] IPv6: ADDRCONF(NETDEV_UP): usb0: link is not ready [ 32.648433] aiy-vision spi0.0: Initializing [ 32.702201] aiy-vision spi0.0: Failed to bind reset GPIO [ 35.923713] aiy-io-i2c 1-0051: Setting board type vision [ 35.925095] aiy-io-i2c 1-0051: Driver loaded [ 35.926843] aiy-vision spi0.0: Initializing [ 35.943703] aiy-vision spi0.0: Failed to bind reset GPIO [ 36.436122] ktd202x 1-0030: Driver loaded for a ktd2026. 
[ 36.437819] aiy-vision spi0.0: Initializing [ 36.453802] aiy-vision spi0.0: Failed to bind reset GPIO [ 42.397701] pwm-aiy-io pwm-aiy-io: Driver loaded [ 42.399465] aiy-vision spi0.0: Initializing [ 42.422337] aiy-vision spi0.0: Failed to bind reset GPIO [ 42.772162] gpio-aiy-io gpio-aiy-io: Driver loaded [ 42.774079] aiy-vision spi0.0: Initializing [ 42.806350] aiy-vision spi0.0: Resetting myriad on probe [ 42.806394] aiy-vision spi0.0: Resetting myriad [ 43.864664] aiy-adc aiy-adc: Vision bonnet ADC configuration. [ 44.056871] aiy-adc aiy-adc: Driver loaded [ 44.582326] systemd[1]: Found device /dev/ttyGS0. [ 45.471692] aiy-vision spi0.0: Writing myriad firmware [ 52.501476] aiy-vision spi0.0: Myriad booting [ 52.728663] aiy-vision spi0.0: Myriad ready [ 55.487494] Adding 102396k swap on /var/swap. Priority:-1 extents:1 across:102396k SSFS [ 86.140200] fuse init (API version 7.26) [ 105.037818] Registered device pwm22 [ 105.125001] Starting timer (pulse). [ 105.125152] Stopping timer. [ 105.125442] Starting timer (period). [ 105.125745] Starting timer (pulse). [ 105.126037] Starting timer (period). [ 105.126317] Starting timer (pulse). [ 105.126592] Starting timer (period). [ 105.362886] Starting timer (pulse). [ 105.363487] Starting timer (period). [ 105.363878] Starting timer (pulse). [ 105.364312] Starting timer (period). [ 105.364642] Starting timer (pulse). [ 105.364956] Starting timer (period). [ 105.692825] Starting timer (pulse). [ 105.693125] Starting timer (period). [ 105.693478] Starting timer (pulse). [ 105.693774] Starting timer (period). [ 105.694075] Starting timer (pulse). [ 105.694357] Starting timer (period). [ 105.875738] Stopping timer. [ 105.901639] Unregistered device pwm22 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... kernel:[ 140.161145] Internal error: Oops: 17 [#1] ARM Message from syslogd@raspberrypi at Jun 27 05:35:35 ... kernel:[ 140.161512] Process python3 (pid: 240, stack limit = 0xd5348188)

Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ...) kernel:[ 140.161710] 9ee0: 00000008 00000008 d5348000 00000000 d5349f7c d5349f00 c01575f0 bf4c2fdc packet_write_wait: Connection to 192.168.0.104 port 22: Broken pipe kernel:[ 140.161725] 9f00: d5349f2c d5349f10 c005ec28 c0061220 00000020 00000080 d701db40 d701db50 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... 60000113 d7c78564 0000001f c0914e10 00000000 kernel:[ 140.161740] 9f20: d5349f44 d5349f30 c005ec70 c005ebdc 00144311 c016211c d5178480 00c00010 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... d5349df8 c00edfe8 c00ec1d0 00000619 ceebab80 kernel:[ 140.161755] 9f40: ae214008 becf9bd4 c0108903 00000008 d5349f6c cee0edc1 ae214008 cee0edc0 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... 00000000 ceebab80 da6e8000 00000001 00000018 kernel:[ 140.161770] 9f60: c0108903 00000008 d5348000 00000000 d5349fa4 d5349f80 c0157d6c c0157560 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... d5349e38 c00ee078 c00eddd0 d8804000 000006b4 kernel:[ 140.161785] 9f80: becf9bd4 00c00010 00378000 becf9bd4 00000036 c000ffc4 00000000 d5349fa8 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... c00ee034 ceca98c0 00000000 000005d6 ae214018 kernel:[ 140.161801] 9fa0: c000fe40 c0157d34 00c00010 00378000 00000008 c0108903 ae214008 becf9bd4 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... c01250c8 d5349e94 d5349e80 bf4c1c20 c01251f8 kernel:[ 140.161816] 9fc0: 00c00010 00378000 becf9bd4 00000036 00000008 c0108903 ae214008 becf9bd8 Message from syslogd@raspberrypi at Jun 27 05:35:35 ... 
Message from syslogd@raspberrypi at Jun 27 05:35:35 ... d5349e98 bf4c34cc bf4c1bb0 00000000 d6b28e70rev Page M-\ First Line M-W WhereIs Next^^ Mark Text M-} Indent Text kernel:[ 140.161830] 9fe0: 003782fc becf9bc4 000362d4 b6d9f8ac 80000010 00000008 6e616d72 005f5f74ext Page M-/ Last Line M-] To Bracket M-^ Copy Text M-{ Unindent Text Message from syslogd@raspberrypi at Jun 27 05:35:35 ... Message from syslogd@raspberrypi at Jun 27 05:35:35 ... 00000018 00003a98 00c00010 0000007e 00000000 kernel:[ 140.162219] Code: e58c0014 e7842003 e3a03034 e51b2040 (e0212193) Message from syslogd@raspberrypi at Jun 27 05:35:35 ... kernel:[ 140.161690] 9ec0: d5225f60 c004ff58 d5349ec8 d5349ec8 d5349f0c ae214008 d0062028 cee0edc0

burtbick commented 6 years ago

Check out my comment to #418 in the forum.

I was able to resolve the same type of problem, at least for now, by explicitly setting the CPU speed in config.txt to less than 1 GHz. Note: I had previously tried the suggested over_voltage=4 "fix" with no success.

The joy app is working and I'm not getting any lockups or kernel oops failures.

Hopefully this will help while Google tries to figure out whether it is a bad batch of Pi W boards or the bonnet that doesn't like to run at 1 GHz.
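For anyone who wants to confirm that the lower clock really took effect after a reboot, here is a minimal sketch. It assumes the cap shows up in the standard Linux cpufreq sysfs node `scaling_max_freq` (reported in kHz); the helper names `parse_khz` and `is_capped` are made up for illustration and are not part of the AIY software.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs node; values are reported in kHz.
# Assumption: the config.txt cap is reflected here on the Pi.
MAX_FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq")

ONE_GHZ_KHZ = 1000000


def parse_khz(raw):
    """Convert the sysfs text (digits plus a trailing newline) to an int in kHz."""
    return int(raw.strip())


def is_capped(khz, limit_khz=ONE_GHZ_KHZ):
    """Return True when the maximum clock is strictly below the limit."""
    return khz < limit_khz


if __name__ == "__main__":
    if MAX_FREQ_PATH.exists():
        khz = parse_khz(MAX_FREQ_PATH.read_text())
        print("max CPU clock: {:.0f} MHz, below 1 GHz: {}".format(
            khz / 1000, is_capped(khz)))
    else:
        print("{} not found; cpufreq may not be enabled".format(MAX_FREQ_PATH))
```

If the script still reports 1000 MHz, the config.txt change was probably not picked up.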

dmitriykovalev commented 6 years ago

@burtbick Thank you for the detailed comment! VisionBonnet has its own SoC and clock source, so most likely that's a low-level Pi issue (the root cause is still unclear). Usually everything works just fine, but some boards have issues. By the way, did you try over_voltage=4 instead of changing the CPU frequency?

burtbick commented 6 years ago

Thanks Dmitriy,

Yes, I tried the over_voltage=4 setting first. That didn't make ANY difference at all. I also tried adding over_voltage_min=4 as was suggested in another thread and that also didn't make any difference.

The only thing that seems to help is setting the CPU speed to less than 1 GHz. Note that the Pi W seems to keep running just fine at 1 GHz with other applications. It is only when the vision-related applications are running that it starts choking.

If I kill the joy demo, I can run the Pi W at full speed and it seems to work with no kernel oops lockups. With the Pi speed dropped, the vision demos appear to work just fine, but I haven't run them for more than an hour. With everything back in the closed box and a video app running, the temperature appears to stabilize at 62.7 degrees C.
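A temperature reading like the one above can be taken without extra tools. This is a small sketch that assumes the usual Linux thermal sysfs node `/sys/class/thermal/thermal_zone0/temp`, which reports millidegrees Celsius; the function name `millideg_to_celsius` is hypothetical.

```python
from pathlib import Path

# Usual Linux thermal sysfs node; the value is in millidegrees Celsius.
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def millideg_to_celsius(raw):
    """Convert the sysfs text (for example 62700 plus a newline) to degrees C."""
    return int(raw.strip()) / 1000.0


if __name__ == "__main__":
    if THERMAL_PATH.exists():
        temp_c = millideg_to_celsius(THERMAL_PATH.read_text())
        print("SoC temperature: {:.1f} C".format(temp_c))
    else:
        print("{} not found on this system".format(THERMAL_PATH))
```

Running it in a loop while a vision demo is active shows whether the temperature keeps climbing or levels off.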

When I get some more time I'll try swapping in a different Pi W and see if that makes any difference. And I plan on also trying a different SD card to see if that makes any difference.

lufiaraujo commented 6 years ago

My kit was freezing too, but right after boot, as soon as I opened a terminal or anything else. Here's what I did:

  1. Shut down the Pi
  2. Opened the box and disconnected the camera from the bonnet
  3. Booted up
  4. Added the "over_voltage" commands to /boot/config.txt:

sudo nano /boot/config.txt

Add the following lines to the end of the file:

over_voltage=4
over_voltage_min=4

Ctrl+O (save); Ctrl+X (exit)

  5. Updated the distro

sudo apt-get dist-upgrade (takes a little forever to download and install everything)

  6. Shut down, reconnect the camera and put everything back in the box.

Now it's flawless!
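If you'd rather not edit the file by hand, the config.txt step can be sketched as a small text transformation. This is only an illustration under a few assumptions: the function name `ensure_settings` and the `AIY_SETTINGS` list are made up here, the code works on the file's text rather than writing to /boot/config.txt itself, and it ignores conditional sections such as `[pi0]`, so check the result before saving it.

```python
def ensure_settings(config_text, settings):
    """Return config_text with every (key, value) pair present exactly once.

    settings is a list of (key, value) pairs so the output order is predictable.
    Commented-out lines are left untouched.
    """
    wanted = dict(settings)
    written = set()
    out_lines = []
    for line in config_text.splitlines():
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        is_setting = "=" in stripped and not stripped.startswith("#")
        if is_setting and key in wanted:
            if key not in written:
                # Replace the first occurrence with the desired value.
                out_lines.append("{}={}".format(key, wanted[key]))
                written.add(key)
            # Later duplicates of the same key are dropped.
        else:
            out_lines.append(line)
    # Append any setting that was not already in the file.
    for key, value in settings:
        if key not in written:
            out_lines.append("{}={}".format(key, value))
            written.add(key)
    return "\n".join(out_lines) + "\n"


# The two values suggested in this thread.
AIY_SETTINGS = [("over_voltage", "4"), ("over_voltage_min", "4")]

if __name__ == "__main__":
    sample = "dtparam=audio=on\nover_voltage=2\n"
    print(ensure_settings(sample, AIY_SETTINGS), end="")
```

Running the function twice gives the same result, so repeating the step does not pile up duplicate lines.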

dmitriykovalev commented 6 years ago

@lufiaraujo Right, that solution works and is already described in https://github.com/google/aiyprojects-raspbian/blob/aiyprojects/docs/vision.md.

@burtbick Can you try to decrease CPU frequency (check #418 for details)? Looks like that also works sometimes.

seth-johnson-sp commented 5 years ago

Using the image "aiyprojects-2018-11-16.img.xz", I can confirm that the original problem of the OS crashing is solved by adding the following to /boot/config.txt:

over_voltage=4
over_voltage_min=4

Do not run the upgrade afterwards. Don't run the upgrade before, either. If you upgrade Raspbian, you will lose the AIY software functionality on this AIY disk image.
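To check after a reboot that the firmware actually picked up the two config.txt values, something like the sketch below may help. It relies on the Raspberry Pi `vcgencmd get_config` command and assumes its output has the form `name=value`; the helper names `parse_config_line` and `get_firmware_setting` are made up for illustration.

```python
import subprocess


def parse_config_line(line):
    """Split a name=value line into a (name, value) tuple."""
    name, _, value = line.strip().partition("=")
    return name, value


def get_firmware_setting(name):
    """Ask the firmware for one setting via vcgencmd and return its value."""
    output = subprocess.check_output(
        ["vcgencmd", "get_config", name], universal_newlines=True)
    return parse_config_line(output)[1]


if __name__ == "__main__":
    for setting in ("over_voltage", "over_voltage_min"):
        try:
            print("{} = {}".format(setting, get_firmware_setting(setting)))
        except (OSError, subprocess.CalledProcessError) as exc:
            # vcgencmd is missing or refused the query.
            print("could not query {}: {}".format(setting, exc))
```

Both values should come back as 4 if the lines in /boot/config.txt were applied.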