cimryan / teslausb

Steps and scripts for turning a Raspberry Pi into a useful USB drive for a Tesla
MIT License
569 stars 483 forks source link

Drive disconnecting/disappearing after several hours of working fine - ideas? #128

Open jspv opened 5 years ago

jspv commented 5 years ago

On 2019.5 with therealmarcone's mods, everything working as expected until it doesn't. Two days in a row when I get to my car after being in Sentry mode for the day, the drive is not connected to the Tesla (no camera icon) and no recording is happening. archiveloop is still running though - exact same session, the pi never lost power or rebooted. When I get home it connects to WiFi and uploads files and after upload, the drive reconnects fine. By looking at the files, I can see that at some point in the day, the recordings just stopped.

Any ideas on how to debug this?

Merckle commented 5 years ago

Yeah, same here; I was wondering what I was doing wrong. I wonder if someone could write a cron job to make sure it's running properly at all times and, if the drive has been virtually unmounted, to remount it.

ken830 commented 5 years ago

I did the workaround as well and it appeared to work on 2019.5.15 when I plug it in to test. I could access music from the drive and also can record to the drive, but I never see it copy to my NAS. The next day, I go back out to the car and the drive doesn't appear to work anymore. No music no cam recording (not even the X icon)

I rebuilt the drive from scratch with the workaround from the very beginning and plug it in to test. It once again seemed to work for music and cam recording, but I still don't see it copying anything to the NAS. I didn't have any more time to test that same day, but the next day, the drive once again doesn't appear to work anymore just as before -- no music or cam recording.

Later the same day, I got a FW update to 2019.8.2, and after the firmware update I noticed that the Pi didn't have any power -- no LED light. I'm not sure if this was the same behavior with 2019.5.15 or not. I plugged my phone into the same port and there was no power for the phone either. I switched to the other port and the Pi's LED lit up, but it didn't work -- no music, no cam. After a while, I noticed the Pi's LED was off again. I tried plugging my phone into this port as well, and now I had no power at either USB port. I rebooted the MCU and power came back to both, but the drive still doesn't work.

I haven't any time to debug any further, but the Pi's two drives appear fine when I plug into my Windows machine at home.

firedfly commented 5 years ago

@ken830 - I've seen the same issue with power on the pi. When the car boots, power is provided to the pi, but doesn't appear stable and the pi powers off after a minute. My phone will no longer charge on that port until I reboot the MCU.

I've discovered a workaround. I'm now powering the pi via the cigarette lighter outlet and using the front USB port for communication with the MCU. This setup has worked well this afternoon.

jspv commented 5 years ago

@ken830 - do you know if the lighter outlet stays powered when the car is in Sentry mode?

ken830 commented 5 years ago

It's too bad there's no clean way to route a cable from the cig lighter outlet to the front USB compartment.

I tried to test the cig lighter power with Sentry mode, but power was still present at the cig outlet and front USB when I manually locked the car. Power was still present when I turned off Bluetooth to simulate a walk-away lock. I didn't have enough time to wait for the car to sleep. I also tried toggling Sentry mode on and off, but power remained on for both USB and cig outlet. Not sure if this is new behavior or not on this fw release. I expected it to follow the behavior of my Model S (really early 2013), which cuts power to the cig outlet as soon as the doors are closed with no one in the driver's seat.

firedfly commented 5 years ago

I just ran into the issue again where the dashcam stops recording and there is no dashcam icon on screen in my model 3. As mentioned before, I'm powering the pi via the cigarette lighter to ensure sufficient power with the firmware version I'm on. The dashcam stopped while I was driving...after 35 minutes or so, I'd guess.

I ssh'd into the pi and found this information in dmesg. Perhaps it will mean something to someone here.

[ 31.087461] g_mass_storage gadget: high-speed config #1: Linux File-Backed Storage
[ 146.408129] CIFS VFS: Server 192.168.10.18 has not responded in 120 seconds. Reconnecting...
[ 146.419396] CIFS VFS: Free previous auth_key.response = da5c4900
[ 291.287162] random: crng init done
[ 291.287181] random: 7 urandom warning(s) missed due to ratelimiting
[ 3222.205121] CIFS VFS: Server 192.168.10.18 has not responded in 120 seconds. Reconnecting...
[ 4089.093677] ------------[ cut here ]------------
[ 4089.093826] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:300 dwc2_hsotg_init_fifo+0x1ac/0x1d0 [dwc2]
[ 4089.093833] Modules linked in: g_mass_storage usb_f_mass_storage libcomposite cmac sha256_generic arc4 ecb md4 md5 hmac nls_utf8 cifs ccm bnep hci_uart btbcm serdev bluetooth ecdh_generic brcmfmac brcmutil snd_bcm2835(C) snd_pcm cfg80211 snd_timer rfkill snd fixed uio_pdrv_genirq uio dwc2 udc_core ip_tables x_tables ipv6
[ 4089.093943] CPU: 0 PID: 0 Comm: swapper Tainted: G C 4.14.98+ #1200
[ 4089.093947] Hardware name: BCM2835
[ 4089.093995] [<c0016420>] (unwind_backtrace) from [<c0013d40>] (show_stack+0x20/0x24)
[ 4089.094020] [<c0013d40>] (show_stack) from [<c0638de4>] (dump_stack+0x20/0x28)
[ 4089.094041] [<c0638de4>] (dump_stack) from [<c0021ecc>] (__warn+0xe4/0x10c)
[ 4089.094057] [<c0021ecc>] (__warn) from [<c0021fc0>] (warn_slowpath_null+0x30/0x38)
[ 4089.094135] [<c0021fc0>] (warn_slowpath_null) from [<bf0b1aa8>] (dwc2_hsotg_init_fifo+0x1ac/0x1d0 [dwc2])
[ 4089.094262] [<bf0b1aa8>] (dwc2_hsotg_init_fifo [dwc2]) from [<bf0b44b4>] (dwc2_hsotg_core_init_disconnected+0x88/0x3ec [dwc2])
[ 4089.094378] [<bf0b44b4>] (dwc2_hsotg_core_init_disconnected [dwc2]) from [<bf0b4f0c>] (dwc2_hsotg_irq+0x6f4/0x7c8 [dwc2])
[ 4089.094447] [<bf0b4f0c>] (dwc2_hsotg_irq [dwc2]) from [<c006230c>] (__handle_irq_event_percpu+0x94/0x1c8)
[ 4089.094460] [<c006230c>] (__handle_irq_event_percpu) from [<c006246c>] (handle_irq_event_percpu+0x2c/0x68)
[ 4089.094472] [<c006246c>] (handle_irq_event_percpu) from [<c00624e0>] (handle_irq_event+0x38/0x4c)
[ 4089.094487] [<c00624e0>] (handle_irq_event) from [<c0065e10>] (handle_level_irq+0xa0/0x114)
[ 4089.094512] [<c0065e10>] (handle_level_irq) from [<c0061590>] (generic_handle_irq+0x30/0x44)
[ 4089.094528] [<c0061590>] (generic_handle_irq) from [<c0061af0>] (__handle_domain_irq+0x58/0xb8)
[ 4089.094543] [<c0061af0>] (__handle_domain_irq) from [<c0009418>] (bcm2835_handle_irq+0x28/0x48)
[ 4089.094562] [<c0009418>] (bcm2835_handle_irq) from [<c0652cfc>] (__irq_svc+0x5c/0x7c)
[ 4089.094569] Exception stack(0xc0943ef8 to 0xc0943f40)
[ 4089.094576] 3ee0: 00000000 00000000
[ 4089.094588] 3f00: ffffffff c0945414 c0942000 c09450a4 c09cddc6 c0945020 c09dae00 c091ea28
[ 4089.094599] 3f20: dbfff9e0 c0943f54 c0943f48 c0943f48 c001075c c0010760 60000013 ffffffff
[ 4089.094617] [<c0652cfc>] (__irq_svc) from [<c0010760>] (arch_cpu_idle+0x30/0x40)
[ 4089.094632] [<c0010760>] (arch_cpu_idle) from [<c0652bac>] (default_idle_call+0x34/0x48)
[ 4089.094644] [<c0652bac>] (default_idle_call) from [<c0053cf8>] (do_idle+0x8c/0xec)
[ 4089.094656] [<c0053cf8>] (do_idle) from [<c0053fd4>] (cpu_startup_entry+0x1c/0x20)
[ 4089.094678] [<c0053fd4>] (cpu_startup_entry) from [<c064d5bc>] (rest_init+0x7c/0x9c)
[ 4089.094698] [<c064d5bc>] (rest_init) from [<c08d4d48>] (start_kernel+0x358/0x3c8)
[ 4089.094706] ---[ end trace dec6fc0e363ad91e ]---
[ 4089.097916] dwc2 20980000.usb: dwc2_hsotg_ep_stop_xfr: timeout GINTSTS.GOUTNAKEFF
[ 4089.098036] dwc2 20980000.usb: dwc2_hsotg_ep_stop_xfr: timeout DOEPCTL.EPDisable
[ 4089.100769] dwc2 20980000.usb: new device is high-speed
[ 4094.162756] dwc2 20980000.usb: new device is high-speed
[ 4109.326872] dwc2 20980000.usb: new device is high-speed
[ 4109.500913] dwc2 20980000.usb: new device is high-speed

The last line then repeats every few seconds.

firedfly commented 5 years ago

I manually removed the g_mass_storage module and re-added it. Once it was re-added, the dashcam in the model 3 started working again. Perhaps there is something in the dmesg output I posted that can be listened for to reload the module?

EDIT: I see reports of other, non Tesla, applications of the pi having the same issue. Sounds like a newer kernel version may fix the problem. I'll try to investigate later today if I get the chance. https://github.com/raspberrypi/linux/issues/2796
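For anyone who wants to try the "listen for something in dmesg" idea in the meantime, here is a rough sketch, not a tested fix. It looks for the `dwc2_hsotg_ep_stop_xfr: timeout` lines from the trace above and reloads the module when it sees one. The function names are made up for this example, and it assumes g_mass_storage gets its options (backing files etc.) from modprobe configuration, so that a bare modprobe is enough; adjust it if your setup passes them on the command line.

```shell
#!/bin/bash
# Sketch only: reload the USB gadget when dmesg shows the dwc2 endpoint
# timeouts seen in the trace above. Function names are made up for this example.

# Reads kernel log text on stdin; succeeds if it contains a dwc2 endpoint timeout.
gadget_looks_wedged() {
  grep -q 'dwc2_hsotg_ep_stop_xfr: timeout'
}

# Unloads and reloads g_mass_storage. Assumes the module options (backing
# files etc.) come from modprobe configuration rather than the command line.
reload_gadget() {
  modprobe -r g_mass_storage && modprobe g_mass_storage
}

# Checks the kernel log and reloads the gadget if needed. The ring buffer is
# cleared after a successful reload so the same trace does not trigger
# another reload on the next check (needs root).
check_and_reload() {
  if dmesg | gadget_looks_wedged; then
    reload_gadget && dmesg --clear
  fi
}
```

Calling `check_and_reload` every minute or so, from root's crontab or from a loop, would be one way to run it. Note that clearing the ring buffer throws away log history, so save the dmesg output first if you still want it for debugging.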

firedfly commented 5 years ago

I've updated my pi to a 4.19.29 kernel by using rpi-update. I'll have sentry mode on quite a bit over the next few days and will report back if the pi keeps recording the entire time.

WARNING: rpi-update will update the pi to an unstable, testing version of the kernel. Use this at your own risk. You can find more information here: https://www.raspberrypi.org/documentation/linux/kernel/updating.md

ken830 commented 5 years ago

I tested mine with the cig-outlet workaround on this morning's commute and it was working great for over an hour.

ScrawnyB commented 5 years ago

Following all those mods from therealmarcone, I'm getting fsck errors on the mount points. I actually am doing edits to my scripts to fsck on /dev/loop0 (or loop1 when both are mounted) rather than /mnt/cam and /mnt/music. This appears to be related to the repartitioning/FS change of formatting the whole .bin file as fat32 versus partitioning then formatting it.

I noticed quite a few of these in dmesg:

[ 26.189170] FAT-fs (loop0): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.

Still could be stemming from the power issues, but I'm looking to find a way to get around them more cleanly. I think the old script just did a better job of cleaning up the Tesla's unclean dismounting mess.

firedfly commented 5 years ago

My car has been in sentry mode for about 17 hours. Went out to drive the car and the dashcam was still working. This is the first time for me that the dashcam worked after a few hours of sentry mode running. So, I think the kernel update I mentioned has solved my dashcam issue.

Merckle commented 5 years ago

I'll have to try that when I get home. I know when it ran for some time it still didn't upload what was there already when it was trading.

I tried to find the dir for stored vids but didn't see it.

jspv commented 5 years ago

Have tried maintaining solid power via 12v outlet and again with a portable battery pack. No change in behavior, the filesystems still disconnect from the car at some point, generally 1-6 hours later. Made it from 7:00am to noon yesterday in Sentry mode, which was a record. Just updated to 4.19.30 and enabled persistent journaling, will report back.

ScrawnyB commented 5 years ago

So my car stopped recognizing my RasPi altogether even after the fixes I was adding into it. I even had to hard-reboot my car because both front USB ports stopped giving power! Anyways, strange note - plugging it into a USB 2.0 hub allowed it to show up when nothing else did. Has anyone tried this approach? I'm wondering if the car wants the device detected in a certain amount of time, and if it's not, it shuts off/resets power to that port entirely? Jury's still out on this one, but I'm going to see if it continues to work this way for a few days and I'll report back.

jspv commented 5 years ago

Just updated to 4.19.30 and enabled persistent journaling, will report back.

Report is positive. I've been running for three days now without incident, including several uses of Sentry mode. 4.19.30 appears to fix this for me. Note: I'm on 2019.5.15; I have not been updated to 2019.8.3, which appears to introduce USB power issues. Will leave this open until 4.19.x goes stable.

lolento commented 5 years ago

@jspv, yes, I also had the random disconnect issue, but after upgrading to the dev kernel it's been working fine.

I am on 8.3, no USB power issue for me.

mattster98 commented 5 years ago

Is it possible to roll back to an older kernel version to reliably know if mine is kernel-related?

AngusThermo-Pyle commented 5 years ago

Following all those mods from therealmarcone, I'm getting fsck errors on the mount points. I actually am doing edits to my scripts to fsck on /dev/loop0 (or loop1 when both are mounted) rather than /mnt/cam and /mnt/music. This appears to be related to the repartitioning/FS change of formatting the whole .bin file as fat32 versus partitioning then formatting it.

I found this as well. therealmarcone has a PR up with corrections to pass /dev/loop to fsck, and it works well on my setup and 2019.8.4 so far. With just the new partitioning, fsck would fail to run, which meant that after the first dirty power off, you'd lose access to the cam/music drives on the car. Depending on your archive method and firmware version, you may also see issues with the new/old paths PR, which isn't working for all configurations yet. In general, you can work around this issue by creating dummy paths for both new and old formats to avoid errors. Remember that any fatal errors in the archiving state will leave the drives unconnected until the next reboot, so verify that archiving is running without crashing.
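To give an idea of what "pass /dev/loop to fsck" means in practice, here is a minimal sketch. It is not the code from that PR; the function names are made up for this example, and the start sector in the usage note is only a placeholder.

```shell
#!/bin/bash
# Sketch only: run fsck against a loop device for the backing file instead of
# against the mount point. Function names are made up for this example.

# Converts a partition's start sector to a byte offset (512-byte sectors
# unless a different sector size is given as the second argument).
sectors_to_bytes() {
  echo $(( $1 * ${2:-512} ))
}

# Attaches the backing file at the given byte offset, lets fsck.vfat repair
# the FAT filesystem automatically, then detaches the loop device again.
# Returns fsck.vfat's exit status. Needs root.
fsck_backing_file() {
  local file="$1" offset="${2:-0}" loopdev rc
  loopdev=$(losetup --find --show --offset "$offset" "$file") || return 1
  fsck.vfat -a "$loopdev"
  rc=$?
  losetup --detach "$loopdev"
  return "$rc"
}
```

Usage would look something like `fsck_backing_file /backingfiles/cam_disk.bin "$(sectors_to_bytes 2048)"`, where 2048 stands in for whatever start sector `fdisk -l` reports for the partition inside your .bin file (use offset 0 if the whole file is formatted without a partition table). Only run it while the drive is not connected to the car.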

lapean111 commented 5 years ago

How do I apply the /dev/loop fix ? I am running 2019.8.5 and the Cam doesn't show up in car, but the folder does show up in windows.

moorecp commented 5 years ago

FWIW, I'm running @marcone's build on 2019.8.5. The only way I could keep it working after updating from 2019.5.15 was to plug in a USB cable to both ports on my Pi. I was seeing behavior like in #132, but once I did that, everything started working as expected.

jaysauls commented 5 years ago

I've got an X and a 3. Both are currently on 2019.8.5

I've been using TeslaUsb successfully since December. I made @marcone's changes to update to using a partitioned drive with FAT file system.

On the Model 3, everything's working great.

The X hasn't worked for a couple of months; I spent some time today to figure out what's wrong with the X. On the Model X, it appears that the X is cutting power to the USB port every 35 seconds or so, causing the PI to reboot every time, obviously. I've tried both my Pi's on each car. Both Pi's work fine in the Model 3, neither of them work in the Model X.

The Model X has white lights around the USB port. If I watch the light after plugging in the Pi, the light around the USB port flashes off and on at 35 seconds after I plug in the PI directly to the port.

My guess is that the MCU is interrogating the port to try to find a file system, and it's taking the Pi too long to boot up and present the Music / CAM drives before the MCU gives up and power-cycles the port. This appears to be new behavior in 2019.8.x. I have no idea why the behavior is different between the X and the 3, but it definitely is. My X is relatively new (Aug 2018 build), so I assume it has the latest available hardware revision of the MCU.

I tried updating the pi to rpi-4.19.y using rpi-update, same failure.

It's not immediately clear what's wrong, the last messages in dmesg on each boot are:

[   44.624206] FAT-fs (loop0): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[   47.528652] Mass Storage Function, version: 2009/09/11
[   47.528673] LUN: removable file: (no medium)
[   47.528829] LUN: removable file: /backingfiles/music_disk.bin
[   47.528924] LUN: removable file: /backingfiles/cam_disk.bin
[   47.528933] Number of LUNs=2
[   47.543804] g_mass_storage gadget: Mass Storage Gadget, version: 2009/09/11
[   47.543822] g_mass_storage gadget: g_mass_storage ready
[   47.543838] dwc2 20980000.usb: bound driver g_mass_storage
[   47.702271] dwc2 20980000.usb: new device is high-speed
[   47.735333] dwc2 20980000.usb: new address 32
[   47.768378] g_mass_storage gadget: high-speed config #1: Linux File-Backed Storage

I guess I'll give up and use the cigarette lighter trick to power the Pi.
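A quick way to compare the boot time against the roughly 35 second cutoff is to pull the timestamp of the "g_mass_storage ready" line out of dmesg. This is only a sketch and the function name is made up for this example.

```shell
#!/bin/bash
# Sketch only: report how many seconds after kernel start the mass storage
# gadget became ready, based on dmesg timestamps. The function name is made up.

# Reads dmesg text on stdin and prints the timestamp (seconds since kernel
# start) of the first "g_mass_storage ready" line, or nothing if there is none.
gadget_ready_time() {
  sed -n 's/^\[ *\([0-9][0-9.]*\)] .*g_mass_storage ready.*/\1/p' | head -n 1
}
```

Run it as `dmesg | gadget_ready_time`. For the log above it prints 47.543822, which is about 12 seconds past the point where the port light flashes. The kernel timestamps start when the kernel starts, not when the port gets power, so the real delay as seen by the car is probably a bit longer still.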

mattster98 commented 5 years ago

To test your hypothesis maybe altering the script to get to connecting the usb drive asap might work? It seems to connect to the archive first before connecting usb which takes several seconds.

jaysauls commented 5 years ago

Just tried that. Didn't seem to help.

I modified the main body of archiveloop script to:


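# Export the helper functions so that child bash processes can call them.
export -f mount_mountpoint
export -f ensure_mountpoint_is_mounted
export -f retry
export -f ensure_mountpoint_is_mounted_with_retry
export -f log

log "Starting..."
# Changed order: fix up and present the drives to the car right away,
# before doing anything with the archive server.
mount_and_fix_errors_in_files
connect_usb_drives_to_host

ScrawnyB commented 5 years ago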

@jaysauls the USB hub trick (connect the RasPi to a hub, so the port stays powered) seems to work around the issue you are mentioning. Feel free to confirm this in your X. The behavior you are describing really doesn't sound any different from what I was seeing in my 3.

lapean111 commented 5 years ago

Can someone tell me exactly where these scripts are located so I can take a look at these edits? Sorry, complete noob.

marcone commented 5 years ago

@lapean111 They're in /root/bin/

lapean111 commented 5 years ago

I have now been stable for 2 days. I am using a cig port adaptor to power the pi, and a data only USB cable I made by covering the 5volt lead with tape. Annoying to have 2 cables, but it has been 100% stable for 2 days.

schmug commented 5 years ago

I've been just dealing with disconnects almost every time I drive my car, but I've decided it is time to try and fix it. Should I start from scratch? I added the responsive file manager so I could easily run the reconnect, but now it isn't reconnecting. I'm worried I have SD corruption. Is there a start-to-finish guide?

*** Backing up modules 4.14.98+
#############################################################
WARNING: This update bumps to rpi-4.19.y linux tree
Be aware there could be compatibility issues with some drivers
Discussion here: https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=224931
##############################################################
Would you like to proceed? (y/N)
*** Downloading specific firmware revision (this will take a few minutes)
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 168 0 168 0 0 436 0 --:--:-- --:--:-- --:--:-- 437
100 58.5M 100 58.5M 0 0 1357k 0 0:00:44 0:00:44 --:--:-- 1299k
*** Updating firmware
rm: cannot remove '/boot/start_cd.elf': Read-only file system
rm: cannot remove '/boot/start_db.elf': Read-only file system
rm: cannot remove '/boot/start.elf': Read-only file system
rm: cannot remove '/boot/start_x.elf': Read-only file system

marcone commented 5 years ago

@schmug looks like you forgot to remount boot and root as read-write. Run /root/bin/remountfs_rw before running rpi-update (or do it manually using mount -o remount,rw /boot)
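If you want a guard against forgetting this, something along these lines could check the mount state first. It is only a sketch: the function names are made up for this example, and it just reads /proc/mounts and does not remount anything itself.

```shell
#!/bin/bash
# Sketch only: check that / and /boot are writable before running rpi-update.
# Function names are made up for this example.

# Reads /proc/mounts-style text on stdin; succeeds if the last entry for the
# given mount point lists "rw" among its mount options.
is_mounted_rw() {
  awk -v mp="$1" '
    $2 == mp { opts = $4 }
    END { if (("," opts ",") ~ /,rw,/) exit 0; exit 1 }
  '
}

# Refuses to start rpi-update while either filesystem is still read-only.
guarded_rpi_update() {
  local mp
  for mp in / /boot; do
    if ! is_mounted_rw "$mp" < /proc/mounts; then
      echo "$mp is read-only; run /root/bin/remountfs_rw first" >&2
      return 1
    fi
  done
  rpi-update
}
```

Call `guarded_rpi_update` as root instead of calling rpi-update directly; it stops with a message instead of failing halfway through with "Read-only file system" errors.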

schmug commented 5 years ago

@schmug looks like you forgot to remount boot and root as read-write. Run /root/bin/remountfs_rw before running rpi-update (or do it manually using mount -o remount,rw /boot)

Wow... and I swore I had done that lol. thanks

firedfly commented 5 years ago

I just got an update to 2019.16.2 and noticed the TeslaCam Raspberry Pi is disconnecting again. If I power cycle the Raspberry Pi, the dashcam will start recording again (for maybe 10 minutes).

Is anyone else seeing these issues in 2019.16.2?

cgalpin commented 5 years ago

I manually removed the g_mass_storage module and re-added it. Once it was re-added, the dashcam in the model 3 started working again. Perhaps there is something in the dmesg output I posted that can be listened for to reload the module?

EDIT: I see reports of other, non Tesla, applications of the pi having the same issue. Sounds like a newer kernel version may fix the problem. I'll try to investigate later today if I get the chance. raspberrypi/linux#2796

FWIW, it was being used in a tesla too :)