Closed joaofl closed 8 years ago
Sometimes I get what seems to be some bad blocks:
[ 5004.907560] [c4] sd 0:0:0:0: [sda] Unhandled sense code
[ 5004.907581] [c4] sd 0:0:0:0: [sda]
[ 5004.907594] Result: hostbyte=0x00 driverbyte=0x08
[ 5004.907610] [c4] sd 0:0:0:0: [sda]
[ 5004.907622] Sense Key : 0x3 [current]
[ 5004.907650] [c4] sd 0:0:0:0: [sda]
[ 5004.907661] ASC=0x11 ASCQ=0x0
[ 5004.907681] [c4] sd 0:0:0:0: [sda] CDB:
[ 5004.907692] cdb[0]=0x28: 28 00 0f 7d 00 00 00 00 80 00
[ 5004.907784] [c4] end_request: critical target error, dev sda, sector 259850240
[ 5152.019254] [c0] usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd
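The CDB bytes in the log above can be decoded by hand to cross-check the failing sector. A minimal sketch, assuming the standard SCSI READ(10) layout (opcode 0x28, 4-byte big-endian LBA at bytes 2 to 5, 2-byte transfer length at bytes 7 to 8); the function name `decode_read10` is made up for illustration:

```python
def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        # Logical block address: bytes 2..5, big-endian
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Number of blocks requested: bytes 7..8, big-endian
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


if __name__ == "__main__":
    # CDB copied from the kernel log above
    print(decode_read10("28 00 0f 7d 00 00 00 00 80 00"))
```

The decoded LBA is 259850240, which matches the sector in the `end_request` line, and the request covers 128 blocks. In the SCSI sense code tables, sense key 0x3 is MEDIUM ERROR and ASC 0x11 / ASCQ 0x00 is "unrecovered read error", so the drive itself reported that it could not read those blocks.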
@joaofl
whenever it gets stressed, sporadically resets, together with an audible "tick"
Sounds like power requirements for the USB drive are not being met at load:
They claim to have fixed the issue on kernel version 3.17. I wonder if there is a fix for that, or means to upgrade the kernel to get that fixed. I'm afraid I'll lose data one of these days.
I'm compiling the 4.7 tobetter kernel now: http://odroid.com/dokuwiki/doku.php?id=en:xu4_building_kernel https://github.com/tobetter/linux/tree/odroidxu4-v4.7.
As you mentioned, I've been meaning to test 4.x for improved network throughput (30mb/s currently): https://github.com/Fourdee/DietPi/issues/414. So hopefully, 2 birds, 1 stone :+1:
Will post a download link when it's ready.
She's one hot board lol:
@joaofl I've compiled and hosted the tobetter 4.7.0 kernel (instructions https://github.com/Fourdee/DietPi/issues/414#issuecomment-243736813). Looks good here
@Fourdee thanks for the responsive and efficient support.
What's the amp rating of the drive (e.g. 750mA)?
from "lsusb -v" I get MaxPower 224mA
2.5 inch or 3.5 inch external drive?
2.5, host powered, with the original 3A power supply from Hardkernel. On some forums, I saw people able to power their external HD from the host USB with no problem. There are some issues related to dirt on the USB3 pins of the connector, but I have cleaned them to make sure. I thought it was a current limitation in the beginning, but even the Raspberry Pi with its 1.2A limit managed to power it up OK. But anyway, I can perform some more tests with a 5A power supply and see.
Or do you think this error is clearly lack of power? Anyway, it happens less often than the other reset warning.
[ 5004.907622] Sense Key : 0x3 [current]
[ 5004.907650] [c4] sd 0:0:0:0: [sda]
Otherwise I can upgrade to kernel 4.7.0 and check if it still happens.
I'll keep this post up to date.
Thanks again :+1:
@joaofl
from "lsusb -v" I get MaxPower 224mA
I just tested my 750mA drive powered down, idle, then with a file copy. I get same results each time:
root@DietPi-XU4:~# lsusb -v | grep MaxPower
MaxPower 0mA
MaxPower 180mA
MaxPower 180mA
MaxPower 0mA
MaxPower 24mA
MaxPower 0mA
MaxPower 0mA
MaxPower 100mA
MaxPower 0mA
MaxPower 0mA
MaxPower 0mA
https://www.amazon.co.uk/Elements-Portable-External-Drive-WDBU6Y0020BBK-EESN/dp/B00D0L5BH8 They claim it's USB 2.0 compatible, so it's <=500mA, but I'd imagine it's higher than 224mA when active.
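Note that `MaxPower` is the value each device declares in its USB configuration descriptor, not a live measurement, which would explain why it reads the same whether the drive is powered down, idle or copying. A minimal sketch that totals the declared values from `lsusb -v` text; the helper name `declared_max_power` is hypothetical, and the sample input is the output pasted above:

```python
import re


def declared_max_power(lsusb_text):
    """Return every MaxPower value (in mA) found in `lsusb -v` output."""
    return [int(v) for v in re.findall(r"MaxPower\s+(\d+)\s*mA", lsusb_text)]


# Output of `lsusb -v | grep MaxPower` as pasted in this thread
SAMPLE = """\
MaxPower 0mA
MaxPower 180mA
MaxPower 180mA
MaxPower 0mA
MaxPower 24mA
MaxPower 0mA
MaxPower 0mA
MaxPower 100mA
MaxPower 0mA
MaxPower 0mA
MaxPower 0mA
"""

values = declared_max_power(SAMPLE)
print("declared per device (mA):", values)
print("declared total (mA):", sum(values))
```

The total only reflects what the devices advertise. Measuring the real draw under load needs an inline USB power meter or a multimeter on the supply.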
original 3A power supply from hardkernel
The current HK PSU is 5V/4A? I had stability issues with that PSU and had to change it to a 5V/8A one, which has now run fine 24/7 for over 6 months.
Or do you think this error is clearly lack of power?
I'm pretty confident this is an insufficient or unstable power issue with the PSU. But let's see if the 4.7 kernel helps.
In general, if the hard drive is physically failing it can pull more power, causing the symptoms described. Most likely it is an insufficient power supply feeding the board, as already stated.
You guys were right, it was caused by lack of current. But what is strange is the fact that it was working smoothly for almost one year now... It may be some aging issues.
I had one of these 10A power supplies (cheap but good enough) lying around here, and decided to test it.
It has a small knob for minor voltage adjustments. I initially set it to 5.00V (5 sharp according to my cheap multimeter, plus or minus its error), but the HD had trouble spinning up. I set it to 5.2V and the HD worked fine, with no errors for almost one hour, but then the XU4 suddenly shut down due to overheating, something that had never happened before. I dropped it down to around 5.1V and noticed it running a bit cooler, but still limited the CPU clock to 1600MHz.
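To keep an eye on temperature while experimenting with the supply voltage, the kernel's thermal zones can be polled. A rough sketch, assuming the generic Linux sysfs layout (`/sys/class/thermal/thermal_zone*/temp`, values in millidegrees Celsius); older vendor kernels for the XU4 may expose the sensors elsewhere, so treat the path as an assumption. The function name `read_zone_temps` is made up for illustration:

```python
from pathlib import Path


def read_zone_temps(root="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under root."""
    temps = {}
    for temp_file in sorted(Path(root).glob("thermal_zone*/temp")):
        try:
            millideg = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or not reporting a plain number
        temps[temp_file.parent.name] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for zone, deg in read_zone_temps().items():
        print(f"{zone}: {deg:.1f} C")
```

Running it in a loop during a stress test shows whether a higher supply voltage really makes the board run hotter.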
It has now been running for 12h with no problems at all. So I would say the problem has been solved, although, @Fourdee, I will perform those tests with the new kernel as well.
Thanks guys
@Fourdee After some long time, I believe I have some bad news regarding this.
A few weeks ago I finally tried to migrate to kernel 4.14 (from my current setup running kernel 3.xxx) using the exact same hardware setup, with the latest DietPi. The setup is:
SD card for boot
SSD on a powered USB3 adapter with the filesystem, for better performance than the SD card
HD on USB3, externally powered (where all the junk is)
So it started well. First stress tests were all ok. That is, some intense IO on the HD simultaneously with some network transfer. Something like copying a big file from and to the HD from another computer via gigabit network.
Then, first thing, before installing apps, I moved the filesystem to my external USB3 SSD using the DietPi script, as I had done before. Then I ran the same "stress test" again, and this is when the problem arose:
usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd
The exact same issue: the needle ticking and this reset. Then I downgraded to the 3.x kernel, and the exact same tests worked fine.
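To compare kernels more objectively, the reset messages can be counted per USB port from saved `dmesg` output. A minimal sketch, assuming the message format shown in this thread; the names `RESET_RE` and `count_usb_resets` are hypothetical:

```python
import re
from collections import Counter

# Matches e.g. "usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd"
RESET_RE = re.compile(r"usb (\S+): reset (\S+) USB device number (\d+) using (\S+)")


def count_usb_resets(log_text):
    """Count 'reset ... USB device' kernel messages per USB port path."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Line copied from the kernel log earlier in this thread
sample = "[ 5152.019254] [c0] usb 4-1.2: reset SuperSpeed USB device number 3 using xhci-hcd"
print(count_usb_resets(sample))
```

Feeding it the `dmesg` output captured after the same stress run on each kernel gives a simple number to compare.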
One way of avoiding the problem is to not migrate the filesystem to USB. But if you do, you may hit the same issue. I tested it on both of my Odroid XU4s.
I also discussed this issue extensively on the Odroid forums, but the folks there have their filesystem on the SD card or eMMC. They are not necessarily using the tools you provide to transfer the filesystem to USB storage. I believe it is reproducible with any pair of USB3 devices.
Do you have any thoughts on that? Sorry to bother you with this issue again, but I'm really feeling left behind without the latest updates. Cheers
@joaofl
The exact same issue: the needle ticking and this reset. Then I downgraded to the 3.x kernel, and the exact same tests worked fine.
Sounds like 4.14 isn't providing enough power to the USB port, or the XU4 isn't getting enough power from the PSU, or the external HDDs are not getting enough power, or 4.14 is incompatible with the USB controller in the external HDD caddy. Nightmare basically lol.
OK, even though it works on 3.x, the XU4 needs at least a 5V/6A PSU in general, especially when using external HDDs, regardless of whether they are bus powered or not (even with externally powered USB drives, you'd need to verify whether the bus is actually providing the power).
A quick but costly solution would be to purchase the HC1/2, which has much better support for attached SATA drives. I've been running my HC1 for months with a 64GB SATA SSD and the rootFS transferred to it.
Aside from that, it literally could be anything (kernel/hardware), and a long road to debug.
@Fourdee I really believe it is not power related, as I have tested every cable/power supply combination possible. Believe me, I have loads of them. Even with a computer power supply, with some 10A+ available.
Moreover, if the filesystem is on the SD card and I run the stress test with kernel 4.14, hitting the SSD, the HD and the network intensively at the same time, I get no errors. The issue only arises after the filesystem gets copied to the SSD, which is why I believe it is a kernel issue. Hard to tell.
ps: I was tempted to buy the HC1/2, but since they started the development of that even cooler board, I decided to hold. You are more lucky than I, since you get those for free :)
I posted this on the Odroid forums as well. Let's see if anybody else can reproduce it.
@joaofl
You are more lucky than I, since you get those for free
😄 + import tax £30 😉
I'm using an XU4 with USB3, connected to a 2TB WD Elements external HD. Whenever it gets stressed, it sporadically resets, together with an audible "tick" coming from it (which causes bad blocks, a shorter life span, and consequently data loss).
Searching around, the best clue I found was this: http://forums.debian.net/viewtopic.php?f=7&t=117061, which points to a bug in the driver for the drive's SATA-to-USB3 converter: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a9c54caa456dccba938005f6479892b589975e6a
They claim to have fixed the issue on kernel version 3.17. I wonder if there is a fix for that, or means to upgrade the kernel to get that fixed. I'm afraid I'll lose data one of these days.
Maybe this is one more reason for: https://github.com/Fourdee/DietPi/issues/414
Thanks
Below is the error.