pcengines / apu2-documentation

Documentation and scripts for building and adjusting PC Engines APU2 firmware
https://pcengines.github.io/apu2-documentation/

Quectel EC25-AU stopping with error msg 'device descriptor read/64, error -110' on Linux distro (OpenWrt 19.07). #232

Open tduque opened 3 years ago

tduque commented 3 years ago

Hey guys,

After several hours of uptime, the EC25-AU stops responding to any request, even after a reboot of the APU. It doesn't respond to PCI power state changes (i.e. changing it from D0 to D3 and back to D0), and none of the hot/warm resets I have tested could bring the EC25-AU back to life. So far, the only way to recover it is to power the system off completely and then do a cold boot. I also tried setting reboot=c or reboot=h on the Linux command line to force a cold reboot, but without success.

I have only seen this behavior when the module runs for long periods without any LTE signal; in applications where it has constant activity, it keeps working for several days. I suspect it gets into some unexpected state after an autosuspend or something similar, but I don't know how to confirm or rule out this hypothesis. I also tried setting autosuspend to -1, which according to the Linux kernel documentation should disable autosuspend, but the issue still happens. It may be that I haven't set all the configuration needed to keep it from entering the suspend state, or that suspend isn't the main cause of this issue at all.
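For anyone wanting to double-check the autosuspend settings mentioned above, here is a minimal sketch, not a confirmed fix. It assumes the modem enumerates with Quectel's USB vendor ID `2c7c` and that sysfs is mounted in the usual place; the function name `pin_modem_awake` and the `SYSFS_USB` / `QUECTEL_VID` variables are made up for illustration. Writing `on` to a device's `power/control` file tells the kernel not to runtime-suspend that device.

```shell
#!/bin/sh
# Hypothetical helper: disable runtime power management for every USB device
# whose vendor ID matches the modem. Both variables are illustrative and can
# be overridden from the environment.
SYSFS_USB="${SYSFS_USB:-/sys/bus/usb/devices}"
QUECTEL_VID="${QUECTEL_VID:-2c7c}"   # assumed vendor ID, check with lsusb

pin_modem_awake() {
    for dev in "$SYSFS_USB"/*; do
        # Interface entries (e.g. 1-1:1.0) have no idVendor file; skip them.
        [ -r "$dev/idVendor" ] || continue
        [ "$(cat "$dev/idVendor")" = "$QUECTEL_VID" ] || continue
        # "on" keeps the device powered; "auto" would allow autosuspend.
        echo on > "$dev/power/control"
        printf '%s: control=%s\n' "${dev##*/}" "$(cat "$dev/power/control")"
    done
}
```

Calling `pin_modem_awake` as root after boot prints one line per matching device. The global default can also be inspected with `cat /sys/module/usbcore/parameters/autosuspend`, where `-1` means autosuspend is disabled for devices attached afterwards.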

I see this issue on APU2 and APU3 boards with different BIOS versions, and I have no idea how to solve it. When I test the same module in a WE826-T2 router from Zhibotong, the issue does not occur.

I hope someone can point me in the right direction. Below is a link to some additional information I posted on Quectel's forum.

https://forums.quectel.com/t/ec25-au-struggling-with-device-descriptor-read-64-error-110-on-linux-distro-openwrt-19-07/6444

Best regards, Tiago Duque.

pietrushnic commented 3 years ago

@miczyg1 can you comment on that?

miczyg1 commented 3 years ago

Hey @tduque ,

We will look into it; however, without an EC25-AU module it will be difficult to assist. We have already faced weird issues with LTE modems on various platforms, so this is probably not a unique problem.

Could you possibly paste any error logs you encounter, please? I see a message in the issue title which could be helpful.

Thanks.
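One way to collect the requested logs, as a rough sketch: filter the kernel log down to USB enumeration failures like the one in the issue title. The function name `usb_enum_errors` is invented here; error -110 is `ETIMEDOUT` on Linux, meaning the device did not answer the descriptor request in time.

```shell
#!/bin/sh
# Hypothetical filter: reads kernel log text on stdin and keeps only
# USB "device descriptor read" failures, whatever the error code.
usb_enum_errors() {
    grep -E 'device descriptor read/[0-9]+, error -[0-9]+'
}
```

Typical use would be `dmesg | usb_enum_errors`, or `logread | usb_enum_errors` on OpenWrt, with the output pasted into the issue.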

michaelsteinmann commented 3 years ago

Hello,

The miniPCIe slots are powered by V3A which is always on as long as power is supplied to the board. To me it seems the firmware in your EC25-AU hangs and should be updated if possible.

Not sure if this is of any help:
https://osmocom.org/projects/quectel-modems/wiki/EC25_QFlash
https://osmocom.org/projects/quectel-modems/files
https://osmocom.org/projects/quectel-modems/wiki/EC25_Linux

Note the files at the end of the page:
ec25-firmware.tar.bz2 (29.6 MB): archive of all files in / (except sys/dev/proc/firmware); laforge, 10/06/2016 07:05 PM
ec25-rootfs.tar.bz2 (25.8 MB): archive of /firmware; laforge, 10/06/2016 07:06 PM

Best regards, Michael Steinmann

tduque commented 3 years ago

For some reason I posted the wrong link to my topic on Quectel's forum. Please take a look at this topic, which has some logs in it.

https://forums.quectel.com/t/ec25-au-struggling-with-device-descriptor-read-64-error-110-on-linux-distro-openwrt-19-07/6444

tduque commented 3 years ago

I will give it a try. Thanks for the information.

tduque commented 3 years ago

Over the past few days, I found out how to perform a cold reset on the APU, as described in this documentation. After executing the command:

echo -ne "\xe" | dd of=/dev/port bs=1 count=1 seek=$((0xcf9))

the Quectel module recovered from the frozen state. Rebooting the entire system with a cold reset is a heavy-handed way to get the modem back online.
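To make the magic number in that command less opaque, here is a sketch that builds the same byte from named bits. It assumes the usual x86 layout of the reset control register at I/O port 0xCF9 (bit 1 selects a hard reset, bit 2 triggers it, bit 3 requests a full power cycle); the bit names and the helper functions are illustrative, so check the chipset documentation before relying on them.

```shell
#!/bin/sh
# Assumed bit layout of the reset control register at port 0xCF9.
SYS_RST=$((1 << 1))    # 0x02: hard reset instead of soft reset
RST_CPU=$((1 << 2))    # 0x04: trigger the reset
FULL_RST=$((1 << 3))   # 0x08: power-cycle the platform during the reset

# Hypothetical helper: prints the combined value in decimal (0x0E).
cold_reset_value() {
    echo $((SYS_RST | RST_CPU | FULL_RST))
}

# Hypothetical helper: writes that byte to port 0xCF9, like the command above.
# WARNING: this resets the machine immediately, without syncing filesystems.
cold_reset() {
    val=$(cold_reset_value)
    # Emit the byte through an octal escape, which POSIX printf supports.
    printf "\\$(printf '%03o' "$val")" |
        dd of=/dev/port bs=1 count=1 seek=$((0xcf9))
}
```

`cold_reset_value` prints 14, the same 0x0E that `echo -ne "\xe"` produces. Running `sync` before `cold_reset` is advisable, since nothing gets a chance to shut down cleanly.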

It would be great if I could perform the same cold reset on the mPCIe device only. I tried the kernel user API features (kernel documentation) and the setpci utility to write to the PCI bus, but none of them could bring the Quectel module back to an operational state. It would be even better if it didn't freeze at all, but at least I can now recover it with an APU cold reset.

metux commented 1 year ago

It would be great if I could perform the same cold reset only on the mPCIe device (I tried it using the kernel user API features (kernel documentation), or using the setpci function to write into the PCI bus), but none of them could bring the Quectel's module back to an operational state.

These are probably too high-level for your case. You can issue a link reset, but that doesn't mean the actual device is reset, just the PCI channel.

Depending on the exact board model and revision (there is no runtime data on the actual hardware revision, so you have to try it out), you could try the RST GPIO lines, which I added to the pcengines-apu2 driver for exactly this reason. These are supposed to reset the module in the slot (that's why they aren't linked to the PCI subsystem: we don't have a corresponding abstraction for slots yet).

Unfortunately, it's hard to find robust, well-implemented basebands that are also affordable. Maybe you can just play around with the RST lines and report back.
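A possible starting point for experimenting with the RST lines, as a sketch under several assumptions: the libgpiod v1 command-line tools (`gpiofind`, `gpioset`) are installed, the driver exposes the line under a name such as `mpcie2_reset` (check `gpioinfo` for the names actually present on your board), and the reset is active-low. The function `pulse_reset_line` and the `RUN` variable are made up for this example.

```shell
#!/bin/sh
# Set RUN=echo to print the gpioset commands instead of executing them.
RUN="${RUN:-}"

# Hypothetical helper: drive a named GPIO line low for a while, then high.
# $1 = line name (assumed, e.g. mpcie2_reset), $2 = hold time in seconds.
pulse_reset_line() {
    line=$1
    hold=${2:-1}
    # gpiofind prints "<chip> <offset>" for a named line.
    loc=$(gpiofind "$line") || { echo "line $line not found" >&2; return 1; }
    chip=${loc% *}
    offset=${loc#* }
    # Assumed active-low: assert reset by driving 0 for $hold seconds...
    $RUN gpioset --mode=time --sec="$hold" "$chip" "$offset=0"
    # ...then release it by driving 1.
    $RUN gpioset "$chip" "$offset=1"
}
```

A dry run such as `RUN=echo; pulse_reset_line mpcie2_reset 2` shows what would be executed. If the modem does not re-enumerate afterwards, try the other slot's line or the opposite polarity, since which line maps to which slot depends on the board revision.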