the-modem-distro / pinephone_modem_sdk

Pinephone Modem SDK: Tools to build your own bootloader, kernel and rootfs
GNU General Public License v3.0

Pinephone modem disappears, needs to be restarted manually #90

Open tiol11 opened 2 years ago

tiol11 commented 2 years ago

Hi, and thank you for the great work so far!

Over the last few days the modem has been disappearing at random on my Pinephone 3GB with Mobian Bookworm and 0.6.1 firmware. I press the power button, the phone resumes from suspend and shows the Phosh lockscreen, but there is no network strength indicator next to the battery. The indicator is also missing from the Phosh top bar, which shows that there is no data connection.

I have to run sudo systemctl restart eg25-manager to restart the modem (as explained in the 0.6.1 release changelog) and then wait 10-20 seconds for the modem to come back. This is slightly annoying because I try not to check the PP lockscreen often, to save battery, so I cannot tell how long the modem has been disconnected.

It happened today and I tried to collect relevant logs with the script. Not sure if collected logs are useful, as I got a few errors.

$ sudo ./pinephone_modem_collect_logs.sh
error: no devices/emulators found
error: no devices/emulators found
adb: error: failed to get feature set: no devices/emulators found

dmesg-modem.txt is empty and openqti.log is missing; please find the others attached: networkmanager.log modemmanager.log eg25-manager.log dmesg-pinephone.txt

vmaurin commented 2 years ago

The power settings here could help https://github.com/Biktorgj/pinephone_modem_sdk/blob/honister/docs/SETTINGS.md#pinephone

tiol11 commented 2 years ago

@vmaurin sorry I forgot to specify, my /usr/lib/udev/rules.d/80-modem-eg25.rules was as suggested in the recommended settings

Biktorgj commented 2 years ago

The modem crashed at the 3468 second mark:

[ 3468.240036] option 2-1:1.0: GSM modem (1-port) converter detected
[ 3468.247101] usb 2-1: GSM modem (1-port) converter now attached to ttyUSB0
[ 3470.850050] option1 ttyUSB0: GSM modem (1-port) converter now disconnected from ttyUSB0
[ 3470.860217] option 2-1:1.0: device disconnected
[ 3470.868347] option 2-1:1.0: GSM modem (1-port) converter detected
[ 3470.876079] usb 2-1: GSM modem (1-port) converter now attached to ttyUSB0

If you resume the phone and the modem is gone, and you see that there's only /dev/ttyUSB0 instead of the usual 4 (ttyUSB0-3), there's a good chance that the modem kernel had a kernel panic, so it makes sense that adb doesn't work either...

I'm finishing some things to get the 0.6.4 release out, which will bring, among other things, persistent storage support. That will come in handy to debug stuff like this. In the meantime, maybe you can move to 0.6.3: even if it is marked as pre-release, it fixes quite a lot of stuff and might help determine whether the bug you hit was already fixed
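The port check described above can be sketched as a small shell helper. This is only an illustration, not part of the SDK: the function names count_modem_ports and modem_port_state are made up here, and the threshold of 4 ports is taken from the comment above (it may differ on other setups).

```shell
#!/bin/sh
# Hypothetical helpers (not from the SDK): count the ttyUSB nodes in a
# directory and classify the modem state from that count.

# Count ttyUSB* entries in the given directory (default: /dev).
count_modem_ports() {
    dev_dir="${1:-/dev}"
    count=0
    for node in "$dev_dir"/ttyUSB*; do
        # When the glob matches nothing, the pattern itself comes back; skip it.
        [ -e "$node" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Print "absent" (no ports), "ok" (4 or more, the usual ttyUSB0-3),
# or "degraded" (fewer than 4, e.g. only ttyUSB0 after a modem crash).
modem_port_state() {
    ports="$(count_modem_ports "$1")"
    if [ "$ports" -eq 0 ]; then
        echo "absent"
    elif [ "$ports" -ge 4 ]; then
        echo "ok"
    else
        echo "degraded"
    fi
}
```

After resuming the phone, running modem_port_state with no argument inspects /dev; a result of "degraded" would match the single-ttyUSB0 situation described above.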

EDIT: Please try with latest (https://github.com/Biktorgj/pinephone_modem_sdk/releases/tag/0.6.4) and let me know if it still happens

tiol11 commented 2 years ago

Thank you! I installed version 0.6.4 today and will keep you updated.

For debugging purposes, should I enable any of the following?

enable tracking
enable persistent logging

Would any of these help you to debug in case the modem disappears again?

tiol11 commented 2 years ago

Sorry, it crashed again today.

$ sudo ./pinephone_modem_collect_logs.sh
* daemon not running; starting now at tcp:5037
* daemon started successfully
error: no devices/emulators found
error: no devices/emulators found
adb: error: failed to get feature set: no devices/emulators found

dmesg-modem.txt is empty and openqti.log is missing. networkmanager (1).log modemmanager (1).log eg25-manager (1).log dmesg-pinephone (1).txt

Maybe a silly question: any chance that forcing the Pinephone to suspend with sudo systemctl suspend interferes with the modem? I was out for a 3h bike ride with half battery left, then I suspended it to save some charge.

Biktorgj commented 2 years ago

It depends. If you do it just when eg25-manager is uploading AGPS data, for example, it might cause some problems, but otherwise it should be safe. We're going to need at least openqti.log, so please enable persistent logging (and ADB, to be able to pull the file). Then, when it crashes, after rebooting you can get the previous logfile by running (as root):

adb pull /persist/openqti.log.1

Please inspect the file before uploading in case it contains phone numbers or personal information!
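One rough way to do that inspection is to mask long digit runs before sharing the file. The sketch below is an assumption-laden example, not an SDK tool: the function name redact_numbers is invented here, and treating any run of 7 or more digits (optionally prefixed with +) as a phone number is a guess that can both over-redact and miss things, so a manual read-through is still needed.

```shell
#!/bin/sh
# Hypothetical helper: replace anything that looks like a phone number
# (an optional "+" followed by 7 or more digits) with a placeholder.
# Reads the named files, or standard input when no file is given.
redact_numbers() {
    sed -E 's/\+?[0-9]{7,}/[REDACTED]/g' "$@"
}
```

For example, redact_numbers openqti.log.1 > openqti.redacted.log writes a masked copy and leaves the original untouched. Short numbers such as kernel timestamps are not affected by this pattern, but long numeric IDs would be masked too.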

tiol11 commented 2 years ago

Sorry I have probably misunderstood your instructions: after rebooting the modem after a crash, I get

$ sudo adb pull /persist/openqti.log.1
[sudo] password di marco:
* daemon not running; starting now at tcp:5037
* daemon started successfully
adb: error: failed to get feature set: no devices/emulators found

Even by killing and re-starting adb, the "failed to get feature set: no devices/emulators found" error remains...

Biktorgj commented 2 years ago

Did you enable ADB? It is disabled by default :) As root, run this:

echo -ne "AT+ADBON\r\n" > /dev/ttyUSB2

The modem will disappear for a few seconds, then reconnect. Also, make sure you either add yourself to the dialout group if you want to start ADB from your user or, the first time you run an adb command, do it as root; otherwise it'll tell you that you don't have enough permissions

tiol11 commented 2 years ago

Ok, thank you for the explanation. By the way, it did not work with echo -ne "AT+ADBON\r\n" > /dev/ttyUSB2, got something like /dev/ttyUSB2 resource busy. But I think it worked with sudo mmcli -m any --command='AT+ADBON', taken from AT_INTERFACE.md .

Not sure if it's relevant, but Pinephone was showing no network signal in a probably well shielded place, then half an hour later found no modem at all. If it happens again, I will also include logs, in the remote case we are facing different issues...

I extracted openqti.log with the adb command you wrote, then collected the other logs with the usual script, whose openqti.log was almost empty. openqti.log networkmanager.log modemmanager.log eg25-manager.log dmesg-pinephone.txt dmesg-modem.txt

tiol11 commented 2 years ago

Today the modem disappeared 3 times with version 0.6.6-b0. I have seen no overheating SMS, so I am quite confident mine is not an overheating case. 01_networkmanager.log 01_modemmanager.log 01_eg25-manager.log 01_openqti.log 01_dmesg-pinephone.txt 01_dmesg-modem.txt

02_networkmanager.log 02_modemmanager.log 02_eg25-manager.log 02_openqti.log 02_dmesg-pinephone.txt 02_dmesg-modem.txt

03_networkmanager.log 03_modemmanager.log 03_eg25-manager.log 03_openqti.log 03_dmesg-pinephone.txt 03_dmesg-modem.txt

The third time, restarting the eg25-manager service took about 5 minutes, then the phone stopped responding to the power button: either it shut itself off, or it lost the ability to turn on the screen. I had to force a reset, so the third log set may be useless.

tiol11 commented 2 years ago

Again today... it happens something like 2-3 times a week, and I do not always remember to collect and upload logs: dmesg-modem.txt dmesg-pinephone.txt eg25-manager.log openqti.log modemmanager.log networkmanager.log

bakerk98 commented 2 years ago

For me this was happening very frequently after extended periods of inactivity in the phone. Interestingly, it didn't happen when using postmarketos Phosh, but DID happen when using postmarketos SXMO.

ANYWAY, I made a script that may be useful as a temporary patch: https://github.com/bakerk98/eg25-modem-cron-job. It uses SXMO's wake up and cron feature, so SXMO may be a dependency, but it's something. I am still sort of testing it because I just made it, but so far it has run in the background successfully a few times, so it should work
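For readers who want the general idea without SXMO, a watchdog along these lines could be sketched as follows. This is not the linked script: the function names are invented for illustration, and the check assumes that mmcli -L prints a /org/freedesktop/ModemManager1/Modem/N object path for each modem it knows about. A plain cron job will also not fire while the phone is suspended, which is presumably why the linked script relies on SXMO's wake up feature.

```shell
#!/bin/sh
# Hypothetical watchdog sketch (not the script linked above).

# Succeeds if the `mmcli -L` output on standard input lists a modem.
modem_listed() {
    grep -q '/org/freedesktop/ModemManager1/Modem/'
}

# Restart eg25-manager when ModemManager reports no modem.
# Needs root because of `systemctl restart`.
check_and_restart_modem() {
    if mmcli -L 2>/dev/null | modem_listed; then
        return 0
    fi
    logger -t modem-watchdog "modem missing, restarting eg25-manager"
    systemctl restart eg25-manager
}
```

Calling check_and_restart_modem from a root cron entry every few minutes would automate the manual restart described earlier in this thread, at the cost of hiding how often the modem actually disappears unless the logger output is reviewed.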

Biktorgj commented 2 years ago

Hi @tiol11, are you still having this issue? I've found no way to replicate this so far, so I was thinking of maybe getting you some sort of one-off build to try to get better logs (especially so we don't lose the modem kernel logs, which should show something before dying, but right now are getting lost)

tiol11 commented 2 years ago

To be honest, it has not happened for the last 1-2 weeks... any chance you solved this with the 0.6.7 release?

I would like to keep this under review for some more time. In the same period in which the modem has stopped crashing, I have seen the entire phone rebooting or the desktop environment crashing for no particular reason. I am just wondering whether those are masking the modem crash, since I have to completely reboot the phone...

tiol11 commented 2 years ago

Modem happened to disappear again this morning. networkmanager.log modemmanager.log eg25-manager.log openqti.log dmesg-pinephone.txt dmesg-modem.txt

The frequency of this issue has now decreased from 2-3 times a week to once or twice a month. As I wrote, I do not know if this is related to the new 0.6.7 FW release or to anything else that changed in the meantime in the network-manager/modemmanager/eg25-manager/modem FW chain...

If you are fine with it and if the attached logs give any clue, I would like to keep this issue open, as an opportunity to increase FW stability. I am also willing to install a "debug build", if that's useful...

Biktorgj commented 2 years ago

> If you are fine with it and if attached logs give any clue, I would like to keep this issue open, as an opportunity to increase FW stability. I am also willing to install "debug build", if that's useful...

We'll keep it open as long as it takes, it's not normal to crash like that and it doesn't seem to happen to other people (or they're not reporting it). Can't see anything in those logs though, so I'm going to have to get creative with this to get some logs that can tell us what the heck is going on

Biktorgj commented 2 years ago

Let's do a quick-n-dirty one... Attaching a boot-only firmware build:

19c4c59b29ca71c841d5401dd8a1d47993e218211d0b9f8163331e354070c6e40ac879020b1f83a82afbafdecc67fe38ef840245fb217507a03709258a23359e boot-rootfs.img

With the phone freshly booted, run, as root:

adb shell reboot ; fastboot oem stay ; fastboot boot boot-rootfs.img

This should boot you to a build with version 0.6.8-b3. Whenever openqti starts, it will append the contents of dmesg to /persist/kernel_log
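The hex string posted with the build is 128 characters long, which is the length of a SHA-512 digest, so it is presumably the checksum of boot-rootfs.img. Assuming that, the image could be checked before booting it with a helper like the one below (verify_sha512 is a name made up for this sketch, and the algorithm is inferred from the digest length, not stated in the comment).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the SHA-512 digest of a file
# matches the expected hex string. Assumes GNU coreutils' sha512sum.
verify_sha512() {
    file="$1"
    expected="$2"
    actual="$(sha512sum "$file" | awk '{print $1}')"
    [ "$actual" = "$expected" ]
}
```

Usage would look like verify_sha512 boot-rootfs.img "<the hex string from the comment>" && echo "checksum matches", run after unpacking the attached archive.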

My theory is that openqti is crashing twice in a row ending up in a kernel panic. We probably don't have enough time for the filesystem to sync between the second crash and the halt, but worth a try to check if at least my assumption is correct (we should see some messages informing of devices being closed after the 30 second mark in dmesg). When the thing crashes again, just let it boot, and get the latest log with adb pull /persist/kernel_log and adb pull /persist/openqti.log.1
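To look for the messages after the 30 second mark mentioned above, the pulled kernel_log can be filtered by its dmesg timestamps. The helper below is only a sketch: lines_after_mark is an invented name, and it assumes every relevant line starts with the usual "[ seconds.micros]" dmesg prefix (lines without such a prefix are dropped).

```shell
#!/bin/sh
# Hypothetical helper: print only the dmesg-style lines whose timestamp
# (the number inside the leading square brackets) is greater than MARK.
# Usage: lines_after_mark MARK [FILE...]   (reads stdin without FILE)
lines_after_mark() {
    mark="$1"
    shift
    # Splitting on "[" and "]" puts the timestamp in field 2.
    awk -F'[][]' -v mark="$mark" '$2 + 0 > mark + 0' "$@"
}
```

With the file pulled as described, lines_after_mark 30 kernel_log would show what the modem kernel logged once the early boot phase was over.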

boot-rootfs.tar.gz

tiol11 commented 2 years ago

Finally I have something! I had to append ".log" at the end of filenames otherwise Github would not let me upload. openqti.log.1.log kernel_log.log

A bit of context: I usually get 71°C SMS from the modem when I commute back in the afternoon. I connect the PP to charger to avoid it sleeping while it plays podcasts through Bluetooth to the car audio system. Today I did not get those messages, but you can see around 37000 timestamp the temperature increasing, when I get out of office and into my car. Commuting back home takes 30-45' but openqti.log stopped something like 15 minutes later, while other PP peripherals (e.g BT) continued their task without issues.

The first time I tried adb-pulling I got an error like "insufficient permission for device", "is user in plugdev group?", "check udev rules"... then I ran adb kill-server and sudo adb start-server as suggested e.g. here, and then adb-pulling worked. Not sure if this affects the extracted logs...

Please let me know if logs give any clue or if we should try different approach... thank you!

tiol11 commented 2 years ago

Quite strange, nothing for 3 weeks and then twice in the same day... openqti.log.1.log kernel_log.log

dasoe commented 1 year ago

Hello Biktorgj,

First of all: Thanks a lot for your awesome job. With your software the Pinephone became usable for me. And it's done incredibly well (IMHO it is actually even well documented, which is important for people like me who are not deep into the topic. Thanks for that, too).

So just for the record: I have the same problem. The modem still disappears sometimes, and it is hard to predict. Power settings (/usr/lib/udev/rules.d/80-modem-eg25.rules) are changed according to your suggestion, which already improved the situation. I am on manjaro phosh, installed a year ago but updated regularly. I read the above posts, updated your software from 0.65 to 0.68 yesterday and will report back.

As I understand it, ./pinephone_modem_collect_logs.sh will collect the logs you need to check what's happening? I used the SMS interface to send enable persistent logging. One question remains: will I have to enable ADB only once, or again after each restart?

have a great day! oe

tiol11 commented 1 year ago

Today I was able to collect some more logs. There were a few more occasions when eg25-manager needed a restart, but I did not have the collect-logs script at hand... openqti-20220927_192339.log.1.log openqti-20220927_192339.log networkmanager-20220927_192339.log.1.log networkmanager-20220927_192339.log modemmanager-20220927_192339.log.1.log modemmanager-20220927_192339.log eg25-manager-20220927_192339.log.1.log eg25-manager-20220927_192339.log dmesg-pinephone-20220927_192339.txt dmesg-modem-20220927_192339.txt

My personal feeling is that this behaviour is somehow related to bringing the modem to places with poor cellular connection or where it struggles to attach to a cell tower, like in the middle of the mountains or the countryside.

PsychoGame commented 1 year ago

@tiol11, I can confirm that your feeling is totally right. I work as a marine engineer on board a seagoing vessel, so while at sea my modem is unable to connect to any cell towers. This also makes the modem disappear completely on some occasions. A restart of eg25-manager re-enables the modem for me as well. At home I have stable cellular reception, and I do not observe this behavior there. So indeed it seems that the modem will eventually drop out when it's in places with really bad or no reception. I can't really think of any reason why the modem would behave this way though; maybe it's programmed somewhere that it only does so many retries, and then just stays dormant or something like that.

tiol11 commented 1 year ago

openqti-20220928_184731.log.1.log openqti-20220928_184731.log networkmanager-20220928_184731.log.1.log networkmanager-20220928_184731.log modemmanager-20220928_184731.log.1.log modemmanager-20220928_184731.log eg25-manager-20220928_184731.log.1.log eg25-manager-20220928_184731.log dmesg-pinephone-20220928_184731.txt dmesg-modem-20220928_184731.txt

Once I forgot to shut down the modem while biking in the mountains, and the PP battery was empty after half a day with the phone idle. I did the same some days later, but after stopping eg25-manager, modemmanager and networkmanager, and the battery was at about 90% at lunch break. It really looks like the modem either drains the whole battery trying to connect to the cellular network, or it disables itself...

Biktorgj commented 1 year ago

Something is dying. At this point I'm not even sure it's my fault, since in all the logs @tiol11 has managed to provide, there's absolutely no sign of a crash from Openqti, which is the only thing that could cause that from the userspace.

Faster battery draining from not having network is expected, since the modem will be looking for the network and trying to register all the time (the same as if you do a big trip and it's constantly switching towers; I'd lose about half the battery in 600km in my OP8T just from that).

The issue is how to actually replicate that to check. The only thing that I could try is to disconnect the antennas from the phone and keep it powered on for hours to see if I can replicate it.

@dasoe to answer your questions: Once enabled, ADB will be automatically started until you manually disable it. Note that enable persistent logging doesn't enable ADB automatically, you need to do that separately.

@tiol11 In all this time, have you tried with different ADSP versions? We've always assumed it was openqti at fault, but if it's not and for whatever reason version .003 has some lingering bug somewhere, maybe trying .006 while I do the no-antenna thing in my Pinephone and keep it like that for a few days could be a good experiment too

PsychoGame commented 1 year ago

@Biktorgj, I can confirm this problem exists at least on the 30.006.30.006 ADSP firmware, which is what I'm currently using. I haven't tried switching firmware to check whether this problem exists on the other firmwares as well. I did update the ADSP each time newer files were available, but since this bug report I have started paying more attention to whether this might be related to bad reception. It certainly looks like it, as I experience more frequent modem crashes on board than at home, where I have next to none, to be honest.

tiol11 commented 1 year ago

As far as I remember, I gave ADSP 30.006 a try in the past, but cannot remember the reason I came back to 01.003.01.003. I will give 30.006 another try, then maybe I will try also the other two ADSP versions in your repo...

tiol11 commented 1 year ago

Gave both 30.006 and 01.002 a try: I had to restart the modem at least once with both. Maybe I will try 30.004, too, but I do not see the reason for Quectel developers to fix a bug and later re-introduce it in a later release...

mutlusun commented 1 year ago

Hello, I'm also affected by this issue with different versions of the open firmware and different variants of ADSP versions. When my modem disappears, other people tell me that they can call me and get a normal ring tone (like when the modem is working but nobody answers). Maybe this observation helps to locate the issue further?

Thank you all for your work and your effort!

tiol11 commented 1 year ago

On latest releases, it is possible to have cell towers tracked: I'm wondering if this could help going on with our analysis.

I had to drive a long trip, then I gave tracking a try. PP was connected via Bluetooth to the car's audio system, so I could easily check network status: I noticed network was lost at 19.06 (maybe 19.05 and I realized it one minute late). Do these cells information in the screenshot help in some way? (modem hung somewhere ca.50km south of Verona, northern Italy. I'm aware that cell info can disclose my position, but this case is not my home location, I was just driving by...)

Screenshot_20230318_205226

bakerk98 commented 1 year ago

I'm also starting to think it may be due to signal to the modem, as I have observed the same patterns of great stability at home, and instability as I move around during the day