meshtastic / firmware

Meshtastic device firmware
https://meshtastic.org
GNU General Public License v3.0
3.35k stars 816 forks source link

GPS: short update intervals, lock-time prediction #4070

Closed todd-herbert closed 3 months ago

todd-herbert commented 3 months ago

Addresses #4067

Currently, a position.gps_update_interval of 10 seconds or less is not correctly handled. This PR refactors the GPSPowerState enum to mark this case, and prevent back-to-back position updates occurring.

Pretty sure this fixes it, but please do double-check my work, so we don't have round-two of this issue..

GPSFan commented 3 months ago

I basically see the same behavior as before. Greater than 10 sec sleeps for the requested time, <= 10 sec continuous rapid updates at >= 1/sec. Serial log looks identical to what I posted in the bug report. I'll have to look at the code closely, it's more complicated than I thought.

GPSFan commented 3 months ago

Not quite identical to the previous log. Notice GPS lock is 1 but no sats or position, yet I am watching the NMEA output and there are lots of sats being tracked. new code:

DEBUG | 20:55:15 8012 [GPS] GPS Lock took 8012, average 19 DEBUG | 20:55:15 8012 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:15 8012 [GPS] onGPSChanged() pos@0, time=1718052915, lat=0, lon=0, alt=0 INFO | 20:55:15 8012 [GPS] updatePosition LOCAL pos@0, time=1718052915, latI=0, lonI=0, alt=0 DEBUG | 20:55:15 8012 [GPS] Setting local position: latitude=0, longitude=0, time=1718052915, timestamp=0 DEBUG | 20:55:15 8012 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:16 8012 [GPS] WANT GPS=0 DEBUG | 20:55:16 8012 [GPS] GPS Lock took 8012, average 20 DEBUG | 20:55:16 8012 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:16 8012 [GPS] onGPSChanged() pos@0, time=1718052916, lat=0, lon=0, alt=0 INFO | 20:55:16 8012 [GPS] updatePosition LOCAL pos@0, time=1718052916, latI=0, lonI=0, alt=0 DEBUG | 20:55:16 8012 [GPS] Setting local position: latitude=0, longitude=0, time=1718052916, timestamp=0 DEBUG | 20:55:16 8012 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:16 8012 [GPS] hasValidLocation RISING EDGE DEBUG | 20:55:16 8012 [GPS] hasValidLocation FALLING EDGE (last read: 1) DEBUG | 20:55:16 8012 [GPS] WANT GPS=0 DEBUG | 20:55:16 8012 [GPS] GPS Lock took 8012, average 19 DEBUG | 20:55:16 8012 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:16 8012 [GPS] onGPSChanged() pos@0, time=1718052916, lat=0, lon=0, alt=0 INFO | 20:55:16 8012 [GPS] updatePosition LOCAL pos@0, time=1718052916, latI=0, lonI=0, alt=0 DEBUG | 20:55:16 8012 [GPS] Setting local position: latitude=0, longitude=0, time=1718052916, timestamp=0 DEBUG | 20:55:16 8012 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:16 8012 [GPS] hasValidLocation RISING EDGE DEBUG | 20:55:16 8012 [GPS] hasValidLocation FALLING EDGE (last read: 1) DEBUG | 20:55:16 8012 [GPS] WANT GPS=0 DEBUG | 20:55:16 8012 [GPS] GPS Lock took 8012, average 19 DEBUG | 20:55:16 8012 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:16 8012 [GPS] onGPSChanged() pos@0, time=1718052916, lat=0, lon=0, alt=0 INFO | 20:55:16 8012 [GPS] updatePosition LOCAL pos@0, time=1718052916, latI=0, lonI=0, alt=0 DEBUG | 20:55:16 8012 [GPS] Setting local position: latitude=0, longitude=0, time=1718052916, timestamp=0 DEBUG | 20:55:16 8013 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:16 8013 [GPS] WANT GPS=0 DEBUG | 20:55:16 8013 [GPS] GPS Lock took 8013, average 19 DEBUG | 20:55:16 8013 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:16 8013 [GPS] onGPSChanged() pos@0, time=1718052916, lat=0, lon=0, alt=0 INFO | 20:55:16 8013 [GPS] updatePosition LOCAL pos@0, time=1718052916, latI=0, lonI=0, alt=0 DEBUG | 20:55:16 8013 [GPS] Setting local position: latitude=0, longitude=0, time=1718052916, timestamp=0 DEBUG | 20:55:16 8013 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:17 8013 [GPS] WANT GPS=0 DEBUG | 20:55:17 8013 [GPS] GPS Lock took 8013, average 20 DEBUG | 20:55:17 8013 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 20:55:17 8013 [GPS] onGPSChanged() pos@0, time=1718052917, lat=0, lon=0, alt=0 INFO | 20:55:17 8013 [GPS] updatePosition LOCAL pos@0, time=1718052917, latI=0, lonI=0, alt=0 DEBUG | 20:55:17 8013 [GPS] Setting local position: latitude=0, longitude=0, time=1718052917, timestamp=0 DEBUG | 20:55:17 8013 [GPS] Node status update: 2 online, 4 total DEBUG | 20:55:17 8013 [GPS] hasValidLocation RISING EDGE DEBUG | 20:55:17 8013 [GPS] hasValidLocation FALLING EDGE (last read: 1)

todd-herbert commented 3 months ago

Looking at it again this morning, I can see that I've now accidentally bypassed this section of code, which definitely needs some attention. I don't think this is related to what you're seeing though?


It's strange.. from that log output it looks like it's still falling through GPS::setAwake(false) somehow. I can see your log line [GPS] WANT GPS=0, which means we are identifying that the power state should change.

With this PR code, once it has hit that point, it seems like it has no choice (if else) but to call GPS::setGPSPower from Line 921 or Line 932

At the top of GPS::setGPSPower, it looks like it should always print the log line GPS::powerState=x.

I can't actually see any path the code can take which matches your log output, but I might be overlooking something. Just to eliminate it as a possibility, is there any chance you pulled master again, rather than code we're working on from this pull request?

I'm not sure why you're seeing lat=0, lon=0, alt=0 though. Could it be that the hardware "isn't happy" about being harassed constantly by the failing code? The actual technical side of GPS is a bit over my head, I'm just the GPIO monkey trying to turn the power on and off..


The significant part of that section which I bypassed yesterday was a piece of code which aims to prevent one abnormally long lock-time from upsetting the average value. I think moving that code up to Line 889 would achieve the same behavior as before https://github.com/meshtastic/firmware/commit/da5bca31edcd091b33d1e6b9b35ea6f807850eb0?

Right now, it keeps track of the "total time searching for lock" and "number of GPS updates", dividing the two. I'm nervous to suggest changing the code, and I'm not sure if it's worth looking at, but I actually wonder if we might get a quicker response by doing the same thing we do with the ADC readings. The original author of that section had called it a "virtual low pass filter".

average = (old + latest) / 2;

If possible, I think it'd be really valuable to hear from @jp-bennett about everything too. I probably should have gone straight to them for advice at the beginning; sorry about that.

GPSFan commented 3 months ago

I cloned your repo and copied your GPS.cpp and GPS.h into my local folder then rebuilt the code and downloaded it to my device. The hardware is happy, I am watching the data out of the GPS with a serial monitor, I'm getting good NMEA data, When I first started it it got a lock and was behaving normally. Here is another log that captures the transition from having position data to not. NMEA data is still good.

More new code wierdness:

INFO | 02:11:59 56 [GPS] updatePosition LOCAL pos@6667b270, time=1718071919, latI=388981961, lonI=-1075740508, alt=1914 DEBUG | 02:11:59 56 [GPS] Setting local position: latitude=388981961, longitude=-1075740508, time=1718071919, timestamp=1718071920 DEBUG | 02:11:59 56 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:00 57 [GPS] WANT GPS=0 DEBUG | 02:12:00 57 [GPS] GPS Lock took 57, average 19 DEBUG | 02:12:00 57 [GPS] publishing pos@6667b271:2, hasVal=1, Sats=12, GPSlock=1 DEBUG | 02:12:00 57 [GPS] New GPS pos@6667b271:3 lat=38.898198, lon=-107.574056, alt=1914, pdop=1.01, track=0.00, speed=0.00, sats=12 DEBUG | 02:12:00 57 [GPS] onGPSChanged() pos@6667b271, time=1718071920, lat=388981983, lon=-1075740555, alt=1914 INFO | 02:12:00 57 [GPS] updatePosition LOCAL pos@6667b271, time=1718071920, latI=388981983, lonI=-1075740555, alt=1914 DEBUG | 02:12:00 57 [GPS] Setting local position: latitude=388981983, longitude=-1075740555, time=1718071920, timestamp=1718071921 DEBUG | 02:12:00 57 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:01 57 [GPS] WANT GPS=0 DEBUG | 02:12:01 57 [GPS] GPS Lock took 57, average 20 DEBUG | 02:12:01 57 [GPS] publishing pos@6667b272:2, hasVal=1, Sats=12, GPSlock=1 DEBUG | 02:12:01 57 [GPS] New GPS pos@6667b272:3 lat=38.898199, lon=-107.574067, alt=1914, pdop=1.01, track=0.00, speed=0.00, sats=12 DEBUG | 02:12:01 57 [GPS] onGPSChanged() pos@6667b272, time=1718071921, lat=388981991, lon=-1075740670, alt=1914 INFO | 02:12:01 57 [GPS] updatePosition LOCAL pos@6667b272, time=1718071921, latI=388981991, lonI=-1075740670, alt=1914 DEBUG | 02:12:01 57 [GPS] Setting local position: latitude=388981991, longitude=-1075740670, time=1718071921, timestamp=1718071922 DEBUG | 02:12:01 57 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:01 58 [GPS] WANT GPS=0 DEBUG | 02:12:01 58 [GPS] GPS Lock took 58, average 19 DEBUG | 02:12:01 58 [GPS] publishing pos@6667b272:2, hasVal=1, Sats=12, GPSlock=1 ERROR | 02:12:01 58 [GPS] BUG: Positional timestamp unchanged from prev solution DEBUG | 02:12:01 58 [GPS] New GPS pos@6667b272:3 lat=38.898199, lon=-107.574067, alt=1913, pdop=0.98, track=0.00, speed=0.00, sats=12 DEBUG | 02:12:01 58 [GPS] onGPSChanged() pos@6667b272, time=1718071921, lat=388981991, lon=-1075740670, alt=1913 INFO | 02:12:01 58 [GPS] updatePosition LOCAL pos@6667b272, time=1718071921, latI=388981991, lonI=-1075740670, alt=1913 DEBUG | 02:12:01 58 [GPS] Setting local position: latitude=388981991, longitude=-1075740670, time=1718071921, timestamp=1718071922 DEBUG | 02:12:01 58 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:02 59 [GPS] WANT GPS=0 DEBUG | 02:12:02 59 [GPS] GPS Lock took 59, average 20 DEBUG | 02:12:02 59 [GPS] publishing pos@6667b273:2, hasVal=1, Sats=12, GPSlock=1 DEBUG | 02:12:02 59 [GPS] New GPS pos@6667b273:3 lat=38.898202, lon=-107.574072, alt=1913, pdop=0.98, track=0.00, speed=0.00, sats=12 DEBUG | 02:12:02 59 [GPS] onGPSChanged() pos@6667b273, time=1718071922, lat=388982020, lon=-1075740718, alt=1913 INFO | 02:12:02 59 [GPS] updatePosition LOCAL pos@6667b273, time=1718071922, latI=388982020, lonI=-1075740718, alt=1913 DEBUG | 02:12:02 59 [GPS] Setting local position: latitude=388982020, longitude=-1075740718, time=1718071922, timestamp=1718071923 DEBUG | 02:12:02 59 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:03 59 [GPS] WANT GPS=0 DEBUG | 02:12:03 59 [GPS] GPS Lock took 59, average 19 DEBUG | 02:12:03 59 [GPS] publishing pos@6667b274:2, hasVal=1, Sats=12, GPSlock=1 DEBUG | 02:12:03 59 [GPS] New GPS pos@6667b274:3 lat=38.898199, lon=-107.574071, alt=1913, pdop=0.98, track=0.00, speed=0.00, sats=12 DEBUG | 02:12:03 59 [GPS] onGPSChanged() pos@6667b274, time=1718071923, lat=388981986, lon=-1075740713, alt=1913 INFO | 02:12:03 59 [GPS] updatePosition LOCAL pos@6667b274, time=1718071923, latI=388981986, lonI=-1075740713, alt=1913 DEBUG | 02:12:03 59 [GPS] Setting local position: latitude=388981986, longitude=-1075740713, time=1718071923, timestamp=1718071924 DEBUG | 02:12:03 59 [GPS] Node status update: 2 online, 4 total


Look here.....

DEBUG | 02:12:03 60 [GPS] hasValidLocation FALLING EDGE (last read: 1) DEBUG | 02:12:03 60 [GPS] WANT GPS=0 DEBUG | 02:12:03 60 [GPS] GPS Lock took 60, average 20 DEBUG | 02:12:03 60 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 02:12:03 60 [GPS] No GPS lock DEBUG | 02:12:03 60 [GPS] onGPSChanged() pos@0, time=1718071923, lat=0, lon=0, alt=0 INFO | 02:12:03 60 [GPS] updatePosition LOCAL pos@0, time=1718071923, latI=0, lonI=0, alt=0 DEBUG | 02:12:03 60 [GPS] Setting local position: latitude=0, longitude=0, time=1718071923, timestamp=0 DEBUG | 02:12:03 60 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:04 60 [GPS] WANT GPS=0 DEBUG | 02:12:04 60 [GPS] GPS Lock took 60, average 20 DEBUG | 02:12:04 60 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 02:12:04 60 [GPS] onGPSChanged() pos@0, time=1718071924, lat=0, lon=0, alt=0 INFO | 02:12:04 60 [GPS] updatePosition LOCAL pos@0, time=1718071924, latI=0, lonI=0, alt=0 DEBUG | 02:12:04 60 [GPS] Setting local position: latitude=0, longitude=0, time=1718071924, timestamp=0 DEBUG | 02:12:04 60 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:04 60 [GPS] WANT GPS=0 DEBUG | 02:12:04 60 [GPS] GPS Lock took 60, average 19 DEBUG | 02:12:04 60 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 02:12:04 60 [GPS] onGPSChanged() pos@0, time=1718071924, lat=0, lon=0, alt=0 INFO | 02:12:04 60 [GPS] updatePosition LOCAL pos@0, time=1718071924, latI=0, lonI=0, alt=0 DEBUG | 02:12:04 60 [GPS] Setting local position: latitude=0, longitude=0, time=1718071924, timestamp=0 DEBUG | 02:12:04 60 [GPS] Node status update: 2 online, 4 total DEBUG | 02:12:04 60 [GPS] hasValidLocation RISING EDGE DEBUG | 02:12:04 60 [GPS] hasValidLocation FALLING EDGE (last read: 1) DEBUG | 02:12:04 60 [GPS] WANT GPS=0 DEBUG | 02:12:04 60 [GPS] GPS Lock took 60, average 20 DEBUG | 02:12:04 60 [GPS] publishing pos@0:2, hasVal=0, Sats=0, GPSlock=1 DEBUG | 02:12:04 60 [GPS] onGPSChanged() pos@0, time=1718071924, lat=0, lon=0, alt=0 INFO | 02:12:04 60 [GPS] updatePosition LOCAL pos@0, time=1718071924, latI=0, lonI=0, alt=0 DEBUG | 02:12:04 60 [GPS] Setting local position: latitude=0, longitude=0, time=1718071924, timestamp=0

GPSFan commented 3 months ago

Any help from any source would be appreciated. With GPS Update >10 things look like they should at 10 and below weird things are happening.

todd-herbert commented 3 months ago

I'm just really confused why there's no GPS::powerState=x log line. We're hitting Line 903: WANT GPS=x and I can't see any code path from there that doesn't print GPS::powerState=x at Line 793.

I'll look at it again with fresh eyes in an hour or two and see if anything jumps out at me. I'd want to try adding extra log at every step in between those two, but I actually can't reproduce the fault here on my T-Beam v1.2 to test:

DEBUG | 02:51:19 393 [Button] WANT GPS=1
DEBUG | 02:51:19 393 [Button] GPS::powerState=1
DEBUG | 02:51:25 399 [GPS] WANT GPS=0
DEBUG | 02:51:25 399 [GPS] GPS Lock took 5, average 0
DEBUG | 02:51:25 399 [GPS] GPS::powerState=3
DEBUG | 02:51:25 399 [GPS] publishing pos@6667bbae:2, hasVal=1, Sats=5, GPSlock=1
DEBUG | 02:51:25 399 [GPS] New GPS pos@6667bbae:3 lat=<redacted>, lon=<redacted>, alt=21, pdop=2.43, track=0.00, speed=0.02, sats=5
DEBUG | 02:51:25 399 [GPS] onGPSChanged() pos@6667bbae, time=1718074285, lat=<redacted>, lon=<redacted>, alt=21
INFO  | 02:51:25 399 [GPS] updatePosition LOCAL pos@6667bbae, time=1718074285, latI=<redacted>, lonI=<redacted>, alt=21
DEBUG | 02:51:25 399 [GPS] Setting local position: latitude=<redacted>, longitude=<redacted>, time=1718074285, timestamp=1718074286
DEBUG | 02:51:25 399 [GPS] Node status update: 1 online, 3 total
DEBUG | 02:51:27 401 [Power] Battery: usbPower=1, isCharging=0, batMv=0, batPct=0
INFO  | 02:51:32 406 [DeviceTelemetryModule] (Sending): air_util_tx=0.051083, channel_utilization=0.000000, battery_level=0, voltage=0.000000, uptime=406
DEBUG | 02:51:32 406 [DeviceTelemetryModule] updateTelemetry LOCAL
DEBUG | 02:51:32 406 [DeviceTelemetryModule] Node status update: 1 online, 3 total
INFO  | 02:51:32 406 [DeviceTelemetryModule] Sending packet to phone
INFO  | 02:51:32 406 Telling client we have new packets 20
INFO  | 02:51:32 406 BLE notify fromNum
INFO  | 02:51:32 406 From Radio onread
INFO  | 02:51:32 406 getFromRadio=STATE_SEND_PACKETS
DEBUG | 02:51:32 406 phone downloaded packet (id=0x7d8eea84 fr=0x44 to=0xff, WantAck=0, HopLim=3 Ch=0x0 Portnum=67 rxtime=1718074292 priority=10)
DEBUG | 02:51:32 406 encoding toPhone packet to phone variant=2, 47 bytes
INFO  | 02:51:32 406 From Radio onread
DEBUG | 02:51:35 410 [GPS] WANT GPS=1
DEBUG | 02:51:35 410 [GPS] GPS::powerState=1
DEBUG | 02:51:36 410 [GPS] WANT GPS=0
DEBUG | 02:51:36 410 [GPS] GPS Lock took 0, average 0
DEBUG | 02:51:36 410 [GPS] GPS::powerState=3
DEBUG | 02:51:36 410 [GPS] publishing pos@6667bbb9:2, hasVal=1, Sats=5, GPSlock=1
DEBUG | 02:51:36 410 [GPS] New GPS pos@6667bbb9:3 lat=<redacted>, lon=<redacted>, alt=36, pdop=2.43, track=0.00, speed=0.00, sats=5
DEBUG | 02:51:36 410 [GPS] onGPSChanged() pos@6667bbb9, time=1718074296, lat=<redacted>, lon=<redacted>, alt=36
INFO  | 02:51:36 410 [GPS] updatePosition LOCAL pos@6667bbb9, time=1718074296, latI=<redacted>, lonI=<redacted>, alt=36
DEBUG | 02:51:36 410 [GPS] Setting local position: latitude=<redacted>, longitude=<redacted>, time=1718074296, timestamp=1718074297
DEBUG | 02:51:36 410 [GPS] Node status update: 1 online, 3 total
INFO  | 02:51:42 416 To Radio onwrite
DEBUG | 02:51:42 416 PACKET FROM PHONE (id=0x89737586 fr=0x00 to=0x44, WantAck=0, HopLim=3 Ch=0x0 Portnum=3 priority=10)
DEBUG | 02:51:42 416 Enqueued local (id=0x89737586 fr=0x00 to=0x44, WantAck=0, HopLim=3 Ch=0x0 Portnum=3 rxtime=1718074302 priority=10)
DEBUG | 02:51:42 416 [Router] Add packet record (id=0x89737586 fr=0x00 to=0x44, WantAck=0, HopLim=3 Ch=0x0 Portnum=3 rxtime=1718074302 priority=10)
DEBUG | 02:51:42 416 [Router] handleReceived(REMOTE) (id=0x89737586 fr=0x00 to=0x44, WantAck=0, HopLim=3 Ch=0x0 Portnum=3 rxtime=1718074302 priority=10)
DEBUG | 02:51:42 416 [Router] Module 'position' wantsPacket=1
INFO  | 02:51:42 416 [Router] Received position from=0x0, id=0x89737586, portnum=3, payloadlen=17
DEBUG | 02:51:42 416 [Router] Incoming update from MYSELF
DEBUG | 02:51:42 416 [Router] Setting local position: latitude=<redacted>, longitude=<redacted>, time=1718074303, timestamp=0
DEBUG | 02:51:42 416 [Router] POSITION node=1c498f44 l=17 latI=<redacted> lonI=<redacted> msl=67 hae=0 geo=0 pdop=0 hdop=0 vdop=0 siv=0 fxq=0 fxt=0 pts=0 time=1718074303
DEBUG | 02:51:42 416 [Router] Current RTC quality: GPS. Ignoring time of RTC quality of NTP
INFO  | 02:51:42 416 [Router] updatePosition REMOTE node=0x1c498f44 time=1718074303, latI=<redacted>, lonI=<redacted>
DEBUG | 02:51:42 416 [Router] Node status update: 1 online, 3 total
DEBUG | 02:51:42 416 [Router] Module 'position' considered
DEBUG | 02:51:42 416 [Router] Module 'routing' wantsPacket=1
INFO  | 02:51:42 417 [Router] Received routing from=0x0, id=0x89737586, portnum=3, payloadlen=17
DEBUG | 02:51:42 417 [Router] Routing sniffing (id=0x89737586 fr=0x00 to=0x44, WantAck=0, HopLim=3 Ch=0x0 Portnum=3 rxtime=1718074302 priority=10)
DEBUG | 02:51:42 417 [Router] Module 'routing' considered
INFO  | 02:51:42 417 Telling client we have new packets 21
INFO  | 02:51:42 417 BLE notify fromNum
INFO  | 02:51:42 417 From Radio onread
INFO  | 02:51:42 417 getFromRadio=STATE_SEND_PACKETS
DEBUG | 02:51:42 417 encoding toPhone packet to phone variant=11, 12 bytes
INFO  | 02:51:43 417 From Radio onread
DEBUG | 02:51:46 421 [GPS] WANT GPS=1
DEBUG | 02:51:46 421 [GPS] GPS::powerState=1
DEBUG | 02:51:47 421 [Power] Battery: usbPower=1, isCharging=0, batMv=0, batPct=0
DEBUG | 02:51:47 421 [GPS] WANT GPS=0
DEBUG | 02:51:47 421 [GPS] GPS Lock took 0, average 0
DEBUG | 02:51:47 421 [GPS] GPS::powerState=3
DEBUG | 02:51:47 421 [GPS] publishing pos@6667bbc4:2, hasVal=1, Sats=6, GPSlock=1
DEBUG | 02:51:47 421 [GPS] New GPS pos@6667bbc4:3 lat=<redacted>, lon=<redacted>, alt=43, pdop=2.32, track=0.00, speed=0.00, sats=6
DEBUG | 02:51:47 421 [GPS] onGPSChanged() pos@6667bbc4, time=1718074307, lat=<redacted>, lon=<redacted>, alt=43
INFO  | 02:51:47 421 [GPS] updatePosition LOCAL pos@6667bbc4, time=1718074307, latI=<redacted>, lonI=<redacted>, alt=43
DEBUG | 02:51:47 421 [GPS] Setting local position: latitude=<redacted>, longitude=<redacted>, time=1718074307, timestamp=1718074308
DEBUG | 02:51:47 421 [GPS] Node status update: 1 online, 3 total

Do you think you can try adding some extra logging on your machine through GPS::setAwake and the top of GPS::setGPSPower? Something must be slipping past me here and I just can't see it!

GPSFan commented 3 months ago

Those line numbers don't correspond to what I have, let me view your repo and compare it to what I've got.

GPSFan commented 3 months ago

That's it When I cloned your repo I didn't check out the frequent updates branch, I had looked at GPS.cpp and saw they were different so used your GPS.cpp and GPS.h. Let me fix that and re check. Apologies, stupid mistake on my part, shows you what a DayJob can do to your brain. Serial log looks like normal with 10 sec GPS Update Interval.

todd-herbert commented 3 months ago

Hey it's good we had to slow down and take a close look at it though or I would have missed that I had accidentally bypassed the code for "recovering from a long lock-time". I'll have a go at that later on and hopefully we can finish this off.

todd-herbert commented 3 months ago

Code which aimed to reduce the impact of outlier lock-times was probably affected by https://github.com/meshtastic/firmware/commit/da5bca31edcd091b33d1e6b9b35ea6f807850eb0. It think it may have relied on the fact that there was no distinction between what is now GPSPowerState::OFF and GPSPowerState::STANDBY.

The previous system kept track of the "average lock time", and manipulated this manually if it became exceedingly long. Rather than re-implementing this, I found it conceptually easier to move to an exponential smoothing system. This is the same mechanism we are using to smooth ADC readings.

The "predicted lock-time" is affected gradually by new lock-time values. A 20% weighting seems to give a reasonable balance between stability and response, but this could certainly be adjusted.

prediction = (newLockTime * 0.2) + (prediction * 0.8);

This does go beyond the PR's original scope of"fixing a bug". Ideally I would have avoided changing behavior, but this did feel like the easiest way to build back in the feature of handling these outliers.

Please let me know if you have concerns, and we can rework this.

As some assurance, I've compared the difference between the proposed "exponential smoothing" system, and the old "averageLockTime" system. Test conditions: T-Beam V1.2, indoors, 120s update interval, 80 minutes runtime.

chart: comparing lock-time prediction methods

chart: comparing lock-time prediction methods; difference from target position.gps_update_interval


If possible, I think it'd be really valuable to hear from jp-bennett about everything too. I probably should have gone straight to them for advice at the beginning; sorry about that.

Reading this comment again, I'm worried that it could have come across wrong. Just want to be clear that I didn't want to minimize anyone's input; I had just noticed in the git history that JP Bennet had worked on this area of the code a few months ago.

jp-bennett commented 3 months ago

@todd-herbert keeping an eye on this. Need to give it a test still. The initial rolling average method was crude, but generally worked. Glad to see something a bit more sophisticated. I'd like to take some time to dive into this, but don't wait on me to pull it.

GPSFan commented 3 months ago

Making sure I got the proper code this time ;>)) I'm seeing expected behavior with GPS Update Interval at 10 sec. Maybe I should explain my use case a bit. I need the GPS to NEVER sleep. I need it to be in continuous tracking all the time. I have no power consumption requirements as the node is vehicle powered and the Meshtastic mesh I am implementing is limited to 4-8 nodes max. Accurate positions are favored over many things that other use cases regard as important. I am sure that my use case can co-exist with others.

One thing I noticed is this code does not send a PMREQ with duration 0 to the GPS when Deep Sleep is initiated. I have been tracking down why the GPS kept running after this (in master) for a while and have found that the ESP32 goes into Deep Sleep before the command is finished being sent to the GPS. In addition in Deep Sleep the TX signal to the GPS goes to a low state which wakes the GPS up (if it were asleep) and caused massive frameing errors. Haven't done any testing with NRF52 as yet.

thebentern commented 3 months ago

Making sure I got the proper code this time ;>)) I'm seeing expected behavior with GPS Update Interval at 10 sec. Maybe I should explain my use case a bit. I need the GPS to NEVER sleep. I need it to be in continuous tracking all the time. I have no power consumption requirements as the node is vehicle powered and the Meshtastic mesh I am implementing is limited to 4-8 nodes max. Accurate positions are favored over many things that other use cases regard as important. I am sure that my use case can co-exist with others.

Perhaps any interval under a 15 second threshold should just belay any want-sleep actions?

todd-herbert commented 3 months ago

Making sure I got the proper code this time ;>)) I'm seeing expected behavior with GPS Update Interval at 10 sec. Maybe I should explain my use case a bit. I need the GPS to NEVER sleep. I need it to be in continuous tracking all the time. I have no power consumption requirements as the node is vehicle powered and the Meshtastic mesh I am implementing is limited to 4-8 nodes max. Accurate positions are favored over many things that other use cases regard as important. I am sure that my use case can co-exist with others.

Perhaps any interval under a 15 second threshold should just belay any want-sleep actions?

I believe this is already the case whenever the GPS would sleep less than 10 seconds. Just from memory, I think that behavior has been there since the GPS power saving stuff went initially (9 months ago?). I've exposed that ten second threshold as a macro at the top of GPS.cpp if you want to try adjusting it. I can adjust it right now in the PR too if you'd like a change.

GPSFan commented 3 months ago

That is the current behavior with 10 sec or less.

GPSFan commented 3 months ago

With many DIY devices, the GPS is powered without a dedicated GPS power switch, sending the PMREQ would achieve a Deep Sleep like power reduction for those u-blox GPS devices. I believe that was the original intent, but with the Tx staying low and the command not completing, that doesn't work.

todd-herbert commented 3 months ago

With many DIY devices, the GPS is powered without a dedicated GPS power switch, sending the PMREQ would achieve a Deep Sleep like power reduction for those u-blox GPS devices. I believe that was the original intent, but with the Tx staying low and the command not completing, that doesn't work.

I'll admit, I haven't dived super deep into all that, and I'm not sure I'm set up to test it here, but it's definitely something we need to sort out. I'll go take a look and see if I can spot anything obvious going wrong.

GPSFan commented 3 months ago

Does your code intend to send a PMREQ on Deep Sleep? If so, IMHO that is the first problem to solve. I watched the serial Rx line of the GPS with a USB-Serial module with Minicom set to display Hex and wrap long lines, so I could see what was sent to the GPS.The ESP32 line going low can be fixed using gpio_hold_xxx commands, but this may expose a bigger issue of many different gpios are used for GPS Rx & Tx in various devices, How deal with that as well as how other peripherals might deal with floating outputs in Deep Sleep.

todd-herbert commented 3 months ago

One thing I noticed is this code does not send a PMREQ with duration 0 to the GPS when Deep Sleep is initiated. I have been tracking down why the GPS kept running after this (in master) for a while and have found that the ESP32 goes into Deep Sleep before the command is finished being sent to the GPS. In addition in Deep Sleep the TX signal to the GPS goes to a low state which wakes the GPS up (if it were asleep) and caused massive frameing errors. Haven't done any testing with NRF52 as yet.

Actually, is this something that got broken just now, or something that's been around for a while? If it's a separate issue, I might be keen to wrap this bug-fix up and tackle the PMREQ TX pin thing separately in the next few days? It'd definitely be top of my pile to work on, while all this GPS stuff is still fresh in my memory. Bonus: if something goes horribly wrong, it'll be easier to spot which PR did it..

GPSFan commented 3 months ago

Your previous version of the GPS.cpp did try and send a PMREQ and that code is still there, but I didn't see a PMREQ being sent when I put my device into Deep Sleep with the latest version. It could be that the ESP32 went to sleep sooner and the command never even started to be sent. The ESP32 going to sleep before the command finishes is another problem and could be addressed in a different "issue".

todd-herbert commented 3 months ago

I'm going to call it a night for now, but I've taken a shot-in-the-dark at getting that duration 0 PMREQ to send. I don't know if there will still be an issue with the MCU sleeping too soon, and nothing yet about holding the GPS_TX pin high, but do you think you could check if this one change is enough to get the PMREQ to send?

GPSFan commented 3 months ago

I'll try to check it out before DayJob calls...

todd-herbert commented 3 months ago

No rush, I'll be back tomorrow 👋

GPSFan commented 3 months ago

Nope, no PMREQ at DeepSleep. ;>((

GPSFan commented 3 months ago

The Log entry "Sleep Time 0" is missing, Line 858.

todd-herbert commented 3 months ago

I think I've pulled out a bit of code that really needs to go back too. I'd better stop making ill-advised guesses now and wait until I can put everything back where it's meant to be, and look at it again properly. Sorry about all that.

GPSFan commented 3 months ago

Hey, this isn't quite a "Ready, Fire, Aim" situation, so don't sweat it...

todd-herbert commented 3 months ago

I'm just looking taking a closer look at the GPS power stuff. It feels like it's been tacked on-to (exactly like we're doing here) for the past four years. I'm struggling to come up with an elegant code-path through the layers of previous fixes and features. It's also leading me back into Sleep.cpp, and there's a little bit of the same thing happening there with the Observables.

Rather than bore a new hole through the code to get the GPS hardware deep-sleeping, I'm more and more tempted to suggest leaving this PR where it is, as a temporary bug fix, and planning out a major refactoring of the GPS power stuff, if that's something @jp-bennett, @thebentern, and @GPSFan would support.

I'm not at all familiar with the low-level magic that goes on to support all the different GPS hardware, so it's something I'd need a lot of help / input with, unless anyone else is up to the task and wants to take charge. I'm more than happy to contribute to this, within my ability.

Not a design pattern I've ever used before, but I could imagine something very similar to the PowerFSM working well here. It could greatly simplify the logic when moving all this hardware between these different states.

Dare I say it, I think Sleep.cpp might need another tidy up too to support this. Some of the Observables are being notified twice, never notified at all, or followed up with duplicate code inside of Sleep.cpp itself. I'd be quite nervous to start tidying there though, given how badly it went last time I tried. Part of supporting this GPS power-saving would be manipulating GPIOs for deep-sleep, tidying or no tidying, which means it'd want a whole lot of very careful testing / oversight either way.

jp-bennett commented 3 months ago

A temporary fix now and bigger refactor later is the way to go. But PowerFSM is actually terrible and is one of the things that kills our ram allocation. It uses a bunch of realloc()s internally, that just wastes memory.

GPSFan commented 3 months ago

I agree, a temp fix now and then a deep look at the whole GPS power management/sleep/wake structure. Different manufacturers manage power in different ways. with some, the only option is to cut power when ultra low power is required. Some use cases should never pair a low power processor with a power hungry GPS, (nrf52 and u-blox Neo-6) when there are options, but neither diy folks or manufacturers can be relied on to make the best choices. That includes me.

todd-herbert commented 3 months ago

A temporary fix now and bigger refactor later is the way to go. But PowerFSM is actually terrible and is one of the things that kills our ram allocation. It uses a bunch of realloc()s internally, that just wastes memory.

That's actually really good insight, and something I'd never realised. Thanks!

I'll keep playing with it as is, and see if there's a middle ground to at least get the that GPS deep sleep working.

Sorry, I didn't pay pay quite enough attention to what your messages were saying. Let's wrap this fix up and then look at the power-saving stuff soon.