InfiniTimeOrg / InfiniTime

Firmware for Pinetime smartwatch written in C++ and based on FreeRTOS
GNU General Public License v3.0
2.64k stars 904 forks source link

Improve weather service #1822

Closed faxe1008 closed 6 months ago

faxe1008 commented 11 months ago

I wanted to investigate this issue: https://github.com/InfiniTimeOrg/InfiniTime/issues/1788

I found some things that could be refactored, and also some issues with the current implementation:

Further investigation needs to be done here:

github-actions[bot] commented 11 months ago
Build size and comparison to main: Section Size Difference
text 412412B 3112B
data 996B 0B
bss 63372B 0B
Avamander commented 11 months ago

Already expired events are added to the timeline and deleted right afterwards. This causes unnecessary alloc/free churn and I suspect more heap fragmentation.

Great point.

Is the CBOR code (especially the one using UsefullBufC) leaking memory? (Couldnt figure out that quickly what the library expects in terms of cleanup)

It really shouldn't, I did try and make sure everything is checked, closed and cleaned up (according to the documentation). Adding timeline support to InfiniSim using the same library might shed some light though.

It's way more likely there's simply not enough truly free memory to actually accommodate the currently allowed timeline size.

Maybe using a fixed size array and no heap allocation? This will increase the flash size, but there is less potential for runtime alloc errors?

I did initially consider an array per type of event (because different types of events have different-sized data structures), but that's IMO way more wasteful.

faxe1008 commented 11 months ago

It's way more likely there's simply not enough truly free memory to actually accommodate the currently allowed timeline size.

Yeah, but I think part of the issue is also that currently there are instances where we essentially allocate twice the memory we need, e.g. vector is at capacity and a new item is pushed. imo it would be more transparent to have the containers be statically allocated and be smarter about when to add events. Sure this will increase flash size, but when the alternative is to have alloc fails when loading font or something similar I'd rather not have the most up to date weather :^).

I did initially consider an array per type of event (because different types of events have different-sized data structures), but that's IMO way more wasteful.

Maybe I'll give that idea a spin. Again this will use more flash, but it will be compile time visible how much the weather service actually needs.

Avamander commented 11 months ago

Sure this will increase flash size, but when the alternative is to have alloc fails when loading font or something similar

Not just flash though, it will consume that amount of RAM (permanently). Which is already spread really quite thin.

I'd rather not have the most up to date weather :^).

The intent is to extend the timeline to contain events of other types as well. Such as calendar events or reminders, but we'll have to carefully consider if we can afford that.

faxe1008 commented 11 months ago

The intent is to extend the timeline to contain events of other types as well. Such as calendar events or reminders, but we'll have to carefully consider if we can afford that.

Would love that one. However I think it might be smart to have a fixed max size for the vector, thats tried and tested to leave some breathing room for other allocations :^). Also maybe there is a way to tune the std::vector behaviour in regards to load and growth factor to be more conservative somehow? My knowledge of the STL is a bit rusty.

faxe1008 commented 10 months ago

@Avamander So what I found is that at most the TinyWeatherForeCast app pushed ~98 events to the timeline, increasing the vector capacity to 128. Also what I saw is that if you hit the refresh button in the app the events are pushed to the watch again even if the values did not change.

I think to validate strategies the best course of action would be to gather "real life" data at what point in time the app pushes which data and do some simulations on the vector behaviour. I checked the app and it offers protocolling functionality but it does not log the expiration and the fields "verbatim" as the bluetooth communication uses it. There is some work to be done to change the logging a tad bit.

Things that I think of adding:

What do you think?

minacode commented 10 months ago

I have a (semi-informed) question: Could you put all the stuff from the timeline into files, store only the most necessary metadata, eg. timestamp and file id and then read everything else dynamically from external memory when needed?

faxe1008 commented 10 months ago

@minacode any input is welcome :^). I think this is maybe possible, to store the data on the external flash. My thoughts on this:

Eventually if we ever retrieve calender events storing the to external flash is promising so they would persist across reboots. For the weather I am unsure if that makes sense.

@Avamander @JF002 @minacode I think that for the weather data we need some clear target to go for. Should we only care about the most current not expired event? Any thoughts on this?

JF002 commented 10 months ago

Thanks everyone for this analysis so far! I'm now joining the party!

I've just had a look at WeatherService and WeatherData while keeping in mind the memory issues that were reported regarding the weather feature. I also found a few points in the implementation that could explain those issues:

<info> app: Size TimelineHeader, 16
<info> app: Size Location, 56
<info> app: Size Clouds, 24
<info> app: Size Obscuration, 24
<info> app: Size Precipitation, 24
<info> app: Size Wind, 24
<info> app: Size Temperature, 24
<info> app: Size Humidity, 24
<info> app: Size Pressure, 24
<info> app: Size AirQuality, 48

All in all, we can see that WeatherService is quite hungry in RAM space, and that it'll use as much memory as needed, potentially at the expense of other functionalities (like fonts and watchfaces).

So, how can we fix/improve this?

I know that this timeline implementation was designed to be as generic as possible. We are currently using it to store weather data, but the bigger plan was to use it for other functionalities like navigation directions, agenda events,... which is a good idea!

However, now that we face those memory issues, the current implementation does not seem optimal due to its memory usage and the limited amount of RAM memory available on the PineTime.

I guess we can explore a few options

As an example, I designed the notification functionality with strong limitations : max 5 notifications of max 100 characters. This implementation is not the most effective one since we cannot store 10 notifications of 15 characters even if there's theoretically enough room for them, but it's at least simple and deterministic : it uses 500B of RAM memory. And since this memory is statically allocated, it'll never interfere with another feature.

I'm not saying we should implement the weather feature this way, but I think that compromises and limitations are probably unavoidable here.

Let's have a look at the current usage of the weather data in InfiniTime : we only want to display the current temperature and precipitation. The timeline allows the companion app to send multiple events (of the same time) with different expiry time, but we currently use only the 1st valid event available in the timeline.

I know that we could use more functionalities and data in the future, but maybe we should think about which data we'll realistically use. Do we really think we'll display all these data? Or, put another way, maybe we should decide which data we really want to handle, and remove the other ones.

I'm also wondering why would the weather companion apps send 98 events for a single update? Do all these events contains actual data? Even if I think InfiniTime should be able to "defend" itself against a companion app sending too much data, there are maybe a few improvements to be done on the companion app side?

faxe1008 commented 10 months ago

I'm also wondering why would the weather companion apps send 98 events for a single update?

Sorry if I did not phrase it correctly, but what I meant is that at one point in time there were 98 elements in the vector. A single update from the TWFG app yields ~6-9 events.

I know that we could use more functionalities and data in the future, but maybe we should think about which data we'll realistically use. Do we really think we'll display all these data? Or, put another way, maybe we should decide which data we really want to handle, and remove the other ones.

+1 for me on this one. For now I think keeping the value closest to the current time would be sufficient, but I do not know if we reject the others if the apps pushes previous values again?

Avamander commented 10 months ago

So what I found is that at most the TinyWeatherForeCast app pushed ~98 events to the timeline, increasing the vector capacity to 128. Also what I saw is that if you hit the refresh button in the app the events are pushed to the watch again even if the values did not change.

That seems to be indeed quite excessive and ideally we'd handle this misbehaviour.

Deduplicate the events, meaning that the same event can not be added twice

That would be a nice addition.

Eventually if we ever retrieve calender events storing the to external flash is promising so they would persist across reboots. For the weather I am unsure if that makes sense.

Especially with longer string content (descriptions, full titles) this seems like a logical approach.

Though it wouldn't hurt too much to periodically flush the entire timeline to storage, deallocate the entire list and then reallocate. This could reduce fragmentation, in theory.

Avoid padding using #pragma pack

This seems like a fairly simple and safe change, though it might need some testing.

it uses 500B of RAM memory. And since this memory is statically allocated, it'll never interfere with another feature.

That's a large allocation for a feature that might not be used. It would also mean a custom allocator (considering the different-sized elements), which is a significant amount of complexity.

there are maybe a few improvements to be done on the companion app side?

Certainly. Though it seems quite weird there are that many events at the same time. Has anyone gotten a dump of all the events?

I know that we could use more functionalities and data in the future, but maybe we should think about which data we'll realistically use. Do we really think we'll display all these data? Or, put another way, maybe we should decide which data we really want to handle, and remove the other ones.

We could possibly discard the ones we currently certainly don't use, but the compatibility exists for the explicit reason to make it possible to write things that do use this data. Plus this wide compatibility has now revealed certain assumptions, such as well-behaving companions, to be false. I wouldn't make large alternations to our current approach, rather fortify it against abuse (upper sanity limits), improve storage (deduplication, packing) and data usage (which events do we use, clearer instructions about which data should be pushed and how).

faxe1008 commented 10 months ago

Since discussing abstract ideas is quite difficult I opted to go for a more methodical approach. I want to actually see what is going on in regards to memory consumption etc. without having to build the watch firmware, reflash it and wait.

I modified the Gadgetbridge app to log the data send to the watch as a json. In parallel I started work on a benchmarking simulator for the behaviour where I can plot different implementations against each other: https://github.com/faxe1008/InifiniteWeather

The idea is to take the heap allocator from infinitime and make it available as a C++ allocator to be used with STL-containers and smart-pointers. Overriding the malloc/free symbols globally I avoided on purpose since this might capture unwanted test infrastructure noise. Anyways the plan is to feed the captured data into the simulators which will yield effectively this:

Strategy: SimpleVectorTimeline
Allocations:   1
Deallocations: 0
OOM:           0
Lowest Ever:   40920
Fragmentation: 0
Leaking:       false

This will give an actual point of comparison for different implementations, which should be close to (if not identical) to the watches behaviour.

JF002 commented 9 months ago

Thanks for continuing the analysis! Your new testing method will probably provide valuable data!

I do plan on having a closer look on this issue soon so we can try to figure out the best solution for this issue!

JF002 commented 9 months ago

I added a few debug logs to see what exactly happens when weather data is received by WeatherService. For each event, I log the following info:

<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Location
<info> app: [Weather Service] Memory : free : 13376 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 64B -> 536896376
<info> app: [FreeRTOS HEAP] ALLOC 16B -> 536896440
<info> app: [Weather Service] AddEventToTimeline - Current size: 1 - Max size: 536870911

Here is the complete patch with those logs : Debug_weather_service.patch.txt

I ran the following tests using Gadgetbridge 0.75.1 and Tiny Weather Forecast Germany 0.61.0.

Shortly after connecting the watch, 7 events are received and 296B were allocated :

<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Location
<info> app: [Weather Service] Memory : free : 13376 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 64B -> 536896376
<info> app: [FreeRTOS HEAP] ALLOC 16B -> 536896440
<info> app: [Weather Service] AddEventToTimeline - Current size: 1 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Humidity
<info> app: [Weather Service] Memory : free : 13296 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896456
<info> app: [FreeRTOS HEAP] ALLOC 16B -> 536896488
<info> app: [FreeRTOS HEAP] Free 536896440
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Temperature
<info> app: [Weather Service] Memory : free : 13264 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896504
<info> app: [FreeRTOS HEAP] ALLOC 24B -> 536896536
<info> app: [FreeRTOS HEAP] Free 536896488
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 24
        Type : Temperature
<info> app: [Weather Service] Memory : free : 13224 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896560
<info> app: [Weather Service] AddEventToTimeline - Current size: 4 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Wind
<info> app: [Weather Service] Memory : free : 13192 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896592
<info> app: [FreeRTOS HEAP] ALLOC 40B -> 536896664
<info> app: [FreeRTOS HEAP] Free 536896536
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Precipitation
<info> app: [Weather Service] Memory : free : 13136 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896704
<info> app: [Weather Service] AddEventToTimeline - Current size: 6 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357103
        Expiration(h) : 6
        Type : Clouds
<info> app: [Weather Service] Memory : free : 13104 - min free : 13080, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896736
<info> app: [Weather Service] AddEventToTimeline - Current size: 7 - Max size: 536870911

Then I hit the refresh button in TWFG: 13 events & 552B

<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Location
<info> app: [Weather Service] Memory : free : 13072 - min free : 13072, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 64B -> 536896768
<info> app: [Weather Service] AddEventToTimeline - Current size: 8 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Humidity
<info> app: [Weather Service] Memory : free : 13008 - min free : 13008, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896832
<info> app: [FreeRTOS HEAP] ALLOC 72B -> 536896864
<info> app: [FreeRTOS HEAP] Free 536896664
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Temperature
<info> app: [Weather Service] Memory : free : 12944 - min free : 12904, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896664
<info> app: [Weather Service] AddEventToTimeline - Current size: 10 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 24
        Type : Temperature
<info> app: [Weather Service] Memory : free : 12904 - min free : 12904, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896936
<info> app: [Weather Service] AddEventToTimeline - Current size: 11 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Wind
<info> app: [Weather Service] Memory : free : 12872 - min free : 12872, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896968
<info> app: [Weather Service] AddEventToTimeline - Current size: 12 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Precipitation
<info> app: [Weather Service] Memory : free : 12840 - min free : 12840, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897000
<info> app: [Weather Service] AddEventToTimeline - Current size: 13 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357117
        Expiration(h) : 6
        Type : Clouds
<info> app: [Weather Service] Memory : free : 12808 - min free : 12808, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897032
<info> app: [Weather Service] AddEventToTimeline - Current size: 14 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Location
<info> app: [Weather Service] Memory : free : 12776 - min free : 12776, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 64B -> 536897064
<info> app: [Weather Service] AddEventToTimeline - Current size: 15 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Humidity
<info> app: [Weather Service] Memory : free : 12712 - min free : 12712, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897128
<info> app: [Weather Service] AddEventToTimeline - Current size: 16 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Temperature
<info> app: [Weather Service] Memory : free : 12680 - min free : 12680, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897160
<info> app: [FreeRTOS HEAP] ALLOC 136B -> 536897192
<info> app: [FreeRTOS HEAP] Free 536896864
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 24
        Type : Temperature
<info> app: [Weather Service] Memory : free : 12584 - min free : 12512, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896864
<info> app: [Weather Service] AddEventToTimeline - Current size: 18 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Wind
<info> app: [Weather Service] Memory : free : 12552 - min free : 12512, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536896896
<info> app: [Weather Service] AddEventToTimeline - Current size: 19 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Precipitation
<info> app: [Weather Service] Memory : free : 12512 - min free : 12512, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897328
<info> app: [Weather Service] AddEventToTimeline - Current size: 20 - Max size: 536870911
<info> app: -------------------------------
<info> app: [Weather Service] Received weather command : 
        Timestamp : 1694357118
        Expiration(h) : 6
        Type : Clouds
<info> app: [Weather Service] Memory : free : 12480 - min free : 12480, malloc errors : 0 - stack overflow errors : 0
<info> app: [FreeRTOS HEAP] ALLOC 32B -> 536897360
<info> app: [Weather Service] AddEventToTimeline - Current size: 21 - Max size: 536870911

With those test, I can confirm that

But the following points struck me first:

With those observations in mind, I think we do a few easy things to reduce and keep under control the memory usage of the weather service:

This should fix the issue on the short term : the memory usage will be reduced by a lot and should not grow uncontrollably. And this will give us more time to improve the current implementation to reduce the number of allocations, the storage overhead, and ensure that the memory usage stays under control when we add new event types.

What do you think of those suggestions?

JF002 commented 6 months ago

Thanks for your work, @faxe1008. However, I decided to rework the weather service in #1924 so we can fix those issues more easily. So I'm closing this one in favor of #1924.