[v32.0-alpha] answering quests sometimes really slow

Helium314 commented 3 years ago

After downloading a large area containing many quests, answering some quests takes very long (in my case this was done automatically after first app start, but also happens when zooming out and starting a manual scan). Especially for building or crossing quests, on my S4 Mini it takes some 30 seconds to for the pin to be removed. In this time I can't open any other quests, SC usage is basically blocked (panning/zooming the map is still possible without noticable slowdowns). Other quests like opening hours do not have this problem. Interestingly, answering quests near the southern edge of the scanned area is much faster than in other regions.

When "leaving" this area, i.e. scrolling somewhere far away, manually scanning and answering quests is fast (less than a second). The southern edge of the new area might be a little bit faster, not sure.

Sometimes after doing stuff "somewhere else", anwering quests inside the initial scan area is faster (maybe 10 seconds instead of 30 for the same element), but still slower than outside. I haven't yet found a way of reproducing this.

westnordost commented 3 years ago

Anyway, I guess there are two possibilities how to solve this:

Imitate OSM Api

When fetching all elements within a bbox, it actually means to 1. fetch all nodes within that bbox, 2. fetch all ways that have one of these nodes and 3. fetch all relations that have either any of these nodes or ways as a member.

This solves the problem that extremely spacious relations that thus have a very large bbox (and it looks like there are many spacious relations) are only included if there is actually a member of it in the bbox itself.

Also, as it mimics how the OSM Api download map bbox call works, it is consistent.

Downside is, that the case described in https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-827639718 becomes more likely.

All relations except a few don't get a bbox

So if when persisting relations, only a few relation types, like multipolygons, will also persist their bbox, this results in the fetch-elements-in-bbox call to never return these relations. Those relations still remain in the DB and they can be fetched for things like splitting the way (and the necessary update of certain relations). Just for re-checking quest eligibilities, they are not available.

The advantage is that the behavior we have now can mostly stay like this. The downside is, that quests that look for (non-multipolygon) relations will not work anymore. Or more precisely, will still work on downloading, but when rechecking quests, the quests will vanish. One quest that comes to mind is the AddSummitRegister quest.

Anyway, which quests are affected by this anyway?: AddSummitRegister, AddCycleway and AddSidewalk.

I should note that the general problem as shown here...

...already exists anyway. Only just that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently.

smichel17 commented 3 years ago

that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently

Maybe this suggests a 3rd variant: the osm api approach, but using a larger bbox as a buffer, where data is checked/downloaded but not itself used to generate/show quests.

For example, if we had a "cycleway width" quest, in this diagram it would not be shown because the cycleway nodes are outside the quest scan bbox, but those nodes would still be used to exclude the road from the cycleway quest, because they're inside the buffer.

westnordost commented 3 years ago

How big should the buffer be?

Am 30. April 2021 02:22:48 MESZ schrieb smichel17 @.***>:

that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently

Maybe this suggests a 3rd variant: the osm api approach, but using a larger bbox as a buffer, where data is checked/downloaded but not itself used to generate/show quests.

For example, if we had a "cycleway width" quest, in this diagram it would not be shown because the cycleway nodes are outside the quest scan bbox, but those nodes would still be used to exclude the road from the cycleway quest, because they're inside the buffer.

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-829713533

-- Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.

matkoniecz commented 3 years ago

What about option of doing what OSM API does

fetch all nodes within that bbox, 2. fetch all ways that have one of these nodes and 3. fetch all relations that have either any of these nodes or ways as a member.

and add an extra step "fetch full geometries of all ways (including intersecting ones, without node inside!)", without fetching full geometry of all relations?

Anyway, which quests are affected by this anyway?: AddSummitRegister, AddCycleway and AddSidewalk.

And AddStreetName that excludes elements within associated street relation.

And AddFerryAccessPedestrian, AddFerryAccessMotorVehicle ( https://github.com/streetcomplete/StreetComplete/tree/master/app/src/main/java/de/westnordost/streetcomplete/quests/ferry ) - but ferry routes usually have geometries that are not insane

AddHousenumber - https://github.com/streetcomplete/StreetComplete/blob/master/app/src/main/java/de/westnordost/streetcomplete/quests/housenumber/AddHousenumber.kt#L157 (what may be searched for here?)

The advantage is that the behavior we have now can mostly stay like this. The downside is, that quests that look for (non-multipolygon) relations will not work anymore. Or more precisely, will still work on downloading, but when rechecking quests, the quests will vanish. One quest that comes to mind is the AddSummitRegister quest.

Would it be possible to assume that their state is unchanged and protect them from vanishing? For example AddSummitRegister once generated is not going to be invalidated by edits within SC. The same would be true for AddCycleway, AddSidewalk, AddFerryAccessMotorVehicleAddFerryAccessPedestrian

AddStreetName is sad case, as it should be created by AddHousenumber - but associated street relations should be safe to fetch. Though there is a risk of megastreet with 20 000 houses on it... But on the other hand, for associated street fetching members just within bounding box is sufficient!

Helium314 commented 3 years ago

Would it be feasible to only put elements in the DB that are actually used in any quest? E.g. street lamps and boundaries are not used (I think)

Haha, well yes, that is feasible: Just use v31 and earlier. The whole point of the architecture change made for v32 was that the OSM data is kept independent of quests and quests are just generated from these.

I was still curious and tried excluding relations containing boundary= or route= from being put in the DB. Now the DB is half size after initial download (11.61 MB instead of 22.47), and when answering building type quest for "that house" 142 elements are fetched in 305 ms

Increasing DB size to 47 MB by downloading more areas slows things down a bit, but fetching still takes less than a second.

westnordost commented 3 years ago

Anyway, which quests are affected by this anyway?: AddSummitRegister, AddCycleway and AddSidewalk.

And [...]

I meant affected by the issue that sometimes, ways that are near other ways are not found due to not having a vertex in the same bbox.

westnordost commented 3 years ago

Hm, 11 MB just for those relations? There is not that much data on relations. One relation member in the table is

8 byte for id of relation
8 byte for id of ref
1 byte for index
X bytes for role string (often "" or "inner" or something)
3-8 bytes for name of ref type (NODE, WAY or RELATION)

So a relation with 1000 members will have about 32kb.

westnordost commented 3 years ago

@Helium314 I wonder, how does it affect the size in the db if the element type is not persisted as a string but as a number. I.e. in RelationDao,

line 51:

member.type.ordinal

line 80:

ElementType.values()[cursor.getInt(TYPE)]

line 123:

args = arrayOf(elementType.ordinal)) { it.getLong(ID) }.toSet()

and then in RelationTables, change line 35:

            ${Columns.TYPE} int NOT NULL,

Helium314 commented 3 years ago

normal DB: 291084 relation_members 420 relations

without routes and boundaries: 1953 relation_members 207 relations

So a relation with 1000 members will have about 32kb.

then 290000 members are about 9.3 MB, not so far from 11 MB

Helium314 commented 3 years ago

@Helium314 I wonder, how does it affect the size in the db if the element type is not persisted as a string but as a number. I.e. in RelationDao,

Not much effect, still 21.61 MB. Answering "that quest" was somewhat faster: 40 sec instead of the usual 43

westnordost commented 3 years ago

You mean whether saving this as a string or an int has not much of a space impact but about 10% speed impact? That's sounds a bit too much. enum.valueOf can't be that slow.

matkoniecz commented 3 years ago

Maybe just measurement noise? Is this difference persisting if you say restart phone and retry?

Helium314 commented 3 years ago

It might be noise, but in the recent attempts I was always very close to 43 seconds, that's why I mentioned it.

Additionally after having read a bit about CursorWindow, I would expect this to be faster: In my understanding CursorWindow: Window is full means that the results of the query don't fit into the 2 MB CursorWindow (huge relations!), so another query has to be done (and for some reason this is horribly slow). With a smaller DB, more rows fit into the CursorWindow. I got only 8 CursorWindow: Window is full message instead of the usual 9, which also points towards a small DB being a (small) improvement

Helium314 commented 3 years ago

retry after reboot: 39.6 sec

westnordost commented 3 years ago

Okay, let's not pursue that further. IMO it does not make a big difference whether to store it as a byte or as a string like "WAY"

westnordost commented 3 years ago

on the fetch-by-bbox branch I changed the method how the bbox is fetched (= mimic OSM API).

Do you want to try it out?

There is still something to do, but I first wanted to check if it makes any difference.

What's left to do if this works out:

could fetch the geometry of fetched elements on demand only. Leads to ~half the number of DB fetches
bbox needs to have a (larger) padding as suggested
maybe don't use ElementKey where it is not necessary

Also, the performance problem currently regarding the huge relations should also lead to very very slow download too, because also all geometry in the bbox a new download happens is fetched from DB (so also those huge relations). This change also addresses that.

Helium314 commented 3 years ago

Using fetch-by-bbox fetches 17 elements in 150ms for the same house, so definitely a difference.

westnordost commented 3 years ago

Heh, so it's 300 times faster?

Am 1. Mai 2021 08:30:08 MESZ schrieb Helium314 @.***>:

Using fetch-by-bbox fetches 17 elements in 150ms for the same house, so definitely a difference.

-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830563914

-- Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.

Helium314 commented 3 years ago

Those 17 (instead of 100+) elements probably don't contain any of those huge relations...

westnordost commented 3 years ago

Though 17 sound too little. JOSM fetches 29

westnordost commented 3 years ago

Just as a side note, that particular area is really problematic because there are so many huuuuuuge relations. Download of that area for me was like:

Downloaded 33185 nodes, 4774 ways and 420 relations in 48.4s
Persisted 38263 and deleted 0 elements and geometries in 66.3s
Created 2129 quests for bbox in 8.4s

And that is nothing this app can do anything about. Relations are included in the download result. It would be possible to throw away relations that are huge route relations after download. That would decrease the time used to persist and later also to query the data back from the db. On the other hand, the app then becomes unable to do anything with (such) relations. I.e. at least the AddSummitRegister quest will stop working.

westnordost commented 3 years ago

The data returned is http://overpass-turbo.eu/s/16Rk

(
 node(id:7903379312,6931015492,6931015506,7903379311,6931015493,7903379310,6164794952,7903379309,5517629213,6931015503,6164794948,8615285130,6164794958);
  way(id:658305257,740277030,740277033,846977295);
);

out geom;

So that looks correct to me. It takes 172ms on the emulator.

westnordost commented 3 years ago

And if I look for elements in a larger bbox (bbox of house + 40m radius), we already got some pretty large relations in the result:

http://overpass-turbo.eu/s/16Rn

(
  node(id:6931015491,1442012927,6931015490,1442012936,706464955,7903379303,7903379312,6931015492,86092999,6931015506,3707323101,7903379311,6931015493,7903379310,6931015505,6164794942,6164794944,6164794943,6164794949,6164794950,6164794951,6164794952,7903379309,5517629213,6931015503,6164794948,8615285130,6164794958,8455112892,6931015517,7903379306,7903379307,7903379305,6164794959,6164794957);
way(id:10276560,54442135,130986630,131448465,131477211,133764170,147444468,316469812,366748359,658305256,658305257,740277030,740277033,740277034,846977295,860465114,885876130,885876131);
rel(id:1991416,1991442,6249864,6249865,6249868,6249870,7917315,7917320);

);

out geom;

This takes 1211ms on my emulator. You want to try how much that effects your speed? I changed line 158 in OsmQuestController to val paddedBounds = geometry.getBounds().enlargedBy(ApplicationConstants.QUEST_FILTER_PADDING * 2) (added *2)

This time can likely be halved when doing that performance improvement mentioned earlier.

westnordost commented 3 years ago

So anyway, if you solved a quest over at Südosttangente where all the international bus routes run through, the performance problem would be back.

So maybe the better solution (or the one that be done additionally) is to really exclude all relations except a few chosen ones to be persisted. The effects of this would be that the summit register quest needs to be deleted. It is not a big sacrifice IMO.

Could it have any other negative effects? For example when a road is split, potentially all relations the way runs through needs to be updated. If the relations are not available offline, the split would be done "wrong". Well, I think not, because the local split is not connected with the "actual split" that is done when uploading the split. When uploading the split, the algorithm actually queries "give me all relations that reference this way" directly from the OSM API and works with that.

@matkoniecz which are the relation types that should be kept?

matkoniecz commented 3 years ago

The effects of this would be that the summit register quest needs to be deleted. It is not a big sacrifice IMO.

Definitely worth other improvements and avoiding insane performance issues. And maybe it can get back somehow.

The only issue is that it can be potentially triggering some problems in achievement counter, maybe in settings (but it is not the first removed/renamed quest, so...).

@matkoniecz which are the relation types that should be kept?

What about fetching all relation, but without recursive fetching of their elements?

Otherwise associatedStreet is needed but fetching just elements in bbox is fine and ferry quests may want ferry routes - but long ones are discarded anyway, and short ones are unlikely to be mapped as relations...

westnordost commented 3 years ago

What about fetching all relation, but without recursive fetching of their elements?

That was never done. It is the fetching of huge relations alone that is creating these problems.

westnordost commented 3 years ago

Or maybe a negative list? In that negative list, there should be at least network, water, superroute, boundary, route, route_master, waterway.

smichel17 commented 3 years ago

maybe it can get back somehow

What about a small amount of denormalizing? For relations in the DB, also store the number of (known) members, and queries can add a filter to exclude relations of a certain size. Then the summit quest (and any similar quests in the future) could run async in the background, without this restriction (so they may take a while to appear, but that's fine).

Helium314 commented 3 years ago

You want to try how much that effects your speed? I changed line 158 in OsmQuestController to val paddedBounds = geometry.getBounds().enlargedBy(ApplicationConstants.QUEST_FILTER_PADDING * 2) (added *2)

2 tests, for "that house" and a crossing next to bus terminal, below A23

old fetch-by-bbox (download all relations): 17 elements in 138 ms, 113 elements in 44500 ms old fetch-by-bbox with double padding: 61 elements in 586 ms, 205 elements in 52572 ms current fetch-by-bbox: 17 elements in 128 ms, 36 elements in 185 ms current fetch-by-bbox with double padding: 53 elements in 577 ms, 112 elements in 320 ms current master: 39 elements in 216 ms, 62 elements in 258 ms current master with double padding: 71 elements in 663 ms, 137 elements in 355 ms

Regarding routes: @smichel17's suggestion sounds reasonable, then hiking and ferry routes could be allowed in the DB.

Alternatively, how about not discarding hiking/ferry routes at all? Or only those with more than 1000 members? I looked a bit around near several peaks, and if there is a huge hiking route usually there is a shorter one as well (at least for all the peaks I checked).

matkoniecz commented 3 years ago

Or only those with more than 1000 members?

Is it possible to discard sch relations without ever fetching them? ( "It is the fetching of huge relations alone that is creating these problems." https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830681462 )

westnordost commented 3 years ago

Thank you for the tests, @Helium314

These two are important:

current fetch-by-bbox with double padding: 53 elements in 577 ms, 112 elements in 320 ms current master: 39 elements in 216 ms, 62 elements in 258 ms

So, both discard route etc relations. The fetch-by-bbox branch must mitigate that one problem I talked about, which is why a double-padding (or similar) would make sense. With that double padding, it is clear that there is not really any advantage of the approach in the fetch-by-bbox branch, so that can be discarded. The selected solution is then: just discard those relations.

westnordost commented 3 years ago

Ok, I also changed two more things:

performance improvement for quest filter syntax: An expression like amenity ~ restaurant|cafe|fuel|fast_food is now not translated into a regex anymore, but translated into a check in a set of strings. A regex like this is not slow, but a containment check with a set is near instantaneous, O(1). I didn't do very extensive testing but it seemed to have decreased the time of creating quests on download from ~4s to ~2.5s in once instance
adapted the quest auto syncer to work with scored map data element density rather than quest density. that means, if you are in an area that is very densely mapped (cities,...), it downloads in a smaller rectangle than if you are in the open in less densely mapped areas.

westnordost commented 3 years ago

Fetched 37 elements and geometries in 51ms

for "our house" here. I hereby declare this performance problem solved

westnordost commented 3 years ago

And released an alpha version! I hope all the issues have been purged now. But I am not so confident about that, so please continue to report any problem here! :-)

smichel17 commented 3 years ago

This is no longer important, but for reference…

Additionally after having read a bit about CursorWindow

Ack, I had meant to link this article that I found while reading about it, which is from 2017, but I assume still applies: https://medium.com/androiddevelopers/large-database-queries-on-android-cb043ae626e8

In my understanding CursorWindow: Window is full means that the results of the query don't fit into the 2 MB CursorWindow (huge relations!), so another query has to be done (and for some reason this is horribly slow).

This was also my take-away from the article, and that the cursor windows may overlap (when you move to a position, it starts filling the window before that position, not at the position). Also that you can adjust that behavior and the cursor size on api 28+, but earlier than that the only way to optimize is a custom implementation.

westnordost commented 3 years ago

But tis does not occur for StreetComplete. Even a relation with 1000000 members would not trigger that it does not fit into one CursorWindow because each relation member is one row in the database table.

Helium314 commented 3 years ago

Only the problem that one row doesn't fit into the CursorWindow can't occur.

But the problem is that the entire set of results (mostly rows of the relation members table) is larger than the CursorWindow. This is made worse by the CursorWindow containing 33% old data that were already read, so after "CursorWindow is full" only 1.3 MB of new data are fetched from the DB.

Additional useful link/answer: https://stackoverflow.com/questions/20094421/cursor-window-window-is-full/52505989#52505989 In here it is stated that this is normal DB operation. But why does it take several seconds until the next message / load from DB? If the slow part is only processing the 1.3 MB of new data, then maybe enum.valueOf is that slow (https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830223859) Or maybe all returned rows are processed, including the 1/3 old ones? But I doubt that...

I actually wanted to increase the CursorWindow size (possible on Android 9+) to test how this affects speed, but I don't understand how to actually do this.

westnordost commented 3 years ago

But why does it take several seconds until the next message / load from DB?

Maybe the RAM is not enough and there is some swapping going on? Otherwise, no idea.

smichel17 commented 3 years ago

I actually wanted to increase the CursorWindow size (possible on Android 9+) to test how this affects speed, but I don't understand how to actually do this.

I have not done it myself, but it looks something like this:

Wherever we want to run the query (I guess in our AndroidDatabase.kt), create a SQLiteQueryBuilder.
Use its setCursorFactory method to set a custom factory.
- Note: Since CursorFactory interface has only a Single Abstract Method, you can use Kotlin's SAM interface conversion syntax to pass a lambda instead and save some boilerplate.
The custom factory should return a regular SQLiteCursor, but call setFillWindowForwardOnly and setWindow to change the window.
The new window should be a regular CursorWindow, but constructed with the version that takes a window size as the argument.

It seems like this still requires replacing the cursor window after the first query has already been run (ie, refreshing the cursor using the new window), but if it was filling up several times then this should still be an improvement.

streetcomplete / StreetComplete

[v32.0-alpha] answering quests sometimes really slow #2803

Imitate OSM Api

All relations except a few don't get a bbox