Closed Helium314 closed 3 years ago
Anyway, I guess there are two possibilities how to solve this:
When fetching all elements within a bbox, it actually means to 1. fetch all nodes within that bbox, 2. fetch all ways that have one of these nodes and 3. fetch all relations that have either any of these nodes or ways as a member.
This solves the problem that extremely spacious relations that thus have a very large bbox (and it looks like there are many spacious relations) are only included if there is actually a member of it in the bbox itself.
Also, as it mimics how the OSM Api download map bbox call works, it is consistent.
Downside is, that the case described in https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-827639718 becomes more likely.
So if when persisting relations, only a few relation types, like multipolygons, will also persist their bbox, this results in the fetch-elements-in-bbox call to never return these relations. Those relations still remain in the DB and they can be fetched for things like splitting the way (and the necessary update of certain relations). Just for re-checking quest eligibilities, they are not available.
The advantage is that the behavior we have now can mostly stay like this. The downside is, that quests that look for (non-multipolygon) relations will not work anymore. Or more precisely, will still work on downloading, but when rechecking quests, the quests will vanish.
One quest that comes to mind is the AddSummitRegister
quest.
Anyway, which quests are affected by this anyway?: AddSummitRegister
, AddCycleway
and AddSidewalk
.
I should note that the general problem as shown here...
...already exists anyway. Only just that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently.
that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently
Maybe this suggests a 3rd variant: the osm api approach, but using a larger bbox as a buffer, where data is checked/downloaded but not itself used to generate/show quests.
For example, if we had a "cycleway width" quest, in this diagram it would not be shown because the cycleway nodes are outside the quest scan bbox, but those nodes would still be used to exclude the road from the cycleway quest, because they're inside the buffer.
How big should the buffer be?
Am 30. April 2021 02:22:48 MESZ schrieb smichel17 @.***>:
that the download bounding box is much larger than the "quest recheck box" and thus this issue surfaces much less frequently
Maybe this suggests a 3rd variant: the osm api approach, but using a larger bbox as a buffer, where data is checked/downloaded but not itself used to generate/show quests.
For example, if we had a "cycleway width" quest, in this diagram it would not be shown because the cycleway nodes are outside the quest scan bbox, but those nodes would still be used to exclude the road from the cycleway quest, because they're inside the buffer.
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-829713533
-- Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.
What about option of doing what OSM API does
- fetch all nodes within that bbox, 2. fetch all ways that have one of these nodes and 3. fetch all relations that have either any of these nodes or ways as a member.
and add an extra step "fetch full geometries of all ways (including intersecting ones, without node inside!)", without fetching full geometry of all relations?
Anyway, which quests are affected by this anyway?: AddSummitRegister, AddCycleway and AddSidewalk.
And AddStreetName that excludes elements within associated street relation.
And AddFerryAccessPedestrian, AddFerryAccessMotorVehicle ( https://github.com/streetcomplete/StreetComplete/tree/master/app/src/main/java/de/westnordost/streetcomplete/quests/ferry ) - but ferry routes usually have geometries that are not insane
AddHousenumber - https://github.com/streetcomplete/StreetComplete/blob/master/app/src/main/java/de/westnordost/streetcomplete/quests/housenumber/AddHousenumber.kt#L157 (what may be searched for here?)
The advantage is that the behavior we have now can mostly stay like this. The downside is, that quests that look for (non-multipolygon) relations will not work anymore. Or more precisely, will still work on downloading, but when rechecking quests, the quests will vanish. One quest that comes to mind is the AddSummitRegister quest.
Would it be possible to assume that their state is unchanged and protect them from vanishing? For example AddSummitRegister once generated is not going to be invalidated by edits within SC. The same would be true for AddCycleway, AddSidewalk, AddFerryAccessMotorVehicleAddFerryAccessPedestrian
AddStreetName is sad case, as it should be created by AddHousenumber - but associated street relations should be safe to fetch. Though there is a risk of megastreet with 20 000 houses on it... But on the other hand, for associated street fetching members just within bounding box is sufficient!
Would it be feasible to only put elements in the DB that are actually used in any quest? E.g. street lamps and boundaries are not used (I think)
Haha, well yes, that is feasible: Just use v31 and earlier. The whole point of the architecture change made for v32 was that the OSM data is kept independent of quests and quests are just generated from these.
I was still curious and tried excluding relations containing boundary= or route= from being put in the DB. Now the DB is half size after initial download (11.61 MB instead of 22.47), and when answering building type quest for "that house" 142 elements are fetched in 305 ms
Increasing DB size to 47 MB by downloading more areas slows things down a bit, but fetching still takes less than a second.
Anyway, which quests are affected by this anyway?: AddSummitRegister, AddCycleway and AddSidewalk.
And [...]
I meant affected by the issue that sometimes, ways that are near other ways are not found due to not having a vertex in the same bbox.
Hm, 11 MB just for those relations? There is not that much data on relations. One relation member in the table is
So a relation with 1000 members will have about 32kb.
@Helium314 I wonder, how does it affect the size in the db if the element type is not persisted as a string but as a number. I.e. in RelationDao,
line 51:
member.type.ordinal
line 80:
ElementType.values()[cursor.getInt(TYPE)]
line 123:
args = arrayOf(elementType.ordinal)) { it.getLong(ID) }.toSet()
and then in RelationTables, change line 35:
${Columns.TYPE} int NOT NULL,
normal DB: 291084 relation_members 420 relations
without routes and boundaries: 1953 relation_members 207 relations
So a relation with 1000 members will have about 32kb.
then 290000 members are about 9.3 MB, not so far from 11 MB
@Helium314 I wonder, how does it affect the size in the db if the element type is not persisted as a string but as a number. I.e. in RelationDao,
Not much effect, still 21.61 MB. Answering "that quest" was somewhat faster: 40 sec instead of the usual 43
You mean whether saving this as a string or an int has not much of a space impact but about 10% speed impact? That's sounds a bit too much. enum.valueOf
can't be that slow.
Maybe just measurement noise? Is this difference persisting if you say restart phone and retry?
It might be noise, but in the recent attempts I was always very close to 43 seconds, that's why I mentioned it.
Additionally after having read a bit about CursorWindow, I would expect this to be faster:
In my understanding CursorWindow: Window is full
means that the results of the query don't fit into the 2 MB CursorWindow (huge relations!), so another query has to be done (and for some reason this is horribly slow).
With a smaller DB, more rows fit into the CursorWindow. I got only 8 CursorWindow: Window is full
message instead of the usual 9, which also points towards a small DB being a (small) improvement
retry after reboot: 39.6 sec
Okay, let's not pursue that further. IMO it does not make a big difference whether to store it as a byte or as a string like "WAY"
on the fetch-by-bbox
branch I changed the method how the bbox is fetched (= mimic OSM API).
Do you want to try it out?
There is still something to do, but I first wanted to check if it makes any difference.
What's left to do if this works out:
Also, the performance problem currently regarding the huge relations should also lead to very very slow download too, because also all geometry in the bbox a new download happens is fetched from DB (so also those huge relations). This change also addresses that.
Using fetch-by-bbox
fetches 17 elements in 150ms for the same house, so definitely a difference.
Heh, so it's 300 times faster?
Am 1. Mai 2021 08:30:08 MESZ schrieb Helium314 @.***>:
Using
fetch-by-bbox
fetches 17 elements in 150ms for the same house, so definitely a difference.-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830563914
-- Diese Nachricht wurde von meinem Android-Gerät mit K-9 Mail gesendet.
Those 17 (instead of 100+) elements probably don't contain any of those huge relations...
Though 17 sound too little. JOSM fetches 29
Just as a side note, that particular area is really problematic because there are so many huuuuuuge relations. Download of that area for me was like:
Downloaded 33185 nodes, 4774 ways and 420 relations in 48.4s
Persisted 38263 and deleted 0 elements and geometries in 66.3s
Created 2129 quests for bbox in 8.4s
And that is nothing this app can do anything about. Relations are included in the download result. It would be possible to throw away relations that are huge route relations after download. That would decrease the time used to persist and later also to query the data back from the db.
On the other hand, the app then becomes unable to do anything with (such) relations. I.e. at least the AddSummitRegister
quest will stop working.
The data returned is http://overpass-turbo.eu/s/16Rk
(
node(id:7903379312,6931015492,6931015506,7903379311,6931015493,7903379310,6164794952,7903379309,5517629213,6931015503,6164794948,8615285130,6164794958);
way(id:658305257,740277030,740277033,846977295);
);
out geom;
So that looks correct to me. It takes 172ms on the emulator.
And if I look for elements in a larger bbox (bbox of house + 40m radius), we already got some pretty large relations in the result:
http://overpass-turbo.eu/s/16Rn
(
node(id:6931015491,1442012927,6931015490,1442012936,706464955,7903379303,7903379312,6931015492,86092999,6931015506,3707323101,7903379311,6931015493,7903379310,6931015505,6164794942,6164794944,6164794943,6164794949,6164794950,6164794951,6164794952,7903379309,5517629213,6931015503,6164794948,8615285130,6164794958,8455112892,6931015517,7903379306,7903379307,7903379305,6164794959,6164794957);
way(id:10276560,54442135,130986630,131448465,131477211,133764170,147444468,316469812,366748359,658305256,658305257,740277030,740277033,740277034,846977295,860465114,885876130,885876131);
rel(id:1991416,1991442,6249864,6249865,6249868,6249870,7917315,7917320);
);
out geom;
This takes 1211ms on my emulator.
You want to try how much that effects your speed? I changed line 158 in OsmQuestController to
val paddedBounds = geometry.getBounds().enlargedBy(ApplicationConstants.QUEST_FILTER_PADDING * 2)
(added *2)
This time can likely be halved when doing that performance improvement mentioned earlier.
So anyway, if you solved a quest over at Südosttangente where all the international bus routes run through, the performance problem would be back.
So maybe the better solution (or the one that be done additionally) is to really exclude all relations except a few chosen ones to be persisted. The effects of this would be that the summit register quest needs to be deleted. It is not a big sacrifice IMO.
Could it have any other negative effects? For example when a road is split, potentially all relations the way runs through needs to be updated. If the relations are not available offline, the split would be done "wrong". Well, I think not, because the local split is not connected with the "actual split" that is done when uploading the split. When uploading the split, the algorithm actually queries "give me all relations that reference this way" directly from the OSM API and works with that.
@matkoniecz which are the relation types that should be kept?
The effects of this would be that the summit register quest needs to be deleted. It is not a big sacrifice IMO.
Definitely worth other improvements and avoiding insane performance issues. And maybe it can get back somehow.
The only issue is that it can be potentially triggering some problems in achievement counter, maybe in settings (but it is not the first removed/renamed quest, so...).
@matkoniecz which are the relation types that should be kept?
What about fetching all relation, but without recursive fetching of their elements?
Otherwise associatedStreet is needed but fetching just elements in bbox is fine and ferry quests may want ferry routes - but long ones are discarded anyway, and short ones are unlikely to be mapped as relations...
What about fetching all relation, but without recursive fetching of their elements?
That was never done. It is the fetching of huge relations alone that is creating these problems.
Or maybe a negative list? In that negative list, there should be at least network, water, superroute, boundary, route, route_master, waterway.
maybe it can get back somehow
What about a small amount of denormalizing? For relations in the DB, also store the number of (known) members, and queries can add a filter to exclude relations of a certain size. Then the summit quest (and any similar quests in the future) could run async in the background, without this restriction (so they may take a while to appear, but that's fine).
You want to try how much that effects your speed? I changed line 158 in OsmQuestController to
val paddedBounds = geometry.getBounds().enlargedBy(ApplicationConstants.QUEST_FILTER_PADDING * 2)
(added *2)
2 tests, for "that house" and a crossing next to bus terminal, below A23
old fetch-by-bbox (download all relations): 17 elements in 138 ms, 113 elements in 44500 ms old fetch-by-bbox with double padding: 61 elements in 586 ms, 205 elements in 52572 ms current fetch-by-bbox: 17 elements in 128 ms, 36 elements in 185 ms current fetch-by-bbox with double padding: 53 elements in 577 ms, 112 elements in 320 ms current master: 39 elements in 216 ms, 62 elements in 258 ms current master with double padding: 71 elements in 663 ms, 137 elements in 355 ms
Regarding routes: @smichel17's suggestion sounds reasonable, then hiking and ferry routes could be allowed in the DB.
Alternatively, how about not discarding hiking/ferry routes at all? Or only those with more than 1000 members? I looked a bit around near several peaks, and if there is a huge hiking route usually there is a shorter one as well (at least for all the peaks I checked).
Or only those with more than 1000 members?
Is it possible to discard sch relations without ever fetching them? ( "It is the fetching of huge relations alone that is creating these problems." https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830681462 )
Thank you for the tests, @Helium314
These two are important:
current fetch-by-bbox with double padding: 53 elements in 577 ms, 112 elements in 320 ms current master: 39 elements in 216 ms, 62 elements in 258 ms
So, both discard route etc relations. The fetch-by-bbox branch must mitigate that one problem I talked about, which is why a double-padding (or similar) would make sense. With that double padding, it is clear that there is not really any advantage of the approach in the fetch-by-bbox branch, so that can be discarded. The selected solution is then: just discard those relations.
Ok, I also changed two more things:
amenity ~ restaurant|cafe|fuel|fast_food
is now not translated into a regex anymore, but translated into a check in a set of strings. A regex like this is not slow, but a containment check with a set is near instantaneous, O(1). I didn't do very extensive testing but it seemed to have decreased the time of creating quests on download from ~4s to ~2.5s in once instanceFetched 37 elements and geometries in 51ms
for "our house" here. I hereby declare this performance problem solved
And released an alpha version! I hope all the issues have been purged now. But I am not so confident about that, so please continue to report any problem here! :-)
This is no longer important, but for reference…
Additionally after having read a bit about CursorWindow
Ack, I had meant to link this article that I found while reading about it, which is from 2017, but I assume still applies: https://medium.com/androiddevelopers/large-database-queries-on-android-cb043ae626e8
In my understanding CursorWindow: Window is full means that the results of the query don't fit into the 2 MB CursorWindow (huge relations!), so another query has to be done (and for some reason this is horribly slow).
This was also my take-away from the article, and that the cursor windows may overlap (when you move to a position, it starts filling the window before that position, not at the position). Also that you can adjust that behavior and the cursor size on api 28+, but earlier than that the only way to optimize is a custom implementation.
But tis does not occur for StreetComplete. Even a relation with 1000000 members would not trigger that it does not fit into one CursorWindow because each relation member is one row in the database table.
Only the problem that one row doesn't fit into the CursorWindow can't occur.
But the problem is that the entire set of results (mostly rows of the relation members table) is larger than the CursorWindow. This is made worse by the CursorWindow containing 33% old data that were already read, so after "CursorWindow is full" only 1.3 MB of new data are fetched from the DB.
Additional useful link/answer: https://stackoverflow.com/questions/20094421/cursor-window-window-is-full/52505989#52505989 In here it is stated that this is normal DB operation. But why does it take several seconds until the next message / load from DB? If the slow part is only processing the 1.3 MB of new data, then maybe enum.valueOf is that slow (https://github.com/streetcomplete/StreetComplete/issues/2803#issuecomment-830223859) Or maybe all returned rows are processed, including the 1/3 old ones? But I doubt that...
I actually wanted to increase the CursorWindow size (possible on Android 9+) to test how this affects speed, but I don't understand how to actually do this.
But why does it take several seconds until the next message / load from DB?
Maybe the RAM is not enough and there is some swapping going on? Otherwise, no idea.
I actually wanted to increase the CursorWindow size (possible on Android 9+) to test how this affects speed, but I don't understand how to actually do this.
I have not done it myself, but it looks something like this:
setCursorFactory
method to set a custom factory.
setFillWindowForwardOnly
and setWindow
to change the window.It seems like this still requires replacing the cursor window after the first query has already been run (ie, refreshing the cursor using the new window), but if it was filling up several times then this should still be an improvement.
After downloading a large area containing many quests, answering some quests takes very long (in my case this was done automatically after first app start, but also happens when zooming out and starting a manual scan). Especially for building or crossing quests, on my S4 Mini it takes some 30 seconds to for the pin to be removed. In this time I can't open any other quests, SC usage is basically blocked (panning/zooming the map is still possible without noticable slowdowns). Other quests like opening hours do not have this problem. Interestingly, answering quests near the southern edge of the scanned area is much faster than in other regions.
When "leaving" this area, i.e. scrolling somewhere far away, manually scanning and answering quests is fast (less than a second). The southern edge of the new area might be a little bit faster, not sure.
Sometimes after doing stuff "somewhere else", anwering quests inside the initial scan area is faster (maybe 10 seconds instead of 30 for the same element), but still slower than outside. I haven't yet found a way of reproducing this.