Closed westnordost closed 3 years ago
So this is how the data flow currently looks now in my development branch. The graph is not complete of course:
The "visible quests" (i.e. those that are displayed on the map) are now just an end product and not (necessarily) persisted themselves in the DB). Rather, the OSM elements and OSM notes are "first class citizens", the quests are just generated from this.
For notes, there is not even a OsmNoteQuestsTable
anymore, the note quests are generated on-demand from the notes, from the information which note quests are hidden, which note quests have been answered (CommentNote
table) and other things.
Since it is quite expensive to generate quests from raw OSM data, there is still a OsmQuestsTable
. It is updated each time anything changes in the OSM data.
OsmQuestSplitWayDao
, DeleteOsmElementDao
, UndoOsmQuestDao
etc are updated through QuestController
(not pictured) which makes sure that the quests are also updated (hidden/removed/...). It is okay like this now, but actually, changes made on the OSM elements should be managed by OsmElementController
instead.I seem unable to find a "compromise-point" at which to stop refactoring, also in respect to that I'll have to go the whole way anyway very soon (in M3). This is the intermediate state.
Colors don't really mean anything, they are just for myself: Blue means that I already added unit tests for that, yellow means that unit tests are missing, red means that an implementation is missing. Green means that it is an interface, thus no tests are necessary.
So what is still open is that in the current setup, when a quest is solved by f.e. splitting a way, leaving a note or answering it normally, the quests (OsmQuestController
) are not updated.
The optimal case (that should be reached for M3) is that the changes made through answering the quest should be directly applied to the local elements in some kind of working set and it is that working set the OsmQuestController
is listening to for updates in quests that should be displayed.
Still work in progress...
Advancing only slowly. Reworked how the edits are stored and uploaded. Next step is to put a class in between MapDataSource and OsmQuestController that aggregates data from MapDataController (=data from OSM API) and ElementEditsController (=local changes to this data that haven't been uploaded yet).
And what is also missing is to do the same for notes.
So, refactored the notes-part now. This by the way makes it technically possible to create a note, then comment several times on it. All offline. The edited note (=note with added comment) could also be shown in the UI as if the comment has already been uploaded. This functionality is of course not used yet, but it becomes possible.
Now, I need to write a lot of tests, then test it in the app.
As an optional extension / next refactor, there could be another class through which the edit history could be shown and all the things undone in any order. All edits that have not been uploaded yet will then be undoable, edits that have already been uploaded require special implementation to be revertable (I.e. normal quest answers / tag changes are revertable, the rest not at the moment. There is a ticket in the issue tracker to enable reverting deletions.)
Status update:
Currently a download looks like this. 11.3 seconds for 34152 elements and 7 notes. Some of the tasks below run in parallel, denoted by the starting letter. (different leter = runs parallel, same letter = runs serialized)
A NotesDownload: Downloaded 7 notes in 0.1s
A NoteController: Persisted 6 and deleted 0 notes in 0.0s
B MapTilesDownload: Downloaded 36 tiles (509kB downloaded, 976kB already cached) in 1.0s
A OsmAvatarsDownload: Downloaded 5 avatar images in 1.0s
C MapDataDownload: Downloaded 29667 nodes, 4372 ways and 150 relations in 4.8s
C MapDataController: Persisted 34152 and deleted 0 elements and geometries in 3.9s
C OsmQuestController: Created 2246 quests for bbox in 2.0s
C OsmQuestController: Persisted 2115 new and removed 0 already resolved quests in 0.1s
Finished download in 11.3s
There are two things that may be improved in terms of download speed:
Date
parsing takes up about 50% of all the parsing time for a map data response. Maybe this could be sped up by not parsing the date at all or parsing it in some faster fashion. That could cut the download time in (up to) half.So in total, it looks like the download time could be reduced by about 50% (maximum).
Ok, implemented option 1. For some random test location (my location), the download completes after 8.1 seconds with current master and 8.8 seconds in the dev branch. I think that's okay. Though, times on different times differed a lot: If the download hits some Java memory limits, it gets much slower (one time, it was 70s or so for the same location). Since for the version on the dev branch, a lot more objects are persisted, I think the potential is higher to hit some limits there.
Option 2 might cut this time further into half but would require to
osmapi
-library to do no Date
parsing but instead let the date parsing be done by the MapDataFactory
Date
s are used. This includes the OSM elements (nodes, ways, relations), so it makes sense to combine that with exchanging data structures from the osmapi library with pure kotlin data structures, see https://github.com/streetcomplete/StreetComplete/issues/1892#issuecomment-785839848All in all, to do it properly, it's quite a stack of things to refactor. Certainly, some things can be done before and some later, but the full potential is only unravelled when all these are done. So, I think I'd postpone this to some unknown time in the future, unless someone would like to help me here.
Just to summarize the advantages of doing this:
Date
-parsing with kotlinx-datetime: possibly almost 100% faster download times, one step toward making an iOS port feasibleI am just going to ping the usual suspects here, also those that are interested in an iOS version. Maybe if we did this together, we could refactor this quite quickly: @FloEdelmann @arrival-spring @wtimme @matkoniecz @ENT8R . Let me know, any of you, also if I didn't ping you here, if you are interested in working on this together.
The time to do this is convenient because the whole database scheme changed for this dev version already, so users' data will need to be dropped anyway when upgrading to this version. If the above is done, this would probably happen again, so it would be nice to limit that possible data loss to only one time.
@westnordost I'm willing to help but I don't have much time this week. Next week should be better though :)
That's fantastic! Sure, next week is fine because I stumbled upon some serious performance problem in the dev branch plus an architectural issue I have to take care of (first) anyway.
I got lost trying to make the whole of StreetComplete based on suspending coroutines. I went back from that a bit, it was too much change in one go.
What I did was to replace the download and the upload with coroutine/suspending code, to get rid of the AtomicBoolean as a cancel flag, cleaned up how coroutines are currently used in the codebase etc.
I went back on already creating the quests while persisting because this can theoretically cause inconsistencies. Need to think about if I should do this and what measures I take to prevent this.
But anyway, I implemented #2521 and made all bulk insertions into the database about 2x faster (nodes, ways, way members, relations, relation members, geometries, osm quests etc etc). Inspired by this medium post https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-insertions-971aff98eef2 Bundling inserts in transactions, it's already about 10x faster than without. But I did use this already. But using prepared insert-statements in a transaction, it is yet almost 2x faster.
A NotesDownload: Downloaded 8 notes in 0.2s
A NoteController: Persisted 7 and deleted 0 notes in 0.0s
B MapTilesDownload: Downloaded 36 tiles (510kB downloaded, 978kB already cached) in 0.8s
A OsmAvatarsDownload: Downloaded 5 avatar images in 1.2s
C MapDataDownload: Downloaded 29710 nodes, 4378 ways and 150 relations in 2.9s
C MapDataController: Persisted 34201 and deleted 0 elements and geometries in 2.4s
C OsmQuestController: Created 2208 quests for bbox in 1.8s
C OsmQuestController: Persisted 2079 new and removed 0 already resolved quests in 0.1s
Finished download in 7.3s
I am in no hurry to get this refactor-version out, because it is a big architectural break which effects the users in a negative way (all non-uploaded quests will be deleted once), so I am still collecting more functionality/refactor work until this gets released.
In particular, the refactor mentioned in the last post(s) would be nice, but it is yet another thing that is quite big, so I won't start this for now without help.
What I will start next would be a rework of the undo functionality becuase that is a loose end anyway (it crashes when one tries to undo something).
which effects the users in a negative way (all non-uploaded quests will be deleted once),
Can we release a version before it which informs everyone of this? With a scheduled release date for the refactoring, which the app would warn before.
Or, maybe the first X relases with the new architecture can include the old uploader and do a one-time task of uploading quests (require this in order to proceed to use the app)
Can we release a version before it which informs everyone of this
Yes, I can put a warning in the changelog for v31.1
Or, maybe the first X relases with the new architecture can include the old uploader and do a one-time task of uploading quests (require this in order to proceed to use the app)
Won't work, because the issue is not the uploader, but the database. The database pretty much changed completely. I tried to write migrations, but it was so much code that I dropped the effort.
I was thinking something like this:
v32MigrationActivity
that expects the old DBUnlike writing migrations, it's not so much new code to write, so hopefully lower effort. It might be a lot of existing code that needs to be kept around in order to make this approach work; at some point it would make sense to drop it. But it would be nice to keep in for a few versions, I think.
Still too much effort, in my opinion.
I don't expect that the edits are kept around for weeks, usual use case would be that it's kept for a day and in the evening or something, it is uploaded.
Still too much effort, in my opinion.
Can't you just give the new tables new names, and therefore have both in parallel, along with the old upload code in an old class?
I don't expect that the edits are kept around for weeks, usual use case would be that it's kept for a day and in the evening or something, it is uploaded.
Yeah that's probably true.
Yes, I can put a warning in the changelog for v31.1
I saw the warning in the release, personally I think it needs to be a bit stronger and should suggest people turn off automatic upgrades until this release has happened, otherwise there is a risk of data loss if they do some surveying then close the app before they get back online.
Is it also worth adding a nag screen if you're closing the app with data which hasn't been uploaded explaining this is at risk with the new database change?
Presumably people with older versions are at a greater risk as they won't see the changelog beforehand, but perhaps that's quite a small proportion of people? Maybe add some notes on the app stores too?
Let me know, any of you, also if I didn't ping you here, if you are interested in working on this together.
@westnordost Now I have some free time and would be able to work on this. Is there something I could work on?
Yes! There is a branch no-osmapi-data. It contains the new "pure kotlin" data classes.
Note
MapDataDao
, NotesDao
etc that usually return classes from osmapi should be wrapped by classes named MapDataApi
, NotesApi
and these wrapper classes can have the same (or a more convenient) interface as the original ones, only that they return the pure kotlin data classesIt looks like the implementation is complete now. Minus the stuff that @FloEdelmann started, mentioned in https://github.com/streetcomplete/StreetComplete/issues/2506#issuecomment-792804054
What is also missing is the new "history" feature, but the basic refactoring is done.
However, before I release the first alpha for those who would like to help testing, I think I will wait until the further restructuring that @FloEdelmann and me are doing is done.
Changelog for v32.0 so far:
The grant from the Prototype Fund
ended in February, however, there is one last big feature I can delight you with before development
on this app must necessarily wind down:
It may not look like a lot, but I basically replaced the whole architecture of the app (#2506, ...)
to do this.
anyway, anything concering the technical refactoring that isn't in other tickets yet is now done. So I am closing this ticket.
The "Layers" feature will probably not be implemented soon. I mentioned it: The fund is over since February and I'll not be able to invest so much time in this project anymore. There'd still be a big chunk left to do to implement that feature - basically the whole UI. So, what will happen is still this update mentioned above, plus some minor things not mentioned there yet (see the milestone "elements-first-class"): https://github.com/streetcomplete/StreetComplete/milestone/14
I appreciate the work you've done here. Adding a feature -- even a very involved feature like the layers UI -- is much more feasible than a total overhaul of the data model. So, I think this was a good use of time, to get the foundational work complete, even if the feature itself is too much to implement right now.
@westnordost I'm somewhat confused; will the the tickets mentioned above then be released soon(-ish), eg. in v32.0, or are they also being postponed indefinitely ?
In any case; much thanks for all your hard work!
@mnalis this things are already done (unless I am really confused) so - it is not indefinitely postponed
Thanks very much for this @westnordost . The offline stuff will certainly be great, but I'm really looking forward to this:
* The above also works when splitting a way. The quests for the split segments will be shown immediately
From my experience even when I've had a connection it seems to often take a little while to upload and be processed within OSM before I can retry a query and then actually see and answer the new quest.
For the Layers #2461 concept, there is some deep code restructuring that has to be made. Here is an attempt to summarize it.
Summary
Milestones
M1
Download
On download of OSM data, element geometry for all downloaded elements must be created and persisted. Currently, this is only done for any elements that get a quest.
In detail:
QuestDownloader
toOsmDownloader
or similar. Generalize it. Put the notes download into own classOsmApiQuestDownloader
should not download OSM data by itself (so, rename) but having it handed to it. Maybe byOsmDownloader
or maybe there is some listener that fires whenever data in the OSM elements table changed and theOsmApiQuestDownloader
is informed of that, updates the quests based on that. Maybe there is an oppurtunity to merge the logic inOsmApiQuestDownloader
withOsmQuestGiver
Data Cleanup
A last update timestamp for each element must be added to be able to periodically dump old data. Based on that, data older than X days not referenced by an action must be dumped.
In detail:
Data access
The persisted OSM data must be accessible efficiently. Thanks to the
ElementGeometryTable
, this is basically already possible. The appropriate DAO(s)MergedElementDao
(?) just needs a new method and/or database View with index to do that.Element update
Whenever an element is updated (after upload), quests are created/removed for that element. Now that all downloaded data is available offline,
isApplicableTo
andgetApplicableElements
could maybe be merged somehow.In detail:
MapDataWithGeometry
from database, or should it be handed to it? What data (in what radius) should it contain?M2
TODO TBD
M3
TODO TBD