hotosm / osm-tasking-manager2

Designed and built for Humanitarian OpenStreetMap Team collaborative emergency/disaster mapping, the OSM Tasking Manager 2.0 divides an area into individual squares that can be rapidly mapped by thousands of volunteers.
http://tasks.hotosm.org

Performance suggestion for `random_task` view #972

Open · risicle opened this issue 7 years ago

risicle commented 7 years ago

While browsing the code and wondering why the "pick a random task" feature can get so slow during a mapping party, it occurred to me that you could save a lot of db load by denormalizing and maintaining an adjacency table for tasks, rather than performing expensive geometry operations in-query. Supposing a m2m relationship called neighbours, locked_filter would become something along the lines of:

locked_filter = Task.neighbours.any(locked=True)

which is a join postgres can do incredibly fast.
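For concreteness, here is a rough, untested sketch of what that could look like with SQLAlchemy. Every name below (task_neighbours, the column set, and so on) is invented for illustration rather than taken from the project; in particular, the real Task is keyed per project, so the association table would also need to carry the project id:

from geoalchemy2 import Geometry
from sqlalchemy import Column, ForeignKey, Integer, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

# Precomputed adjacency: one row per (task, touching task) pair.
task_neighbours = Table(
    "task_neighbours", Base.metadata,
    Column("task_id", Integer, ForeignKey("tasks.id"), primary_key=True),
    Column("neighbour_id", Integer, ForeignKey("tasks.id"), primary_key=True),
)

class Task(Base):
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    project_id = Column(Integer, index=True)
    geometry = Column(Geometry("MULTIPOLYGON"))
    # ... the rest of the real model ...

    # Plain m2m self-join through the precomputed table: no geometry work at
    # read time. viewonly because the table is maintained separately.
    neighbours = relationship(
        "Task",
        secondary=task_neighbours,
        primaryjoin=id == task_neighbours.c.task_id,
        secondaryjoin=id == task_neighbours.c.neighbour_id,
        viewonly=True,
    )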

If you wanted to go a step further, it would be easily possible to denormalize the "in priority area" information into a boolean field on Task and entirely remove expensive geometry operations from this view.
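Sketched very roughly (untested; the column name and the PriorityArea model assumed here are inventions, not the project's actual schema), that could be:

from sqlalchemy import Boolean, Column

# Hypothetical denormalized flag on Task, indexed so the view can filter on it cheaply.
in_priority_area = Column(Boolean, nullable=False, default=False, index=True)

def refresh_in_priority_area(session, task):
    # Recompute the flag at write time (task creation, priority-area edits),
    # assuming some PriorityArea model with project_id and geometry columns.
    task.in_priority_area = session.query(
        session.query(PriorityArea)
        .filter(PriorityArea.project_id == task.project_id)
        .filter(PriorityArea.geometry.ST_Intersects(task.geometry))
        .exists()
    ).scalar()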

As for how to maintain the denormalized data, there are many ways to skin that cat.
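One of those ways, purely as an illustration and reusing the invented names from the sketch above: rebuild a project's adjacency rows with a single PostGIS insert-from-select whenever its task grid is generated or edited, so the geometry work happens once at write time rather than on every random_task request.

from sqlalchemy import func, select

def rebuild_neighbours(session, project_id):
    a = Task.__table__.alias("a")
    b = Task.__table__.alias("b")
    # Throw away the project's existing adjacency rows...
    session.execute(
        task_neighbours.delete().where(
            task_neighbours.c.task_id.in_(
                select([a.c.id]).where(a.c.project_id == project_id)
            )
        )
    )
    # ...and recompute them in one query: every pair of distinct tasks in the
    # project whose geometries touch.
    session.execute(
        task_neighbours.insert().from_select(
            ["task_id", "neighbour_id"],
            select([a.c.id, b.c.id]).where(
                (a.c.project_id == project_id)
                & (b.c.project_id == project_id)
                & (a.c.id != b.c.id)
                & func.ST_Intersects(a.c.geometry, b.c.geometry)
            ),
        )
    )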

(also I found this project's code pleasingly clear so :+1: there)

pgiraud commented 7 years ago

Hi @risicle Thanks a lot for the analysis and the relevant suggestions. Can you please elaborate on the performance issues? When did you last go to a mapathon? Which project were you working on? Though I agree with your suggestions, I checked the performance using the pyramid debugging tools and I don't see much load on the database. So I wonder if this is really worth it.

What I would particularly like to know is whether you encountered the performance issues before or after the switch to the new server.

risicle commented 7 years ago

Yup, I of course don't have the ability to analyze a running system with a real-life workload, so my analysis can be taken with a pinch of salt. I've noticed speed to be an issue with the "random task" function quite often, most recently on project 2525 this past Tuesday evening, the 7th: a "random task" request was taking maybe 30 seconds? (leading many users to do the "is it doing anything? I'll click again..." thing, which of course exacerbates it).

But of course, if that is not a bottleneck in reality, this would be a pointless overcomplication to add, what with all the housekeeping needed...

When was the switch to the new server?

risicle commented 7 years ago

(have you set up postgres to log slow queries?)
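For reference, that's a single setting in postgresql.conf; the 500 ms threshold below is just an example value:

# postgresql.conf: log any statement that takes longer than this many milliseconds
log_min_duration_statement = 500
# takes effect after a config reload, e.g. SELECT pg_reload_conf();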

ethan-nelson commented 7 years ago

(I know very little about PostGIS so I don't have anything substantive to add, but I will point out that project 2525 is on the extremely high side in terms of the number of tasks in a project.)

risicle commented 7 years ago

Indeed, though it isn't the first time I've noticed this problem.

Setting aside the pre-built-adjacency-list route for a moment (due to the moderate amount of implementation pain), I'm assuming the current build-a-big-union-and-then-check-disjoint-against-it technique is deliberately being used, for performance reasons, over the more obvious disjoint-join-condition technique. I would have thought the latter would be faster, as it might be able to make use of a spatial index.

Using that approach would be as simple as adding a neighbours relationship to Task, something like:

from sqlalchemy import and_
from sqlalchemy.orm import foreign, relationship, remote

# self-join on same project + touching geometries; viewonly so SQLAlchemy never
# tries to persist anything through it
neighbours = relationship(
    "Task",
    primaryjoin=lambda: and_(
        foreign(Task.project_id) == remote(Task.project_id),
        foreign(Task.geometry).ST_Intersects(remote(Task.geometry)),
    ),
    viewonly=True,
)

and in the view this would make the locked_filter definition as simple as

locked_filter = Task.neighbours.any(locked=True)

(but I haven't tried this so there may be some gotcha lurking in the above!)

pgiraud commented 7 years ago

When was the switch to the new server?

It was done on Feb 9th. If some of the requests took > 30s to get a response, it's most likely a problem with the old server being overloaded by other applications.

As far as I know, no one has complained about performance issues after the switch.

Anyway, thanks for your code snippet. Unfortunately, it doesn't work as is. The Task table doesn't have any locked property. Instead there's a cur_lock relationship with which we can get the last Lock record for the given task.

risicle commented 7 years ago

Oh, of course, I glossed over that extra join. Keeping the current flow of the view, it would probably be something more like:

locked_filter = Task.neighbours.any(
    Task.id.in_(
        DBSession.query(Task.id)
        .join(Task.cur_lock)
        .filter(TaskLock.locked == True)
        .subquery()
    )
)

(Might TaskLock need to use aliased? Is it clear I'm from more of a manual-SQL background?)

which is quite ugly, so I personally would probably alter the behaviour of the view somewhat so that locked_filter and friends are lambdas that, when called with the existing query object, return a filtered version of it. This would allow the whole thing to be done as a join instead of the awkward subquery above.
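Roughly, and untested, using only names already mentioned in this thread (whether the view then wants this set of tasks or its complement depends on the existing flow):

from sqlalchemy.orm import aliased

def locked_neighbour_filter(query):
    # Join each candidate task to its neighbours and to those neighbours'
    # current lock, keeping only candidates that touch a locked task: one join
    # instead of a correlated subquery.
    neighbour = aliased(Task)
    return (
        query.join(neighbour, Task.neighbours)
        .join(neighbour.cur_lock)
        .filter(TaskLock.locked == True)
        .distinct()
    )

# the view would then compose its filters by calling them on the base query,
# e.g. query = locked_neighbour_filter(query)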

Anyway, if it's not currently a problem, it's not currently a problem, but I would recommend enabling slow query logging in case a performance hiccup does crop up in the future.

(Also, in my relationship listing above, I accidentally suggested joining Task.id against Task.id, so I have edited it now to join on the project_id. Duh.)