Requests' response time from server take longer as the limit increases

lduarte1991 commented 9 years ago

There's a huge difference in response time when requesting larger limits and I'm unclear why that is. I made the following request via curl (https://catch.harvardx.harvard.edu/catch/annotator/search?limit=500&offset=0&uri=https%3A%2F%2Fcourses.edx.org%2Fcourses%2FVJx%2FVJx%2F3T2014%2Fcourseware%2Fa3b18702fd42405683755a8acf297f68%2F0df13b2c7af540bf988274e78c19277c%2F3&media=image&userid&username&tag&text) and tried to change the limit from 50 up to 500. I did it 5 times per limit and the average is below:

Limit -> response time 50 = 0.88s 200 = 1.68s 300 = 2.25s 350 = 2.33s 400 = 2.76s 500 = 3.03s

arthurian commented 9 years ago

@lduarte @jmiranda I did a little investigating of this issue and I believe the parent condition in the search query is causing the poor performance. Here's the line in question:

AnnotatorService.groovy#L241 which searches Annotation.groovy

Looking at the SQL queries in the log when I do a search with a single contextId parameter (i.e. get all annotations for a course with the given context ID), I'm seeing this pattern:

A single SQL query to get the total count of annotations.
A single SQL query to get all the annotations for a given context ID.
A total of N SQL queries to lookup the parent annotation of each annotation (if any), where N is the number of annotations found in the previous step.

Example queries from log:

SELECT Count(*) AS y0_ 
FROM   annotation this_ 
WHERE  ( ( this_.deleted = ? 
            OR this_.deleted IS NULL ) 
         AND ( this_.archived = ? 
                OR this_.archived IS NULL ) ) 
       AND this_.context_id = ?

SELECT   this_.id            AS id3_0_, 
         this_.version       AS version3_0_, 
         this_.archived      AS archived3_0_, 
         this_.collection_id AS collection4_3_0_, 
         this_.context_id    AS context5_3_0_, 
         this_.date_created  AS date6_3_0_, 
         this_.d eleted      as deleted3_0_, 
         this_.json          AS json3_0_, 
         this_.last_updated  AS last9_3_0_, 
         this_.media         AS media3_0_, 
         this_.owner_id      AS owner11_3_0_, 
         this_.parent_id     AS parent12_3_0_, 
         this_.quote         AS quote3_0_, 
         this_.source        AS so urce3_0_, 
         this_.text          AS text3_0_, 
         this_.uri           AS uri3_0_, 
         this_.userid        AS userid3_0_, 
         this_.username      AS username3_0_ 
FROM     annotation this_ 
WHERE    (( 
                           this_.deleted=? 
                  OR       this_.deleted IS NULL) 
         AND      ( 
                           this_.archived=? 
                  OR       this_ .archived IS NULL)) 
AND      this_.context_id=? 
ORDER BY this_.date_created DESC

SELECT   this_.id            AS id3_0_, 
         this_.version       AS version3_0_, 
         this_.archived      AS archived3_0_, 
         this_.collection_id AS collection4_3_0_, 
         this_.context_id    AS context5_3_0_, 
         this_.date_created  AS date6_3_0_, 
         this_.d eleted      as deleted3_0_, 
         this_.json          AS json3_0_, 
         this_.last_updated  AS last9_3_0_, 
         this_.media         AS media3_0_, 
         this_.owner_id      AS owner11_3_0_, 
         this_.parent_id     AS parent12_3_0_, 
         this_.quote         AS quote3_0_, 
         this_.source        AS so urce3_0_, 
         this_.text          AS text3_0_, 
         this_.uri           AS uri3_0_, 
         this_.userid        AS userid3_0_, 
         this_.username      AS username3_0_ 
FROM     annotation this_ 
WHERE    (( 
                           this_.deleted=? 
                  OR       this_.deleted IS NULL) 
         AND      ( 
                           this_.archived=? 
                  OR       this_ .archived IS NULL)) 
AND      this_.parent_id=? 
ORDER BY this_.date_created DESC

So it's basically an N+1 query problem, which would explain the pattern of response times that Luis observed. I'm not familiar enough with grails/groovy to make the fix myself and submit it otherwise I would. Hope that helps.

jmiranda commented 9 years ago

Thanks Artie! Nice work. The queries themselves shouldn't be too bad, but hydrating (loading) 100s of annotations into memory might cause the slowness.

I think you nailed it with that particular line as the code is actually loading the entire parent annotation from the database when it should just be loading a reference (proxy) to the parent annotation.

Replacing those lines:

if (params.parentid) {
    def parentAnnotation = Annotation.get(params.parentid)
    parent == parentAnnotation
}

with the following lines:

if (params.parentid) {
    parent == Annotation.load(params.parentid)
}

should improve the performance considerably.

NOTE: I removed the assignment of parentAnnotation because it was unnecessary.

Should be an easy fix.

jmiranda commented 9 years ago

I finally had a chance to test this out and it appears that @arthurian was correct. The load time was improved dramatically by making the change I suggested in my previous comment. I'll commit shortly and build a new WAR file. Deploying to the test server might take a little longer, but definitely this week.

Thanks again for the debugging @arthurian.

jmiranda commented 9 years ago

Should be fixed in release 0.5.9 (https://github.com/annotationsatharvard/catcha/releases/tag/v0.5.9).

@lduarte1991 - I decided to deploy to the test server tonight since I had a few extra minutes so please test the search API and let me know if the load times are better.

lduarte1991 commented 9 years ago

@jmiranda @arthurian I just ran the same test when I made the issue and there seems to be a 33% decrease in response time which is pretty nice :D Thanks Artie and Justin!

lduarte1991 commented 9 years ago

@jmiranda Been working with this for about a week now. It is definitely faster, though obviously it still takes upwards of 5-6 seconds once we are reaching around 1000 annotations or so. A bit worried about how far this will actually scale.

In the meantime though, it is doing a lot better so feel free to push to production at any point.

jmiranda commented 8 years ago

Thanks for the feedback. I'll open a new ticket and try to figure out ways to tweak performance a bit more. My guess is that pulling 1000s of large JSON objects from the database is causing the problem. Not sure if there's much to do there except to switch to an object store as I've always wanted to do. But I'll look to see if I forgot an index or figure out a better way to cache the objects.

I'm just getting back to work after the Thanksgiving holiday and will plan to deploy at some point this week. Let me know if there are any classes going on that would keep me from deploying it at certain times.

annotationsatharvard / catcha

Requests' response time from server take longer as the limit increases #72