USEPA / EPA_Environmental_Dataset_Gateway

U.S. EPA’s Metadata Catalog
https://edg.epa.gov

Home page performance improvements #22

Closed (torrin47 closed 7 years ago)

torrin47 commented 7 years ago

From @torrin47 on June 13, 2017 18:21

So I'm concerned that even though the new page is performing better in production than in staging, it's still putting a lot of load on the server. One fairly easy way to improve performance: instead of querying the server directly for every thumbnail display, we could create a cache of each JSON output that is updated hourly. I'm pretty comfortable putting together the script that would produce the cache .json files, but I believe that right now the page is structured in a way that derives the "See More" link from the search term we use to pull the thumbnails. We'd need those to be separate, so that even though the thumbnails come from cached JSON, the "See More" link still goes to the search page with the right search term.
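A rough sketch of what that cache script could look like, in Python. This is only an illustration under assumptions, not the script that was actually written: the `SECTIONS` table, the `region9` key, the `SEARCH_PAGE` URL, and the `searchText` parameter name are all hypothetical, and the base URL simply joins the site host with the servlet path that appears in the logs. The point it shows is keeping the cached query and the "See More" search term as two separate fields.

```python
import json
import os
import tempfile
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: site host combined with the servlet path seen in the server logs.
BASE_URL = "https://edg.epa.gov/metadata/RestQueryServlet"
# Hypothetical search page URL and parameter name, for illustration only.
SEARCH_PAGE = "https://edg.epa.gov/search"

# Hypothetical section table: the query used to fill the cache is kept
# separate from the term used to build the "See More" link.
SECTIONS = {
    "region9": {
        "cache_params": {"start": 1, "max": 6, "f": "json", "owner": 11},
        "see_more_term": "Region 9",
    },
}


def fetch_json(url):
    """Fetch a URL and decode the response body as JSON."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def write_cache(path, payload):
    """Write JSON to a temp file, then rename it into place, so the page
    never reads a half-written cache file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    os.replace(tmp_name, path)


def refresh_all(sections, cache_dir, fetch=fetch_json):
    """Run each section's query once and store the result as <name>.json."""
    for name, section in sections.items():
        url = BASE_URL + "?" + urlencode(section["cache_params"])
        write_cache(Path(cache_dir) / f"{name}.json", fetch(url))


def see_more_link(name, sections=SECTIONS):
    """Build the "See More" link from the section's own search term,
    independent of whatever query produced the cached thumbnails."""
    term = sections[name]["see_more_term"]
    return SEARCH_PAGE + "?" + urlencode({"searchText": term})
```

Calling `refresh_all(SECTIONS, "/some/cache/dir")` from an hourly scheduled job (cron or Task Scheduler) would give the hourly update; the page would then read the `.json` files instead of hitting the server per thumbnail.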

Copied from original issue: Innovate-Inc/EDG_metadata#124

torrin47 commented 7 years ago

Unfortunately, the performance was so slow on production that our monitoring script decided the site was down 4 times in the first hour. Can't live with that, rolled back to old version until we can address stability.

torrin47 commented 7 years ago

So I was reviewing the logs trying to understand the performance impact, and noticed one thing right at the outset. Each REST query ended up making 3 separate calls, following this pattern:

2017-06-13 22:39:45 134.67.221.182 GET /metadata/RestQueryServlet start=1&max=6&f=json&owner=11&max=6 443 - 134.67.221.182 Java/1.8.0_112 200 0 0 226
2017-06-13 22:39:45 134.67.221.182 GET /metadata/RestQueryServlet start=1&max=6&f=json&owner=11&max=6 443 - 134.67.221.182 Jakarta+Commons-HttpClient/3.1 200 0 0 208
2017-06-13 22:39:45 134.67.221.182 GET /metadata/rest/find/document owner=Region%209&start=1&max=6&f=json 443 - 134.67.221.182 GeoportalServer 200 0 0 469

I don't really know what the RestQueryServlet is or how requests are directed there, but I tried switching the URLs we have embedded in homeBody.jsp to point directly at the RestQueryServlet. That dramatically cut down on the number of hits logged and reduced the page loading time from ~30s to ~13s, which is pretty sweet, but I still think it's worth pursuing the cache option. I'll check in the tweaks.
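For anyone who wants to repeat the log check, here is a minimal Python sketch that counts hits per endpoint in entries like the ones above. The field names in `FIELDS` are an assumption: the lines look like a 14-column W3C-style access log, but the column meanings (especially the last one) are guessed from the sample rather than taken from the server's configuration.

```python
from collections import Counter

# Assumed column layout, inferred from the sample lines (not confirmed).
FIELDS = (
    "date", "time", "server_ip", "method", "uri_stem", "uri_query",
    "port", "username", "client_ip", "user_agent",
    "status", "substatus", "win32_status", "last_field",
)

SAMPLE = [
    "2017-06-13 22:39:45 134.67.221.182 GET /metadata/RestQueryServlet "
    "start=1&max=6&f=json&owner=11&max=6 443 - 134.67.221.182 "
    "Java/1.8.0_112 200 0 0 226",
    "2017-06-13 22:39:45 134.67.221.182 GET /metadata/RestQueryServlet "
    "start=1&max=6&f=json&owner=11&max=6 443 - 134.67.221.182 "
    "Jakarta+Commons-HttpClient/3.1 200 0 0 208",
    "2017-06-13 22:39:45 134.67.221.182 GET /metadata/rest/find/document "
    "owner=Region%209&start=1&max=6&f=json 443 - 134.67.221.182 "
    "GeoportalServer 200 0 0 469",
]


def parse_line(line):
    """Split one log line into named fields; return None if the column
    count does not match the assumed layout."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def hits_by_endpoint(lines):
    """Count requests per URI path, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["uri_stem"]] += 1
    return counts


def agents_by_endpoint(lines):
    """Collect the distinct user agents seen for each URI path."""
    agents = {}
    for line in lines:
        record = parse_line(line)
        if record is not None:
            agents.setdefault(record["uri_stem"], set()).add(record["user_agent"])
    return agents
```

Run against the three sample lines, `hits_by_endpoint(SAMPLE)` separates the two `RestQueryServlet` hits from the single `rest/find/document` hit, and `agents_by_endpoint(SAMPLE)` shows which client made each call.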

torrin47 commented 7 years ago

From @Saisuma004 on June 14, 2017 13:53

Sounds good. I will start preparing the cache JSON files.


torrin47 commented 7 years ago

We finished this!