scripting / feedlandInstall

Instructions for setting up a FeedLand server.
GNU General Public License v3.0
3 stars 1 forks source link

Question for SQL experts #18

Open scripting opened 1 year ago

scripting commented 1 year ago

Here's a news product.

http://news.scripting.com/

The first few tabs are relatively fast, I assume this is because there are lots of updates from the feeds, and a relatively large number of feeds in the category.

But starting with Podcasts, they get very slow. Relatively few feeds that don't have so many updates. Somewhere in the database query it's iterating over all the items, and the loop takes longer because there's more iterating to do for these feeds.

I wonder if this can be improved with another index. If so, any suggestions?

This is an off-the-shelf FeedLand instance with exactly the same schema as in setup.sql.

scotthansonde commented 1 year ago

Looking at the news product and clicking on the Podcasts tab, the call to http://feedland.org/getriverfromcategory?screenname=davewiner&catname=Podcasts takes over 12 seconds. The sql that api point runs is

select * from subscriptions where listname='davewiner' and categories like '%,podcasts,%'

There's already an index on listname, but for categories a normal index won't help a LIKE query with wildcards. However, since the data in categories are comma-separated words, a FULLTEXT index might help. I tried it out on the database for my instance. I added a FULLTEXT index with

ALTER TABLE subscriptions ADD FULLTEXT(`categories`)

Even on my little database that took over 30 seconds. I then changed the query to

select * from subscriptions where listname='davewiner' and match(categories) against('podcasts')

On my little database the new query was about 1/3 faster (0.0013 seconds vs 0.0018 seconds).

scripting commented 1 year ago

@scotthansonde -- thanks for doing this experiment.

i'm gathering from the conclusion that this isn't worth doing?

scotthansonde commented 1 year ago

@scripting I'd say it's inconclusive, since the size of my database and the quality of my hardware (VM on a NAS) doesn't compare to your production database at DigitalOcean.

scripting commented 1 year ago

There's a whole other option for how to do this --

Statically build the category pages.

And only display the statically built pages.

Every 10 minutes the page is rebuilt.

This is how we did it in River5 and previous apps.

There is a cost

You have to rebuild the page every 10 minutes even if no one goes there, just in case.

But that didn't prove to be a deal-stopper for the RiverX's.