Open scripting opened 11 months ago
Also I updated the feedland repo, it had been a while. You can see where the debugging calls have been added for river caching.
The rivercache-specific code starts here.
Updated our instance with new versions
Hey @scripting, here's a snippet of the applications logs:
09:01:36 PM POST a8c.feedland.org:80 /publishfile https://a8c.feedland.org/?river=true&screenname=dave ip6-localhost
09:01:36 PM - publishStaticFileInSql: screenname == fernando.fernandes@automattic.com, relpath == appPrefs.json
09:01:36 PM - getStaticFileInSql: screenname == fernando.fernandes@automattic.com, relpath == appPrefs.json
09:01:39 PM - addToRiverCache: cachekey == category:dave/All
09:01:39 PM - getRiverFromCategory: cachekey == category:dave/All, 4.154 secs.
09:01:58 PM - logSqlCalls: 9:01:58 PM, ctsecs == 0.015, options.result.length == 1093.
09:01:58 PM - logSqlCalls: sqltext == select * from subscriptions where listname='dave' and categories like '%,All,%';
09:02:02 PM - addToRiverCache: cachekey == category:dave/All
09:02:02 PM - getRiverFromCategory: cachekey == category:dave/All, 4.06 secs.
09:02:02 PM - clearCachedRivers: deleting cache for the river whose key is category:dave/All
09:02:03 PM - checkOneFeed: feedUrl == https://tommycollison.substack.com/feed
09:02:03 PM - clearCachedRivers: deleting cache for the river whose key is category:dave/All
9:02:07 PM GET a8c.feedland.org:80 /getriverfromcategory https://a8c.feedland.org/?river=true&screenname=dave ip6-localhost
09:02:07 PM -logSqlCalls: 9:02:07 PM, ctsecs == 0.011, options.result.length == 1093.
09:02:07 PM - logSqlCalls: sqltext == select * from subscriptions where listname='dave' and categories like '%,All,%';
09:02:11 PM - addToRiverCache: cachekey == category:dave/All
09:02:11 PM - getRiverFromCategory: cachekey == category:dave/All, 4.017 secs.
09:02:18 PM - checkOneFeed: feedUrl == https://www.redreporter.com/rss/index.xml
09:02:19 PM - notifySocketSubscribersFromSql: idLastNewItem == 268280
09:02:19 PM - clearCachedRivers: deleting cache for the river whose key is category:dave/All
Note that you can see that live if you check the VIP dashboard. So by having a quick look, indeed something seems to be triggered the cache clear. Shouldn't that be cleared only when whenCreated
> ctSecsLifeRiverCache
? It's currently 300 secs and by checking the timestamp for each log message this is happening way before that timeframe.
Timing out is one of two reasons a river is removed from the cache.
The other reason is that a URL that's in the river has updated. I can explain what that means if it's not clear.
And because the call was from clearCachedRivers and not clearOldCachedRivers we can infer this was because a feed updated, so let's add another console.log that tells us what URL of the feed that it thinks updated.
It makes sense the problem would be in this area, possibly, because I just added a new mechanism for that to be called.
I'm sorry I removed the console.log that spews out the ID of the item that caused the update. I'll take a look at that too.
New versions of feedland, feedlanddatabase 0.5.99 and 0.6.27.
One change -- clearCachedRivers now also reports the feedUrl that caused the river to be removed from the cache.
@fmfernandes -- are we going to pick this up tomorrow or should i hang around? i'm happy to go watch the news and have a beer. ;-)
no matter what i'll get the emails..
Hey @scripting, I'll be around for another hour or so. I've just updated our instances again! Deploying now.
Ok, I’ll watch for more data.
New logs:
10:19:11 PM GET a8c.feedland.org:80 /sendprefs https://a8c.feedland.org/?river=true&screenname=dave ::ffff:127.0.0.1
10:19:11 PM - 10:19:11 PM GET a8c.feedland.org:80 /getriverfromcategory https://a8c.feedland.org/?river=true&screenname=dave ip6-localhost
10:19:11 PM - getRiverFromCategory (serving from cache): cachekey == category:dave/All
10:19:14 PM GET a8c.feedland.org:80 /getriverfromcategory https://a8c.feedland.org/?river=true&screenname=dave ::ffff:127.0.0.1
10:19:14 PM - logSqlCalls: 10:19:14 PM, ctsecs == 0.014, options.result.length == 1093.
10:19:14 PM - logSqlCalls: sqltext == select * from subscriptions where listname='dave' and categories like '%,All,%';
10:19:15 PM POST a8c.feedland.org:80 /publishfile https://a8c.feedland.org/?river=true&screenname=dave ::ffff:127.0.0.1
10:19:15 PM - publishStaticFileInSql: screenname == fernando.fernandes@automattic.com, relpath == appPrefs.json
10:19:15 PM - getStaticFileInSql: screenname == fernando.fernandes@automattic.com, relpath == appPrefs.json
10:19:18 PM - addToRiverCache: cachekey == category:dave/All
10:19:18 PM - getRiverFromCategory: cachekey == category:dave/All, 4.387 secs.
10:19:19 PM - 10:19:19 PM GET a8c.feedland.org:80 /getfeed https://a8c.feedland.org/?river=true&screenname=dave ::ffff:127.0.0.1
10:19:19 PM GET a8c.feedland.org:80 /getfeed https://a8c.feedland.org/?river=true&screenname=dave ip6-localhost
10:19:23 PM - checkOneFeed: feedUrl == https://inphotos.org/feed
10:19:25 PM - checkOneFeed: feedUrl == https://elizasj.substack.com/feed
10:19:38 PM - checkOneFeed: feedUrl == http://www.nytimes.com/svc/collections/v1/publish/www.nytimes.com/column/frank-bruni/rss.xml
10:19:40 PM - checkOneFeed: feedUrl == https://courtneyscott.tv/feed
10:19:53 PM - checkOneFeed: feedUrl == http://feeds.feedburner.com/PlacesJournal
10:19:55 PM - checkOneFeed: feedUrl == http://feeds.feedburner.com/AndroidPolice
10:19:55 PM - clearCachedRivers: deleting cache for the river whose key is category:dave/All. feedUrl == http://feeds.feedburner.com/AndroidPolice
10:19:55 PM - Samsung could debut a game streaming service exclusively for Galaxy devices
10:19:55 PM - Your phone will sound an alarm on Wednesday, but there’s no need to panic
10:19:55 PM - WhatsApp is picking up some advanced text formatting options
10:19:55 PM - Best OnePlus 10T screen protectors in 2023
10:19:55 PM - Create your own maps with the latest Google Earth update
10:19:55 PM - Upgrade your car to wireless Android Auto for less with this discounted adapter
10:19:57 PM - notifySocketSubscribersFromSql: idLastNewItem == 268493
10:19:58 PM - notifySocketSubscribersFromSql: idLastNewItem == 268493
10:19:58 PM - clearCachedRivers: deleting cache for the river whose key is category:dave/All. feedUrl == http://feeds.feedburner.com/AndroidPolice
Looks like it was triggered by a feed update indeed
Thanks for the report. I’ll think about next steps.
There is information here. I will have to look at more of the log. Both times it was Android Police. It must be rssCloud pinging us because we wouldn't be checking it that often.
Maybe I'll try unsubbing from Android Police and see if that gets rid of this issue.
Guess what, when I unsubbed from Android Police the problem went away.
Try reloading this page over and over. It's obviously loading from the cache.
Something else i want to look in in the morning.
clearCachedRivers is called from a number of places it probably doesn't need to call from, because now we're doing SQL-based change detection (the same flow that goes to the clients is also going through clearCachedRivers).
At times I wonder if we need to keep the non-chronologic river cache clearing, but we do need it. here's the use-case..
Suppose you just unsub from a feed that dumped a shitload of garbage into your river, with the intention of getting it out of your way. So you go unsub, and because we haven't thrown out the cache, when you look at the river the garbage is still there! It looks like a bug because it is. When you unsub it's immediate.
Less of a problem for subscribing, though we need to clear the river cache for that too.
It's so weird that this system behaves so differently from feedland.org. I would have thought we were already past that.
See you tomorrow! :-)
Good morning. I'm feeling pretty sick today, it's a problem i've been seeing the doctor about, and this morning it was pretty bad, so I had to take care of my health first. But i'm feeling better now, and i'm going to try to implement the fix i described in the previous comment.
I'll post notes here as it progresses.
Okay so if config.flUseSqlForSockets is true, that means we're only sending updates to the users from the SQL method of locating changes. In fact we're still sending updates from there and from the scanning we do and responses to rssCloud calls.
At first I thought -- who cares if you get notified twice of an update, which will happen sometimes, how often it happens depends on how many instances are running.
And it does matter, as we're finding here. Maybe. We still don't understand why the updates are coming for the feeds they're coming for.
But I'm going to straighten this out now and then we'll see where we're at.
New versions of feedland and feedlanddatabase versions 0.5.100 and 0.6.28.
I've brought the feedland repo up to date if you want to review the very small number of changes, and i'm happy to explain.
https://github.com/scripting/feedland/blob/main/worknotes.md
https://github.com/scripting/feedland/blob/main/database/worknotes.md
maybe jump on a quick voice chat to talk this over?
Just sent a voicemail explaining, over on the slack channel. Thought I should include it here too make it easier to find in the future. (But it turns out they don't support MP3.)
That's fine! A link to the message will work. Did you test checking your river's All tab?
Yes, the performance is perfect. But what i want to know is, when you read the logs, do you see the cache getting tossed when it makes no sense??
I had a thought for optimization later -- like a year from now..
We can send a message to the client that we tossed the river from the cache, to which it can send the message back "hey the user has this river on their screen, so why don't you recalc it and put it back in the cache."
If it isn't on their screen, the client ignores the message.
That way even if the cache got tossed, the user never sees the delay.
But what i want to know is, when you read the logs, do you see the cache getting tossed when it makes no sense??
I've been monitoring it since the update and everything seems fine.
Excellent. I think we're starting to define our roles.
You are responsible for the server, I'm responsible for the software.
BTW, I'm working on the FeedLand docs. I'll need people to carefully read them, look for problems. Report issues. We treat the docs like software.
I think this is going to be an open issue for some time. Thinking about new strategies to improve performance.
Here's an anecdote. I still have my iPad configured to use feedland.org. I am depressed that there usually is no delay loading the All tab there. I have a theory that the reason is that it is also part of news.scripting.com and it's getting enough traffic now that someone else is taking the hit. The probability that I am the first one to read that tab after the cache is cleared is much smaller than it is on A8C.feedland.org.
I'm still getting a lot of 4-second loads on the All tab on A8C.
This is leading me to think of different approaches here.
Maybe -- turn off all url-based cache clears and only do it on time, and increase the time-to-live to something like 10 or 15 minutes.
Maybe -- optimize it for paying customers by regenerating the page immediately after clearing the cache, so that if the user goes there it's more likely the page will already be loaded. Basically emulating the anecdote above.
If we did either of the previous, there would have to be an exception for unsubs. When you unsub you want that to be seen immediately with or without a delay. So that message has to get through. Easy enough technically, but when reading the code, it'll require a lot of explanation with links to comments like this one.
Just some thoughts...
@fmfernandes -- new versions of feedland and feedlanddatabase, versions 0.5.98 and 0.6.26 respectively.
I added console.log's in several places:
This will give us an idea of where we might look next for problems.
I have an idea of a way to get more info out of the system, by adding an API call that returns a structure representing what's in the cache at any moment.
But let's see what we get from this.