scripting / feedlandDev

A place for discussions among people working on the development of FeedLand
MIT License

Going through "all" category feed by feed #4

Open scripting opened 4 months ago

scripting commented 4 months ago

Another way of testing this.

So far only my All category is really slow. The goal here is to find out what's special about that category.

  1. Logged on as davewiner, I exported the All category to an OPML file.
  2. I logged off and signed on as bullmancuso.
  3. Imported the OPML file from step 1. There were 938 feeds in the list. There were a fair number of bad feeds, as you'd expect.

Notes on importing subscriptions

The timeouts are too long. We could detect a bad feed with a timeout of say 3-5 seconds. If a server hasn't responded in that time, it's almost certainly not going to respond after a minute (or whatever the timeout is).

We were able to subscribe to some feeds, but there were errors trying to set the categories. The message says we aren't subscribed, which implies there was an error subscribing that wasn't reported.

We report at the end that we've subscribed to 938 feeds, but the truth is we tried to subscribe to them. The actual number should reflect errors.

A couple of the subscriptions were the result of bad redirects in transitions done on the A8C server. Not placing blame, mistakes always happen, but I think these could have been avoided with more careful testing. Water under the bridge.

I remember writing a "lint" type app that reads an OPML file and eliminates feeds that can't be reached. I will find it and link it into the home page of this repo. I'm on a hunt for those things. This site is the place for tools for technical people, and serious work on and debugging of the product.

The feeds we couldn't subscribe to

https://feedland.com/feeds/davewiner.xml
https://openinternetalliance.net/feed
https://walkouttovote.org/feed
https://shop.tumblr.com/feed
http://quiet.com/feed
https://zephyrstrategy.com/feed
https://www.patreon.com/blog/feed
http://rssfeeds.usatoday.com/UsatodaycomMovies-TopStories
https://d-s.sh/index.xml
https://a8c.feedland.org/feeds/davewiner.xml
https://perpendicular.blog/feed
https://feedland.com/feeds/davewiner.xml
http://partner.mlb.com/partnerxml/gen/news/rss/hou.xml
http://partner.mlb.com/partnerxml/gen/news/rss/kc.xml
https://dougbelshaw.com/blog/feed
https://dlcid.com/feed
https://espd55.com/feed
https://kamal.blog/feed
http://shaylachristine.blog/feed
https://openinternetalliance.net/feed

The feeds we couldn't set categories on

http://www.chicagotribune.com/sports/baseball/cubs/rss2.0.xml
https://www.chicagotribune.com/arc/outboundfeeds/rss/section/opinion/
http://rssfeeds.azcentral.com/phoenix/diamondbacks&x=1

The result

When I looked at the All category from the imported feeds this was the result.

displayTraditionalRiver: getRiver took 18.273 secs.

So the problem appears here as well.

Next step

I'm going to write a script to run in feedlandHome that goes through the list of subs for bullmancuso, and for each feed:

  1. Unsub from the feed.
  2. Render the river for the All category.
  3. Note how long it took.

The expectation is that at some point we will unsub from a feed and the rendering will take substantially less than 18 seconds. At that point we will have a feed to look at, assuming it isn't the very last feed that does this. :-)

scripting commented 4 months ago

I've got the test script written and running. It's going to keep unsubbing and rebuilding the river for the All category, stopping when that takes less than 15 seconds.

function unsubTestOne (callback) {
    var testResults = new Array ();
    if (localStorage.testResults !== undefined) {
        testResults = JSON.parse (localStorage.testResults);
        }
    function getRandomFeed () {
        const ix = random (0, globals.userSubscriptions.length - 1);
        const theFeed = globals.userSubscriptions [ix];
        const feedUrl = theFeed.feedUrl;
        globals.userSubscriptions.splice (ix, 1);
        return (feedUrl);
        }
    function saveTestResults () {
        localStorage.testResults = JSON.stringify (testResults);
        }
    const feedUrl = getRandomFeed ();
    console.log ("unsubTestOne: feedUrl == " + feedUrl);
    unsubscribe (feedUrl, function (err, data) {
        if (err) {
            console.log (err.message);
            testResults.push ({feedUrl, err});
            saveTestResults ();
            callback (true); //keep going
            }
        else {
            const riverSpec = {catname: "All", screenname: "bullmancuso"}, whenstart = new Date ();
            console.log ("unsubbed from feed, calling getRiver");
            getRiver (riverSpec, "bullmancuso", function (err, theRiver) {
                if (err) {
                    console.log (err.message);
                    testResults.push ({feedUrl, err});
                    saveTestResults ();
                    callback (true); //keep going
                    }
                else {
                    const ctsecs = secondsSince (whenstart);
                    console.log (ctsecs + " secs.");
                    testResults.push ({feedUrl, ctsecs});
                    saveTestResults ();
                    callback (ctsecs > 15); //if things got faster we want to stop
                    }
                });
            }
        });
    }
function unsubTestLoop () {
    console.log ("unsubTestLoop");
    function doNext () {
        unsubTestOne (function (flKeepGoing) {
            if (flKeepGoing) {
                if (globals.userSubscriptions.length > 0) {
                    doNext ();
                    }
                }
            });
        }
    doNext ();
    }
scripting commented 4 months ago

Watching the script run is really fascinating. I realize that rather than limiting the number of items in a river, which felt like going the wrong direction, it probably could work to limit the number of feeds in a category -- if that turns out to be the issue.

scripting commented 4 months ago

Another comment, reading the script you can see what a disaster JavaScript is. It's the most ridiculous language ever. You should be able to write loops the way you think of them and let the freaking interpreter figure out how to turn it into the mess you have to write. And yes I know about await and promises, and they don't help because they don't belong in my code. That I'm waiting for the result of a function I call is the normal thing, you should have to bend over backwards to start a new thread, which actually is almost never what you want.

scripting commented 4 months ago

BTW, one thing that's pretty clear is that it gradually gets faster as the number of feeds goes down. Sort of what you'd expect. So maybe there isn't a magic feed. Right now, with still 800 or so unsubs to go, it's doing the river in 14 secs.

I'm keeping a log of course. ;-)

scripting commented 4 months ago

Here's a link to the subscription list cleanup app.

https://github.com/scripting/reallysimple/tree/main/demos/subscriptionListCleanup

I had already made an attempt to organize these utilities in the reallySimple repo, no need to reproduce it here.

scripting commented 4 months ago

Well I got the result I was looking for. Right after cmdrtaco, with martynlawrencebullard.com. I'll be back in a bit after some checking.

3:24:05 PM: 3.986 secs, feedUrl == http://www.theguardian.com/travel/usa/rss, 200 subs remain.
3:24:09 PM: 3.863 secs, feedUrl == http://cmdrtaco.blog/feed, 199 subs remain.
3:24:09 PM: 0.154 secs, feedUrl == https://martynlawrencebullard.com/feed, 198 subs remain.
3:24:10 PM: 0.139 secs, feedUrl == https://www.techrepublic.com/rssfeeds/articles/, 197 subs remain.
3:24:10 PM: 0.157 secs, feedUrl == http://www.theguardian.com/world/americas/rss, 196 subs remain.
3:24:10 PM: 0.153 secs, feedUrl == https://indieweb.org/this-week/feed.xml, 195 subs remain.
3:24:10 PM: 0.141 secs, feedUrl == https://c19foundation.org/feed, 194 subs remain.
3:24:11 PM: 0.146 secs, feedUrl == http://www.theguardian.com/business/diversity-and-equality/rss, 193 subs remain.
scripting commented 4 months ago

I unsubbed from both cmdrtaco and martynlawrencebullard on davewiner/all and got the same 18 second result.

I wonder if the magic is with the number of feeds in this category.

It's possible I'm the only one with a category with more than 200 feeds in it (no magic to 200, but that is near where it got unchained). The only person with more subs than me is Chuck. I wonder if Chuck has a category he displays with as many feeds as my All category.

A lot of these feeds haven't updated in a very long time. It's possible we could just put a limit on the number of feeds in a category.

scripting commented 4 months ago

Here are the test results, as a JSON array in chronological order.

How to read it. The ctsecs value is the number of seconds it took to render the All category river after we unsubbed from the feedUrl.

fmfernandes commented 4 months ago

Hey @scripting, interesting write-up! I'm at least somewhat reassured that the issue isn't a bad feed.

it probably could work to limit the number of feeds in a category

I agree!

wonder if Chuck has a category he displays with as many feeds as my All category.

I can check and report in a while. But looking at Chuck's river, every tab loads near-instantly.

Have you also considered lowering the number of columns we select from items to build the river? Or do we really need SELECT *?

scripting commented 4 months ago

@fmfernandes -- we don't need select *.

i'll take a look.