Closed stefanbohacek closed 3 months ago
It looks like you have updated from 7.1.1. The only significant change I can think of is that 7.1.2 introduced improved caching of robots.txt files, which uses sha256 to hash domain names over and over again.
Can you try #157 which uses xxhash instead of sha256 and see if that improves things for you?
(Don't forget to run pip install -r requirements.txt.)
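To illustrate why hashing the same domain names repeatedly can add up, here is a minimal sketch. It is not FediFetcher's actual code: the helper name `robots_cache_key` is hypothetical, and it uses only the standard library's `hashlib.sha256`. Memoizing with `functools.lru_cache` means a given domain string is hashed only once per process. #157 takes a different route and swaps sha256 for xxhash, which is not shown here because it needs a third-party package.

```python
import hashlib
from functools import lru_cache


@lru_cache(maxsize=4096)
def robots_cache_key(domain: str) -> str:
    """Return a stable cache key for a domain's robots.txt (hypothetical helper)."""
    # Normalize so "Example.COM" and "example.com" map to the same key.
    normalized = domain.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    for _ in range(3):
        robots_cache_key("stefanbohacek.online")
    # Only the first call computes the hash; the other two are cache hits.
    print(robots_cache_key.cache_info())
```

Whether memoization or a faster hash helps more depends on how many distinct domains a run touches, which this sketch does not measure.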
@nanos Thank you! I should also add that I'm running FediFetcher as a GitHub Action. Either way, I pulled in the latest changes and will keep an eye on the server throughout the day.
Ah, in that case this won't make any difference whatsoever.
If you are running FediFetcher as GH Action, then what server are those graphs from?
Right. So that's from the actual stefanbohacek.online Mastodon server. I haven't made any changes to the server itself, and the time when the graph started to change matches exactly when I pulled in the update, so I figured maybe the frequency/amount of API calls might have changed? Would it be helpful to share any logs perhaps?
Oh, I see.
That, quite frankly, makes no sense whatsoever: Changes since 7.1.1 have only served to reduce the load on the actual mastodon server, rather than increase it, with FediFetcher increasingly making use of caching to avoid repeated requests for the same thing.
As such I'm tempted to conclude that this increase is very likely unrelated.
Did you see an uptick in Sidekiq jobs?
Did you make any other changes?
Hm. Looking through your GH Actions logs, there are a couple of things that stand out to me:
Your actions run for a very long time.
The replied_toot_server_ids file is empty.
I wonder if these two are connected somehow?
Is it possible your mastodon server is just busier than usual? More notifications maybe?
I sometimes get this when I post something popular, and get dozens of likes: My FediFetcher run time explodes as it's backfilling all those profiles, and that of course has a knock-on effect on the mastodon instance.
Your actions run for a very long time
Good catch! Looking at https://github.com/stefanbohacek/FediFetcher/actions/workflows/get_context.yml?page=17, this strongly hints at a connection to the FediFetcher update.
9:24: 1m 52s
9:35: 1m 50s
9:45: 1m 51s
9:47: merged latest changes
9:55: 1h 54m 43s
10:15: 16m 29s
10:35: 8m 26s
Did you make any other changes?
Nope.
Is it possible your mastodon server is just busier than usual? More notifications maybe?
Not particularly.
Did you see an uptick in Sidekiq jobs?
I'll look into this a bit more. I also paused the workflow for now to see if this has any impact.
Good catch! Looking at https://github.com/stefanbohacek/FediFetcher/actions/workflows/get_context.yml?page=17, this strongly hints at a connection to the FediFetcher update.
9:24: 1m 52s
9:35: 1m 50s
9:45: 1m 51s
9:47: merged latest changes
9:55: 1h 54m 43s
10:15: 16m 29s
10:35: 8m 26s
You can't really compare these: The 09:45 one (and all previous runs) error out, so the run time is simply not the run time of a complete run.
I cannot find any previous successful run in your log, so I cannot find a comparison.
I must admit that at the moment I cannot think of any recent changes that could've caused this.
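For anyone who wants to compare run times numerically, here is a minimal sketch that converts the durations quoted in this thread to seconds. The helper name `duration_to_seconds` is hypothetical. The durations alone do not say whether a run completed, so each run's status would still have to be read off the Actions page before drawing any conclusion.

```python
import re

# Run times quoted in this thread, as (start time, duration) pairs.
RUNS = [
    ("9:24", "1m 52s"),
    ("9:35", "1m 50s"),
    ("9:45", "1m 51s"),
    ("9:55", "1h 54m 43s"),
    ("10:15", "16m 29s"),
    ("10:35", "8m 26s"),
]

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def duration_to_seconds(text: str) -> int:
    """Convert a string such as '1h 54m 43s' to a number of seconds."""
    parts = re.findall(r"(\d+)\s*([hms])", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(value) * UNIT_SECONDS[unit] for value, unit in parts)


if __name__ == "__main__":
    for start, duration in RUNS:
        print(f"{start:>5}  {duration:>11}  {duration_to_seconds(duration):>5}s")
```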
The 09:45 one (and all previous runs) error out
Ah, sorry, I missed that.
Either way, still keeping an eye on the server, but so far things seem to be calming down after pausing the workflow.
I'll try to poke around the logs a bit more and see if there's anything else going on.
Yeah, it doesn’t surprise me to see that it’s calmed down. With backfilling that many posts it’ll have an impact.
I’m just not sure why it would backfill that much. At a cursory glance I couldn’t see any obvious duplication, hence asking earlier whether your server was just very busy.
You could try to turn off from-notifications, backfill-with-context, and/or backfill-mentioned-users. These three have by far the biggest impact (particularly backfill-with-context), but obviously turning them off also means you’re missing out on some functionality (see the readme for a description of each option). Up to you to decide whether it’s worth doing.
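As a rough sketch of what turning these off could look like in a config.json-style dictionary: the helper name `disable_heavy_options` and the starting values are hypothetical, and using 0 to disable an option is an assumption based on the config posted later in this thread. The readme remains the authority on each option's accepted values.

```python
import json

# Option names as mentioned in this thread; see the FediFetcher readme for details.
HEAVY_OPTIONS = (
    "from-notifications",
    "backfill-with-context",
    "backfill-mentioned-users",
)


def disable_heavy_options(config: dict) -> dict:
    """Return a copy of config with the three heaviest options set to 0."""
    updated = dict(config)
    for option in HEAVY_OPTIONS:
        # Assumption: 0 disables the option, as in the config shared in this thread.
        updated[option] = 0
    return updated


if __name__ == "__main__":
    # Hypothetical starting values, for illustration only.
    example = {
        "server": "stefanbohacek.online",
        "backfill-with-context": 1,
        "from-notifications": 1,
    }
    print(json.dumps(disable_heavy_options(example), indent=2))
```

The function returns a copy, so the original settings stay available if the lost functionality turns out to matter more than the load.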
I think I'm going to close this ticket for now. You confirmed that there were no issues with the update, and gave me a few pointers to look at.
And looking at the past FediFetcher runs, I am wondering if it's the other way around: maybe this has always been an issue, and the recent update fixed whatever was causing the workflow to error out.
I will go over the settings and see if I can tweak them better. Thank you, I appreciate all your help!
OK, let me know if you need any more guidance at any time, or if you find out anything that would be helpful to share, please 👍
Just to follow up on this, adding a swap partition helped me a lot. Also, my updated settings:
{
  "server": "stefanbohacek.online",
  "home-timeline-length": 20,
  "backfill-with-context": 0,
  "max-followings": 80,
  "max-followers": 80,
  "max-follow-requests": 80,
  "max-bookmarks": 80,
  "from-notifications": 0
}
It might've been my home timeline taking up too much time to crawl. I'm switching to a more targeted approach with favorites and bookmarks.
Either way, things are looking good now!
After a recent update (https://github.com/stefanbohacek/FediFetcher/commit/df484696d98626295c8569e8804ad2b597a41da3), the CPU and bandwidth usage of my Mastodon server have increased, see attached screenshot from DigitalOcean.
Here's my config.json file:
I have not made any changes other than fetching the latest code as shown above.
The droplet has 2 virtual CPUs, 4 GB RAM, and a 50 GB SSD.