chenerlich closed this issue 6 years ago
Yeah, this seems to be Mongo & Celery not getting along well and deadlocking at some point, especially on smaller servers. What kind of specs are you running Yeti on?
For the time being, you can "unlock" feeds using this mongo script:
```js
use yeti;
db.schedule_entry.update({lock: true}, {$set: {lock: false, status: "Unlocked..."}}, {multi: true});
```
@tomchop 4 CPUs, 16 GB mem
What should I upgrade?
I guess more CPUs / cores couldn't hurt. Can you show me how the services are started? The default is to start 8 processes for feeds and 10 for analytics, so you have 18+ processes hitting MongoDB simultaneously on a 4-CPU machine.
You could also try decreasing the number of workers in each service (check the systemd scripts) to something better suited to your instance.
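For context, the worker count is just the Celery concurrency setting in whatever unit file starts the service. A hypothetical snippet is below; the unit name, paths, app module, and queue name are illustrative, not necessarily what Yeti's actual systemd scripts contain:

```ini
# /etc/systemd/system/yeti_feeds.service (illustrative fragment)
[Service]
# Lower --concurrency from the default (8 for feeds) to match your CPU count,
# e.g. 4 on a 4-CPU machine, then: systemctl daemon-reload && systemctl restart yeti_feeds
ExecStart=/usr/bin/celery worker -A core.config.celeryctl -Q feeds --concurrency=4
```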
@tomchop Changed specs to 8 CPUs, 32 GB RAM. It looks like this:
Is the mongo script still relevant?
TLDR: Yes, I would: a) stop all feeds, b) run the script, c) relaunch the feeds.
The longer answer is that we store the state of each feed in the db so that the same feed is not launched a second time while it's already running. If workers deadlock, the feed stays marked as "running" in the database even though it isn't actually running. Worst case, it keeps a whole worker busy; best case, it never runs again until it is unlocked. This is a bit hard to reproduce, but after the time we've spent on this, we're pretty sure it comes from Celery and Mongo not working together nicely.
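The locking pattern described above can be sketched as follows. This is a minimal illustration, not Yeti's actual code: an in-memory dict stands in for the `schedule_entry` collection, and the function names (`acquire_lock`, `release_lock`) are hypothetical.

```python
# In-memory stand-in for the schedule_entry collection.
schedule_entry = {"FeedX": {"lock": False, "status": "OK"}}

def acquire_lock(name):
    """Mark a feed as running; refuse if it is already locked."""
    entry = schedule_entry[name]
    if entry["lock"]:
        return False  # feed is (apparently) still running, skip this launch
    entry["lock"] = True
    entry["status"] = "Running"
    return True

def release_lock(name):
    """Mark the feed as idle again. If the worker deadlocks before this
    runs, the entry stays locked forever -- which is exactly the state
    the unlock script above repairs."""
    schedule_entry[name]["lock"] = False
    schedule_entry[name]["status"] = "OK"

# A second launch attempt while the feed is locked is refused:
assert acquire_lock("FeedX") is True
assert acquire_lock("FeedX") is False
release_lock("FeedX")
assert acquire_lock("FeedX") is True
```

In the real system the check-and-set happens against MongoDB rather than a dict, which is where the Celery/Mongo interaction can leave the lock stuck.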
Worked great. Thanks!
Perfect! I've closed this issue!
thanks for this, I ran into the same issue
Description
I integrated the platform about a month ago, and now 99% of the updated data shown in "Browse" is phishing. When I look at the dataflows table, I see that while some feeds get updated, others don't. I restarted the system/workers/schedulers a few times, and pulled the repository to its latest state and ran "syncdb". No change for those dataflows. Attached is a screenshot for a few of them:
(Today is 28/08/2018...)