eveseat / seat

🌀✳️ SeAT: A Simple, EVE Online API Tool and Corporation Manager
https://eveseat.github.io/docs/
GNU General Public License v2.0

Queue Management - working Jobs issue #42

Closed apfelqoo closed 7 years ago

apfelqoo commented 8 years ago

wj-issue

Is there any way to stop/remove those jobs from the queue?

leonjza commented 8 years ago

php artisan seat:cache:clear.

Can you try see if there are any exceptions in the logfiles from about 2 days ago that can help me debug this?

Maarten28 commented 8 years ago

I'm having similar issues where jobs are running for 14+h now, while others do finish. The only thing which I came across in the workers log is the following error:

[PDOException] SQLSTATE[HY000] [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)

When I ran the code above it does not show any working jobs anymore. It therefore seems to be more an issue of caching?

leonjza commented 8 years ago

@Maarten28 That error only means your MySQL Server was down. As for the command, it simply clears all jobs. Not exactly a caching thing.

Maarten28 commented 8 years ago

In that case, I'm currently running the app with about 400 APIs, of which 1 is a corp API. Before, I was running the beta version of SeAT, which did not have any issues on this front. At the moment my best guess (compared to the beta) is that the API calls are taking longer than before. One difference compared to this bug is that my APIs are updating through various stages and the last updated time is being updated to something more recent, so it might be a different bug from the one described above.

leonjza commented 8 years ago

A bit of a vague description. I need one of the keys that were 'stuck'. Feel free to pm me on slack/irc with the key & vcode.

Maarten28 commented 8 years ago

At the moment I do not have such a key at hand, since I just deleted my working jobs queue. However, since the whole queue got stuck, I would say at least 50 keys (since I tested whether the issue persisted with 50 instead of 20 workers) have this issue. I'll inform you when I have a key I can share.

leonjza commented 8 years ago

I finally had a few that got 'stuck' an hour or so ago. I had a connectivity related problem on the server, so it seems like a case where the job that times out is not reaped.

ghost commented 8 years ago

Having this issue also with a corporation key, all account keys process through fine, corporation key seems to hang on Processing: ContractsItems every time, 15 hours now and it is still on that. Tried clearing the cache and restarting, no joy.

leonjza commented 8 years ago

Could you guys drop the --tries 10 value to --tries 1 in your supervisord.conf, flush caches with php artisan cache:clear && php artisan seat:cache:clear and report back after a while?
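For readers wondering why the --tries value matters here, the following is a minimal sketch, assuming that --tries is the number of attempts a queue worker makes before giving up on a job. The names `run_with_tries` and `always_times_out` are hypothetical and are not part of SeAT or Laravel.

```python
def run_with_tries(job, tries):
    """Call `job` up to `tries` times; return (succeeded, attempts_made)."""
    for attempt in range(1, tries + 1):
        try:
            job()
            return True, attempt
        except Exception:
            # A real worker would put the job back on the queue here.
            continue
    return False, tries


def always_times_out():
    # Hypothetical stand-in for an API call that never gets a usable response.
    raise TimeoutError("no response from API server")


print(run_with_tries(always_times_out, 10))  # ten attempts before giving up
print(run_with_tries(always_times_out, 1))   # one attempt, then the job fails
```

Under that assumption, a job whose every attempt waits for a timeout occupies a worker roughly ten times longer with --tries 10 than with --tries 1.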

ghost commented 8 years ago

I had a continued issue after dropping --tries 10 down to --tries 1. Seems to have resolved itself now though, I flushed caches, updated to latest version (Forgot to do that before but done now), ran keys update and had 4 stuck, cleared caches again, updated supervisor to go from 1 worker to 4, ran a keys update and none stuck. No idea if anything I did resolved it or if it just decided to play nice, my corporation key has also finally managed to get all the way through, that was the main one that got stuck. I will keep an eye on it and let you know if I get any that get stuck again.

leonjza commented 8 years ago

@MythicalDreams be sure to bump this expire value up in your install too: https://github.com/eveseat/seat/blob/master/config/queue.php#L73

warlof commented 8 years ago

@leonjza it could be related to an unreachable CCP api server. I got this issue on a fresh install too and got "unreachable api.eveonline.com" in logs.

I tried pinging the server from multiple locations and all packets were lost.

leonjza commented 8 years ago

Yes, I have realized that the server may respond with a 403, then the process gets stuck too.

Hightech1011 commented 8 years ago

I am running into the same issue. I have done a fresh install. I followed the CentOS instructions at https://github.com/eveseat/seat/wiki/CentOS-6.x-Installation and made modifications to fit the platform I have SeAT running on.

Details:

Hosted VPS running CentOS 6.5
Apache Version 2.2.31
PHP 5.5.32 (cli) (built: Feb 11 2016 00:08:12)
mysql Ver 14.14 Distrib 5.6.29, for Linux (x86_64) using EditLine wrapper
cPanel Version 54.0 (build 14)

All of the install process was done as root. Usernames were changed for the platform. The web interface works correctly and accepts APIs, but it is unable to pull the APIs due to hung working jobs.

As requested by warlof, I have provided:

php seat_path/artisan seat:admin:diagnose - Results

iptables -L (huge, server runs CSF with a large list of banned IPs due to attempted bruteforce attempts to other hosted sites) - https://drive.google.com/file/d/0B98rlAIeHlHmbVJJLW1XWDhkVW8/view?usp=sharing

seat_path/storage/logs/laravel.log content - https://drive.google.com/file/d/0B98rlAIeHlHmZXV5SFZlZXFOakk/view?usp=sharing

supervisor status -

supervisorctl status

seat1 RUNNING pid 15881, uptime 3:25:04

redis status - Results

op6sie commented 8 years ago

I have about 180 API keys and updated SeAT yesterday. I tried Ubuntu and am now on CentOS 7, with MySQL and MariaDB; both run too slowly to clear the queue within an hour. Getting the information from the 180 API keys is not a problem. Are there minimum requirements for the machine?

I used the one-line installer from this site.

leonjza commented 8 years ago

@op6sie I suggest you join slack to discuss your performance question. It is not on topic for this ticket now.

Bingmano commented 8 years ago

I am also experiencing this issue. It seems to happen if an API key is deleted by a user on the EVE side, or if the API servers do not return a hello fast enough. If it cannot resolve the API key, it doesn't produce an error; it just sits in limbo for days. Could we look into adding php artisan seat:cache:clear as a button or have better handling of hangs?

leonjza commented 8 years ago

No, the error occurs when an unexpected HTTP code is returned. A fix is pending.
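To make the failure mode concrete, here is a minimal sketch of a job runner that always ends in a final state, whatever HTTP code comes back. It is not the pending fix and not SeAT's code: `run_job`, the `job` dict, and the `fetch_status_code` callable are hypothetical, and only the status names "Working", "Done" and "Error" are taken from this thread.

```python
def run_job(job, fetch_status_code):
    """Run one update job and always leave it in a final state.

    `job` is a dict with a "status" key; `fetch_status_code` is a hypothetical
    callable that performs the API request and returns the HTTP status code.
    """
    job["status"] = "Working"
    try:
        code = fetch_status_code(job)
    except Exception as exc:
        # Network errors, timeouts and similar: fail the job, do not hang.
        job["status"] = "Error"
        job["output"] = f"request failed: {exc}"
        return job
    if code == 200:
        job["status"] = "Done"
    else:
        # Any code that was not planned for (403, 5xx, ...) also ends the job.
        job["status"] = "Error"
        job["output"] = f"unexpected HTTP {code}"
    return job
```

The point of the sketch is that no branch leaves the job in "Working": an unexpected response such as a 403 is recorded as an error instead of being silently dropped.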

shibdib commented 8 years ago

same issue https://i.imgur.com/ysQlr8g.png

looking forward to a fix

Hightech1011 commented 8 years ago

I have found a quick fix for this when jobs appear to be stuck. In my case, a restart of "supervisord" (I'm running CentOS 6.5) restarts the processing of keys, without having to clear the SeAT cache.

leonjza commented 8 years ago

Pushed some updates for this. Clear cache and monitor how it goes.

Bingmano commented 8 years ago

So far so good, no hanging processes on my end.

shibdib commented 8 years ago

http://i.imgur.com/V1RpjFf.png still getting some

much less than before. (This is a seat install that has about 400 keys)

op6sie commented 8 years ago

My VM server currently has an update problem; once it is restored I'll check.

leonjza commented 8 years ago

I can also see the odd job still getting stuck. But it's no longer blocking on ApiKeyInfo calls that respond HTTP 403 for some reason.

Bingmano commented 8 years ago

I have 2 hangs as well. http://i.imgur.com/JqEX41F.png

Hightech1011 commented 8 years ago

For this issue, can an option be added in the "Status" section to delete specific queued tasks? This would remove the need to clear the caches which resets the total jobs count.

Bingmano commented 8 years ago

> For this issue, can an option be added in the "Status" section to delete specific queued tasks? This would remove the need to clear the caches which resets the total jobs count.

I agree with this; at least add a button to the command list so I don't have to keep logging in to enter the clear cache command.

Nutbolt52 commented 8 years ago

I have this issue as well on my SeAT install. I am on Ubuntu, it's all up to date, and it happens on random keys, but the one that seems to hang first is the server status check. This is frustrating because after about 2 or 3 weeks all 25 keys in my install hang and nothing updates. Let me know what you need (ping me on slack) to help continue troubleshooting this. Thanks! Much appreciated!

Bingmano commented 8 years ago

Is there any movement on this?

warriorsoul15 commented 8 years ago

Honestly seat should fail jobs by x-time and clear them from the queue.


leonjza commented 8 years ago

Could you guys give me some indication on how many jobs you have in the failed table?

select count(*) from failed_jobs;

Output should be something like (yes this is my real failed count):

mysql> select count(*) from failed_jobs;
+----------+
| count(*) |
+----------+
|      339 |
+----------+
1 row in set (0.00 sec)

Bingmano commented 8 years ago

MariaDB [seat]> select count(*) from failed_jobs;
+----------+
| count(*) |
+----------+
|       37 |
+----------+
1 row in set (0.00 sec)

worst spacing ever but it gets the point across

Nutbolt52 commented 8 years ago

246; and I have about 15 api keys in SeAT, and running for ~1 to 2 months

leonjza commented 8 years ago

After a lot of debugging, I pushed eveseat/eveapi@bd3e928 to attempt to fix the 'stuck jobs' issue. If the tests pass and it works ok in my staging env then I'll tag a release.

leonjza commented 8 years ago

Changes tagged and released. Make sure you update to eveseat/eveapi@1.0.15 (using the upgrade script here) and provide feedback.

leonjza commented 8 years ago

Also make sure you have run php artisan seat:cache:clear at least once after the upgrade.

op6sie commented 8 years ago

Just updated to 1.0.15. Before the update I had 61 failed jobs on 35 API keys and about 35k jobs.

I'll let you know if the fail count goes up (without me doing it wrong). php artisan seat:cache:clear executed.

Bingmano commented 8 years ago

Updated to 1.0.15 and am now experiencing the following issue (I did clear the cache a day ago).

Job ID 0WRXRFFaHbpY4Hicy8ksjM62RBpvIFvK
Api Server
Scope Server
Status Error

An unknown failure in Seat\Eveapi\Jobs\UpdatePublic occured.

leonjza commented 8 years ago

> Updated to 1.0.15 and am now experiencing the following issue (I did clear the cache a day ago).

Yay. And the job is not stuck?

Bingmano commented 8 years ago

nope, there's a failure and also a stuck job :(


op6sie commented 8 years ago

No, I have tested for a few days (11513 jobs so far) with 4 stuck jobs.

I use this to clear up the queued jobs, but the query you gave earlier still shows failed jobs.

update seat.job_trackings set status = 'Done' where status != 'Done' and status != 'Working';
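The effect of that statement can be checked on a toy database. The sketch below uses SQLite in place of MySQL, drops the `seat.` schema prefix, and assumes a simplified two-column `job_trackings` table and a stand-in `failed_jobs` table; the real SeAT schemas are not shown in this thread.

```python
import sqlite3

# In-memory stand-ins for the two tables mentioned in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_trackings (job_id TEXT, status TEXT)")
conn.execute("CREATE TABLE failed_jobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO job_trackings VALUES (?, ?)",
    [("a", "Queued"), ("b", "Working"), ("c", "Error"), ("d", "Done")],
)
conn.execute("INSERT INTO failed_jobs (payload) VALUES ('example')")

# The statement from the comment above, minus the `seat.` schema prefix.
conn.execute(
    "UPDATE job_trackings SET status = 'Done' "
    "WHERE status != 'Done' AND status != 'Working'"
)

statuses = dict(conn.execute("SELECT job_id, status FROM job_trackings"))
failed_count = conn.execute("SELECT count(*) FROM failed_jobs").fetchone()[0]
print(statuses)      # rows in 'Working' keep their status
print(failed_count)  # failed_jobs is a separate table and is not touched
```

Two things follow from the statement itself: rows in the 'Working' state are left alone, and the count from failed_jobs does not change because the update only targets job_trackings.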

leonjza commented 8 years ago

Alright. I guess the next logical step is going to be to create a reaper job for these. Really hoped to be able to get to the bottom of this, but I have run out of options now.
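As an illustration of what a reaper could look like, here is a minimal sketch. It is not the SeAT implementation, whose details are not described in this thread: the `reap_stale_jobs` function, the `updated_at` field, and the one-hour threshold are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def reap_stale_jobs(jobs, now, max_age=timedelta(hours=1)):
    """Mark jobs that have sat unfinished for longer than `max_age` as failed.

    Returns the ids of the jobs that were reaped.
    """
    reaped = []
    for job in jobs:
        unfinished = job["status"] in ("Queued", "Working")
        if unfinished and now - job["updated_at"] > max_age:
            job["status"] = "Error"
            job["output"] = "reaped after exceeding max age"
            reaped.append(job["job_id"])
    return reaped


now = datetime(2016, 7, 23, 19, 0)
jobs = [
    {"job_id": "stuck", "status": "Working", "updated_at": now - timedelta(hours=15)},
    {"job_id": "fresh", "status": "Working", "updated_at": now - timedelta(minutes=5)},
    {"job_id": "done", "status": "Done", "updated_at": now - timedelta(days=2)},
]
print(reap_stale_jobs(jobs, now))  # only the job idle for 15 hours is reaped
```

Run on a schedule, a check like this treats the symptom instead of the cause: stuck jobs still occur, but they end up as failures instead of sitting in the queue indefinitely.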

leonjza commented 8 years ago

The queue clearing command is now released. Update using the upgrade script and it should be part of the ecosystem now.

Bingmano commented 8 years ago

Applied the update today but I'm still getting the below error:

Job ID 06NoCFuHscVkU6li3GBlR9FXRWkssH1X
Api Server
Scope Server
Status Error

An unknown failure in Seat\Eveapi\Jobs\UpdatePublic occured. Refer to the logs at 2016-07-23 18:50:06 for more information.

Bingmano commented 8 years ago

Log:

[2016-07-23 18:50:06] local.ERROR: A job failure occured in Seat\Eveapi\Jobs\UpdatePublic. Marking it as failed.
[2016-07-23 18:55:02] local.WARNING: A job for Api Server and owner 0 already exists.
[2016-07-23 19:00:02] local.WARNING: A job for Api Server and owner 0 already exists.

leonjza commented 8 years ago

The last update simply adds a job that will clean up jobs like these every day.

molten360 commented 8 years ago

I just ran into this issue tonight. It can't query Contractitems from one of my corp APIs. I cleared the cache and restarted it, and it hung at the exact same spot.

Ubuntu 14.04 running the latest version of SeAT.

leonjza commented 8 years ago

Poking this for a last time. How are things going? Are jobs that get 'stuck' getting cleared etc?

Bingmano commented 8 years ago

After a certain amount of time it will produce a failed job but seems to no longer get stuck in queued.
