chrisboulton / php-resque

PHP port of resque (Workers and Queueing)
MIT License

Multiple Workers = Lost jobs? #32

Closed roynasser closed 12 years ago

roynasser commented 12 years ago

As per a comment on issue #30, I'm opening this issue in order to help gather more info and find a fix for the bug.

I observed that when deploying multiple workers, some jobs don't get processed...

Basically the test I did was:

Watch the output and the number of jobs, launch 5 workers, then enqueue 10 jobs. The number of jobs in Redis goes up to 10 (so 10 jobs are queued), but only a few actually get processed. Nonetheless, they all get pulled out of the queue...

So basically it is "losing" jobs.

I will try and test more tomorrow, time permitting, and post back.

Chris said he had already noticed this, so if anyone else has noticed and/or fixed this, please let us know.

Imo this is crucial for any production environment, no? Is anyone using php-resque in a production environment? Are you losing jobs? Maybe you haven't noticed? Can you please check and report back?

thanks!

andrewjshults commented 12 years ago

We used php-resque in production at frid.ge and were running multiple workers, and I can't remember running into a lost job issue. Our primary use was for sending notifications so it's quite possible that we did end up losing some jobs, but never as significantly as you seem to be seeing (I'm forgetting exact numbers but iirc we were pushing about 1MM jobs/month). We shut everything down after our acquisition in July so, unfortunately, I can't check the logs any more.

Technically, resque is not a fully safe queue (according to Chris Wanstrath, the developer of the Ruby version - https://github.com/defunkt/resque/issues/93 ). It would be possible to use RPOPLPUSH (http://redis.io/commands/rpoplpush) to make it fully safe with some relatively external reworking (workers would need names/UUIDs to PUSH/POP into/from unique lists, since RPOPLPUSH operates on lists, and there would need to be some interface to move failed jobs back into the main queue). In the short term, this could be a good way to investigate which worker is popping off the missing jobs (just PUSH everything into unique lists based on the worker and don't worry about cleaning up the processed lists).
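
The RPOPLPUSH idea described above can be sketched as follows. This is only an illustration under stated assumptions, not php-resque code: it is written in Python with a small in-memory stand-in (the hypothetical FakeRedis class) so it runs without a Redis server, and the key layout "<queue>:processing:<worker id>" and the helper names reserve, acknowledge and requeue_unfinished are made up for the example.

```python
from collections import deque


class FakeRedis:
    """In-memory stand-in for the few Redis list commands used below."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        self.lists.setdefault(key, deque()).appendleft(value)

    def rpoplpush(self, source, destination):
        # Real Redis does this atomically: pop the tail of `source`,
        # push it onto the head of `destination`, return the element.
        src = self.lists.get(source)
        if not src:
            return None
        value = src.pop()
        self.lists.setdefault(destination, deque()).appendleft(value)
        return value

    def lrem(self, key, count, value):
        # Simplified LREM: only handles count > 0 (remove from the head).
        items = self.lists.get(key, deque())
        removed = 0
        while removed < count and value in items:
            items.remove(value)
            removed += 1
        return removed

    def llen(self, key):
        return len(self.lists.get(key, ()))


def reserve(redis, queue, worker_id):
    """Move one job from the shared queue into this worker's own list."""
    return redis.rpoplpush(queue, f"{queue}:processing:{worker_id}")


def acknowledge(redis, queue, worker_id, job):
    """Drop a job from the worker's list once it has been fully processed."""
    redis.lrem(f"{queue}:processing:{worker_id}", 1, job)


def requeue_unfinished(redis, queue, worker_id):
    """Move jobs left behind by a dead worker back onto the shared queue."""
    processing = f"{queue}:processing:{worker_id}"
    moved = 0
    while redis.rpoplpush(processing, queue) is not None:
        moved += 1
    return moved


redis = FakeRedis()
redis.lpush("queue:email", "job-1")
redis.lpush("queue:email", "job-2")

job = reserve(redis, "queue:email", "worker-a")
# ... worker-a dies here, before it can call acknowledge() ...
recovered = requeue_unfinished(redis, "queue:email", "worker-a")
print(job, recovered, redis.llen("queue:email"))
```

Because a reserved job always sits in exactly one list, the per-worker lists also show which worker took a job that later went missing.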

roynasser commented 12 years ago

I made some preliminary tests... enabling all of the hooks, it looks like it never gets to even the "before fork" event... so it is probably the dispatcher agent that is popping something off the queue but not assigning a worker to it...

I'm going to risk saying it has something to do with the reserve job function (where there was some modification about is_array, is_object, etc).... I'm guessing that it is popping it out over there but not reading it properly and therefore not assigning it to anyone... I'll try and debug that a bit more once I have some time.

In time, let me pick your brain (if you don't mind): what were you using in order to keep resque active all the time, etc... any kind of watcher process? I'm using another project also called PHP-Daemon that is quite useful... the way it was implemented is also quite interesting. They have a file or memcache based process lock system. That way you can call the "runner" every X minutes from a cron job from many servers, and it won't start double-daemons for the same job as long as there is a valid (timestamped) lock still running...

Is there any similar solution or idea for resque? Just trying to build a system which is as "compact" and "failsafe" as possible...

thanks!

andrewjshults commented 12 years ago

We were using supervisord (http://supervisord.org/) to manage the resque processes. It handles starting/restarting the processes as well as log rotation, remote control (we could bulk restart the workers). It's written in python, has been around for a while and actually runs everything as a sub-process that allows it to know instantly if something dies (without having to use PID/lock files). This is pretty failsafe in keeping the resque process running (we were on AWS so instances randomly dying was pretty par for the course), but it doesn't address the time in between the job being popped off the queue and being fully processed. If you're alright with losing a job if the main resque process dies (e.g. server crash), then this was fairly bulletproof (in my experience). If you can't have any lost messages, we'll need to look into using RPOPLPUSH instead (I'm currently working on an implementation where losing messages isn't the end of the world, but it's a bit more critical than it was for a social network). I've got to get a few other base things up and running first, but am going to take a look into making a safer version of the worker runner (I may try to tweak the Sinatra frontend to handle the re-adding part rather than write a new frontend in PHP).

roynasser commented 12 years ago

I'm going to read up a bit more on supervisord... do you happen to have any config for what you used? I realized from the little I've investigated it is quite flexible...

thanks!

andrewjshults commented 12 years ago

Google technically owns all the frid.ge code, so I'm trying to "clean room" as much as possible. I'm trying to figure out how to get Redisent to work with pubsub right now, but the supervisord config stuff is a pretty straight forward ini file so this should get you started:

[group:workers]
programs=emailWorker,slowWorker

[program:emailWorker]
directory=/home/andrew/code
command=php vendor/php-resque/resque.php
numprocs=1
stdout_logfile=/home/andrew/logs/messagingWorker.log
stderr_logfile_maxbytes=10MB
redirect_stderr=true
autostart=true
autorestart=true
environment=ENVIRONMENT='prod',QUEUE='email',APP_INCLUDE='bootstrap_resque.php',REDIS_BACKEND='localhost:6379'

[program:slowWorker]
directory=/home/andrew/code
command=php vendor/php-resque/resque.php
numprocs=1
stdout_logfile=/home/andrew/logs/messagingWorker.log
stderr_logfile_maxbytes=10MB
redirect_stderr=true
autostart=true
autorestart=true
environment=ENVIRONMENT='prod',QUEUE='slow',APP_INCLUDE='bootstrap_resque.php',REDIS_BACKEND='localhost:6379'

Basically each worker is represented by its own entry. You can group them to make restarting easier (via supervisorctl) and pass in any environment variables as needed.

roynasser commented 12 years ago

I see! Great, that clears it up...

In your example, however, I assume the bootstrap_resque.php is basically loading the worker classes + resque library and launching worker instances, nonetheless you have defined the same QUEUE for both? is this correct?

I assume you should either have a different app_include for slow worker and emailworker, or have a different queue?

thks!

andrewjshults commented 12 years ago

Oops, updated my previous comment to reflect that those should be reading from separate queues.

The bootstrap_resque.php file is where your class autoloader should go (the resque.php is the one that's included from the repo and it handled loading up/starting the workers).

roynasser commented 12 years ago

Actually just going over again, resque.php is already called as the main program, and APP_INCLUDE would be the file with the worker classes, and their execution functions, correct?

So in your example I see two of the same? Should it not be either a different APP_INCLUDE (to process different functions), or a different QUEUE?

Also, when specifying QUEUE=* am I correct in thinking that it will automatically prioritize queues in alphabetical order? (i.e. process all items from queue A before any items from queue B?)

andrewjshults commented 12 years ago

I've never actually used the QUEUE=* functionality before; we had different machines run different queues (some machines had external email access and others had access to the image stores) so we had to specify them manually.

roynasser commented 12 years ago

Andrew, I have a question, I understand from your example that you used both the queue and the "class"/function mechanism to define the job function, correct?

i.e. you have an email queue which (i assume) would end up only getting job type e-mail, or something of the sort...

Why is it that you chose that?

What kind (if you are at liberty to discuss) of job went to the slow queue? Was it just one type of job? My idea, at least initially (haven't gotten down to many jobs yet, just 1 actually), would be to separate both jobs and priorities... So I could have a notification e-mail which is priority B, and a "reply" e-mail which is priority A (hope this makes some sense)...

The same for slow tasks, which may be priority/queue C, I could have "image convert", "log rotate" and a couple more which are "less important" (i.e. to be done after the above, or perhaps on different machines)...

I'd like to understand further your use so I can perhaps learn of caveats and such before we put our system in production.

Anyways, we have 1 job working now... next week I will try and debug the "lost jobs" as that really isn't something I want to have to deal with (if a job doesn't go through I need to at least know, and right now it seems it is being "lost in translation" :p)

best rgds and have a great weekend

andrewjshults commented 12 years ago

The supervisord config file is not from an actual production environment, so those aren't representative of the actual queues that we were using (we did have an email queue but didn't have a slow queue).

We did use the queue + job (class) to distribute the load. We used queues so that we could distribute the work to specific servers (if you haven't read through Github's Introducing Resque, it provides a good overview of what their considerations were while building it). If you don't need to run specific jobs on specific servers, then queues can definitely be used for priorities.

The way we handled making sure that high priority events were taken care of first was to put those types in their own queue and assign workers just to that queue. For example, if you had both transactional emails (welcome, forgot password, etc.) and notification emails (new activity, replies, etc.) you could put all the transactional email jobs in their own queue, all the notifications in another and when you were starting a worker either assign one specifically to the transactional emails or put it first in the list of queues the worker is working on (the worker looks for items from the queues in the order that they are entered).
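
The "first in the list of queues" behaviour described above can be sketched like this. It is a simplified illustration in Python using plain in-memory lists, not php-resque code; the function name reserve_next and the queue names are assumptions made up for the example.

```python
def reserve_next(queues, order):
    """Return (queue_name, job) from the first non-empty queue in `order`.

    `queues` maps a queue name to a list of pending jobs (oldest first).
    `order` is the list of queue names the worker was started with.
    """
    for name in order:
        pending = queues.get(name)
        if pending:
            return name, pending.pop(0)
    return None  # nothing to do on any queue


queues = {
    "notifications": ["new-reply-1", "new-reply-2"],
    "transactional": ["forgot-password-1"],
}

# A worker started with the transactional queue listed first always
# drains it before it touches the notifications queue.
order = ["transactional", "notifications"]
first = reserve_next(queues, order)
second = reserve_next(queues, order)
print(first, second)
```

A worker assigned only to the transactional queue would simply be given an order list with that single name.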

Let me know if that makes sense/if there is anything else I can help with.

roynasser commented 12 years ago

[EDIT: just deleted the posts above which are now obsolete as the root cause seems to be what is described below...]

OK, bear with me guys (if indeed anyone is reading this...)

It seems that Redisent is at fault after all...

Look at what the Resque::pop function is outputting with this minor modification (to output a debug string with whatever it will json_decode)...

(What I'm doing is just an echo $item . " - will be decoded\n"; inside the pop function in lib/Resque.php)

1 - will be decoded
OK - will be decoded
OK - will be decoded
{"class":"JB","args":[{"time":1325031634,"array":{"test":"test1"}}],"id":"7a66a5a0088ee05b261328ae5bbab292"} - will be decoded
OK - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031635,"array":{"test":"test1"}}],"id":"8c77f2fc6acb3e948e918c4467815fc4"} - will be decoded
1 - will be decoded
1 - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031636,"array":{"test":"test1"}}],"id":"aeecf21f6822a47ec54b5232bfdc7b25"} - will be decoded
OK - will be decoded
1 - will be decoded
1 - will be decoded
1 - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031639,"array":{"test":"test1"}}],"id":"4293a76f5f668b6e47c23e16c2eefac7"} - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031640,"array":{"test":"test1"}}],"id":"b49c8689cf0f5fc3c9592d37bada0157"} - will be decoded
{"class":"JB","args":[{"time":1325031639,"array":{"test":"test1"}}],"id":"e4a180be49b0a27b635a77c8ef441e35"} - will be decoded
OK - will be decoded
1 - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031641,"array":{"test":"test1"}}],"id":"cac4e3b708fbf0d9c84fcf5c47bf1f85"} - will be decoded
131 - will be decoded
130 - will be decoded
2 - will be decoded
2 - will be decoded
1 - will be decoded
{"class":"JB","args":[{"time":1325031643,"array":{"test":"test1"}}],"id":"95f811598537844306e4bc4bad5897f5"} - will be decoded
3 - will be decoded
1 - will be decoded

The above was supposed to be just a load of different JSON strings with - will be decoded appended... Nonetheless, whenever you have more than one command happening at the same time, or close together (this is the only way I got to reproduce it), Redisent outputs these weird numbers, or an OK, instead of the actual response from Redis...

At this point I may have a slight look but in all honesty I don't see the benefit... for anything worth having a message queue in place I don't see why not install phpredis, so I'll try and go that route.

I hope someone does find out what is wrong, but anyways, it is really annoying and weird, but I guess it is what you get with layer upon layer of code abstracting code... phew... at least I got down to it... admittedly I could have started at the pop function....oh well...

PS: I'm guessing a bunch of the json_decode errors come from feeding json_decode with, err, not json as per the above....
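
A guard in front of the decode step would at least turn these silent losses into visible errors. The following is a sketch in Python rather than a patch for lib/Resque.php; the function name decode_job is hypothetical, and the only thing it assumes about a job is what the payloads above show (a JSON object with a "class" key). Note that a bare 1 is itself valid JSON, so checking that decoding succeeds is not enough.

```python
import json


def decode_job(item):
    """Return the decoded job dict, or raise ValueError for anything else."""
    if not isinstance(item, str):
        # e.g. a list of worker names instead of a single payload
        raise ValueError(f"expected a JSON string, got {type(item).__name__}")
    try:
        payload = json.loads(item)
    except json.JSONDecodeError as exc:
        # e.g. a stray "OK" status reply
        raise ValueError(f"not JSON: {item!r}") from exc
    if not isinstance(payload, dict) or "class" not in payload:
        # e.g. "1" decodes fine, but it is an integer, not a job
        raise ValueError(f"JSON but not a job payload: {item!r}")
    return payload


good = '{"class":"JB","args":[{"time":1325031634}],"id":"7a66a5a0088ee05b261328ae5bbab292"}'
print(decode_job(good)["class"])

for bad in ["1", "OK", ["NaWebTeste:32608:default"]]:
    try:
        decode_job(bad)
    except ValueError as err:
        print("rejected:", err)
```

A worker that hits the error path can then log or re-queue instead of treating the reply as "no job".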

roynasser commented 12 years ago

Update: After partially switching to phpredis (I didn't re-do all of the parts, just the central bits as I wanted to chase down this issue), I've come to the conclusion that unfortunately the issue seems to not be in the phpredis or redisent aspect... Something somewhere in the multilayered way php-resque is built is breaking things, perhaps an issue with the forked redis connections? (although weird as it isn't always the same forked PID failing to obtain data)...

Apparently the "lost jobs bug" only shows itself when multiple workers are used (although this isn't known for sure... another user spotted this with multi workers, but I don't see any mentions of extensive testing with single workers to rule the bug out...) Anyways, I need something relatively dependable and therefore won't be able to use php-resque for now. If I do have time in future to look into the issue I may, but for now I'll need to look into a solution that works... It's a pity as I wanted something with the simplicity of php-resque, using a redis backend for consistency which is already available in our stack, etc...

Oh well... if anyone has any suggestions they are appreciated. Otherwise good luck to those who continue on ;) And do beware as the behaviour for "losing" jobs is quite erratic... a simple "echo" on the pop function will show you that despite being popped off the Redis queue list (I was following the server with monitor), phpredis or redisent are not receiving the json payload for the job correctly... At some points the payload comes in as an "Object" (php object), at others it comes in as a "+OK" (or :OK in Redisent) (which would indicate that the PHP<>Redis wrapper is crapped out somewhere), at times we get just a number like 1 (which I have no idea where it may come from as it wasn't in the queue as I was monitoring it), and at other times (perhaps 70 - 75% of the time), we would get the correct json payload... Honestly, unless it is just "signalling" most applications will probably not be happy with this sort of message loss.

If anyone can recommend some other project I'd be greatly appreciative... I'd prefer to steer clear of multiple daemons, etc, but would ultimately need something that can run PHP worker classes (I already have a few from php-resque so porting ease is a plus), multiple parallel workers per job and per priority, some form of queue security - to know which jobs failed, why, and perhaps restart them if the failure was momentary? etc... Thanks for all your patience throughout this saga, unfortunately it beat me.... time constraints for the current project as well as my upcoming vacation (which will be a bother in terms of coding), have conspired against php-resque.

salimane commented 12 years ago

I'm a bit interested in this because I'm also using Redisent in some apps. I haven't come across this (maybe not yet). So I want to ask you a few things about this. First, are you sure Redisent is really the culprit? Can you check your Redis server version? Can you check if your server is properly configured and is behaving correctly? Which Redisent version are you using? You can have a look here at the version I'm using https://github.com/salimane/php-resque/tree/master/lib/Redisent and try and check if you can still reproduce the same error...

roynasser commented 12 years ago

Salimane, I copied your redisent.php and your Resque.php just to make sure... Same thing over here... I'd love for you to prove me wrong and it be something with my system... (who knows??)

Anyways, if you're willing to try it,

edit Resque.php around line 85, just before return json_decode($item, true);

add:

if ($item[0] == "{"){
    echo "obtained json!\n";
}else{
    echo "\nExpected JSON, instead obtained $item\n";
}

Now run with multiple workers and enqueue some stuff... If you add things to the queue and then start the workers it becomes even easier to see, but if you add stuff to the queue fairly quickly (like 2 per second, or 3 every 2 seconds) you should be able to make it cough up pretty quickly... I was able to get the results below with only 3 workers. More workers = more messages lost.

Sample output:

(obtained json means it got the correct job payload, and Expected JSON instead XXX means it didn't get the job, but the job was removed from the queue and lost forever... it isn't marked as failed or anything either)

[root@NaWebTeste Jobs]# php startjob.php 
*** Starting worker NaWebTeste:32605:default
*** Starting worker NaWebTeste:32606:default
*** Starting worker NaWebTeste:32608:default
[root@NaWebTeste Jobs]# 
Expected JSON, instead obtained 1

Expected JSON, instead obtained OK
obtained json!

Expected JSON, instead obtained 1
obtained json!

Expected JSON, instead obtained 1

Expected JSON, instead obtained 1
obtained json!

Expected JSON, instead obtained 1

Expected JSON, instead obtained 1
obtained json!

Expected JSON, instead obtained 1

If you go up to about 20 workers you will see the whole thing go crazy... for some reason it tries to pass the worker names as queue names and then the entire thing just self-destructs... (That's when I was able to reproduce a whole bunch of the json_decode fatal errors)...

Just an idea of what comes out - in this case, I didn't even get to add jobs, php-resque has got weird stuff going on even before it gets to the queue... it is apparently calling the pop function in weird places... It might well be what is happening at a smaller scale when we lose jobs: (I added a print_r($item) just so we could understand why the json_decode was complaining, makes absolutely no sense to me why it is being sent that array...)

Warning: json_decode() expects parameter 1 to be string, array given in /usr/www/GAQhomologa/API/v1.1/NewLibs/Jobs/lib/Resque.php on line 91

Expected JSON, instead obtained 1
1
Expected JSON, instead obtained Array
Array
(
    [0] => NaWebTeste:32608:default
    [1] => NaWebTeste:32752:default
    [2] => NaWebTeste:302:default
    [3] => NaWebTeste:321:default
    [4] => NaWebTeste:312:default
    [5] => NaWebTeste:32747:default
    [6] => NaWebTeste:314:default
    [7] => NaWebTeste:32767:default
    [8] => NaWebTeste:344:default
    [9] => NaWebTeste:309:default
    [10] => NaWebTeste:345:default
    [11] => NaWebTeste:328:default
    [12] => NaWebTeste:338:default
    [13] => NaWebTeste:339:default
    [14] => NaWebTeste:313:default
    [15] => NaWebTeste:349:default
    [16] => NaWebTeste:32606:default
    [17] => NaWebTeste:32754:default
)

Warning: json_decode() expects parameter 1 to be string, array given in /usr/www/GAQhomologa/API/v1.1/NewLibs/Jobs/lib/Resque.php on line 91

Expected JSON, instead obtained Array
Array
(
    [0] => NaWebTeste:32747:default
    [1] => NaWebTeste:314:default
    [2] => NaWebTeste:32767:default
    [3] => NaWebTeste:344:default
    [4] => NaWebTeste:309:default
    [5] => NaWebTeste:345:default
    [6] => NaWebTeste:328:default
    [7] => NaWebTeste:338:default
    [8] => NaWebTeste:339:default
    [9] => NaWebTeste:302:default
    [10] => NaWebTeste:312:default
    [11] => NaWebTeste:321:default
    [12] => NaWebTeste:313:default
    [13] => NaWebTeste:349:default
    [14] => NaWebTeste:32750:default
    [15] => NaWebTeste:32752:default
    [16] => NaWebTeste:32608:default
    [17] => NaWebTeste:32754:default
)

Warning: json_decode() expects parameter 1 to be string, array given in /usr/www/GAQhomologa/API/v1.1/NewLibs/Jobs/lib/Resque.php on line 91

Expected JSON, instead obtained 1
1
Expected JSON, instead obtained 1
1killall -9 php
Expected JSON, instead obtained 1
1
Expected JSON, instead obtained Array
Array
(
    [0] => NaWebTeste:32748:default
    [1] => NaWebTeste:32605:default
    [2] => NaWebTeste:32606:default
    [3] => NaWebTeste:32752:default
    [4] => NaWebTeste:32608:default
)

Warning: json_decode() expects parameter 1 to be string, array given in /usr/www/GAQhomologa/API/v1.1/NewLibs/Jobs/lib/Resque.php on line 91

Expected JSON, instead obtained 1
1
Expected JSON, instead obtained OK
OK
Expected JSON, instead obtained OK
OK
Expected JSON, instead obtained OK
OK
Expected JSON, instead obtained 1
1
Expected JSON, instead obtained OK
OK

salimane commented 12 years ago

Could you try 1 worker per process?

roynasser commented 12 years ago

Hi Salimane,

1 worker per queue seems to work fine... I haven't been able to reproduce any missed jobs with 1 worker only... Nonetheless, I'm a bit skeptical about reliability on 1 worker per queue, as well as the amount of work that can be handled... Also, if we use 1 worker on several servers, maybe the problem will occur again (although I don't think so)...

Is your plan to utilize only 1 server, 1 worker or X servers and X workers?

Thanks

salimane commented 12 years ago

No, I mean 1 worker per process but many processes. Let's say you run your worker with the command "php resque.php > worker_a.log & ". Make sure the count of workers in resque.php is 1, but run that command many times, like 10. Then you will have ten workers but 1 worker per process.

roynasser commented 12 years ago

Oh I see... makes sense... Can't get it to work using your example (although it should make sense?)

Each job stops the other.... :/ and if I only do one, when I try to tail the log to check it, it just stops the process and keeps the log empty.... never processes anything... the first time I call startjob from the command line without any & or > log, it just processes everything....

[root@NaWebTeste Jobs]# php startjob.php > /root/1.log &
[1] 1111
[root@NaWebTeste Jobs]# php startjob.php > /root/2.log &
[2] 1112

[1]+ Stopped php startjob.php > /root/1.log
[root@NaWebTeste Jobs]# php startjob.php > /root/3.log &
[3] 1113

[2]+ Stopped php startjob.php > /root/2.log
[root@NaWebTeste Jobs]# php startjob.php > /root/4.log &
[4] 1114

[3]+ Stopped php startjob.php > /root/3.log
[root@NaWebTeste Jobs]#

It's quite late so I'm about to go to bed, tomorrow I can check a bit more on this... multi worker this way may be the last way out...

roynasser commented 12 years ago

Salimane, starting several workers with count=1 (i.e. several php processes each with one worker), works as expected and does not lose jobs...

I may take this approach for the time being...

Thanks for the suggestion! :)

chrisboulton commented 12 years ago

Sorry all, having been traveling for the past month I haven't been able to give any attention to this.

RVN-BR - I still want to investigate this. Do you happen to have a copy of all of your test scripts/jobs that you can send me? chris@bigcommerce.com would be the best address.

chrisboulton commented 12 years ago

I'm not sure how far you got, but I think I'm now seeing this same issue when I'm using more than one worker with the COUNT environment variable on the built-in resque.php, and QUEUES=*. Explains why I don't see this in production, as we don't use resque.php, have a script that forks for workers, or check all queues at once.

It looks like Redis is returning a multi-bulk reply on one or more of the workers. This is what's causing the array to be returned and everything else to fall apart. It's even doing this WITHOUT any jobs on the queue.

Digging further, here's what's happening with Redisent/php-resque

*** Sleeping for 5 seconds

Sent:

*2
$8
SMEMBERS
$13
resque:queues

Received:

*1
$7
default

*** Checking default

Sent:

*2
$4
LPOP
$20
resque:queue:default

Received:

*1
$7
default

So, what's happening is for some reason, sometimes we're getting back the multi-bulk response for the command to fetch the list of queues when we're trying to do an LPOP.

I'm still unable to establish WHY this is happening, but all I can attribute it to is a PHP bug with forking and sockets.
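
The mismatch in the trace above is easy to see once the reply is parsed by type. Below is a minimal reply parser for the Redis wire format, written in Python as an illustration (it is not Redisent's parser, and the names parse_reply, check_reply and EXPECTED_KIND are made up for the example). The first byte of a reply gives its type: + status, - error, : integer, $ bulk, * multi-bulk.

```python
def parse_reply(data: bytes):
    """Parse one Redis reply; return (kind, value, remaining_bytes)."""
    line, _, rest = data.partition(b"\r\n")
    prefix, body = line[:1], line[1:]
    if prefix == b"+":
        return "status", body.decode(), rest
    if prefix == b"-":
        return "error", body.decode(), rest
    if prefix == b":":
        return "integer", int(body), rest
    if prefix == b"$":
        length = int(body)
        if length == -1:                      # nil bulk reply (empty queue)
            return "bulk", None, rest
        return "bulk", rest[:length].decode(), rest[length + 2:]
    if prefix == b"*":
        count = int(body)
        if count == -1:
            return "multi-bulk", None, rest
        items = []
        for _ in range(count):
            _, value, rest = parse_reply(rest)
            items.append(value)
        return "multi-bulk", items, rest
    raise ValueError(f"unknown reply prefix: {prefix!r}")


# Reply type each command in the trace should produce.
EXPECTED_KIND = {"LPOP": "bulk", "SMEMBERS": "multi-bulk"}


def check_reply(command, data):
    """Parse a reply and refuse it if its type does not fit the command."""
    kind, value, _ = parse_reply(data)
    if kind != EXPECTED_KIND[command]:
        raise RuntimeError(f"{command} expected {EXPECTED_KIND[command]}, got {kind}: {value!r}")
    return value


# The bytes received after the LPOP in the trace above.
received = b"*1\r\n$7\r\ndefault\r\n"
print(parse_reply(received)[:2])
try:
    check_reply("LPOP", received)
except RuntimeError as err:
    print("out of sync:", err)
```

By the same reading, the stray OK and 1 values reported earlier in the thread have the shape of status (+OK) and integer (:1) replies, which would be consistent with a worker reading the answer to a command that another process sent on the same connection. That interpretation is an inference from the reply types, not something the trace proves.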

roynasser commented 12 years ago

Hi Chris, I don't have many files that would be of use to you, unfortunately... I mostly added ECHOs in different places, as well as ran a screen with a redis prompt running the monitor command so I could see what was going into and back from Redis... my conclusions are pretty similar to yours...

I'm unsure if it is a problem with sockets per se when forking, but it may be so... I tried also using phpredis instead of redisent, I assume it is also using sockets... I wonder how else one could connect to redis to debug/clear this error...

It also seems to be quite random in happening, at least I was unable to establish a pattern...

chrisboulton commented 12 years ago

The one thing I'm yet to do is look at the responses back from redis itself, as I ran out of time last night. Tonight I'll hopefully have the opportunity to sit on the end of a tcpdump, and hopefully at least rule redis out once and for all.

hSATAC commented 12 years ago

Exactly. I tried COUNT=1 with multiple processes and they worked fine. But when COUNT>1 this situation randomly appears. I'll close my issue and follow this thread.

roynasser commented 12 years ago

Hi Chris,

I have, unfortunately, been unable to progress on this issue... I'll try some more a bit later, but for now I need to continue on work projects... Do let me know about your findings. Were you able to rule out redis?

salimane commented 12 years ago

Please check this pull request. It seems to have solved the problem stated here. Thanks

roynasser commented 12 years ago

As commented on the pull request, I don't think this fixes this particular issue. More comments, including the tests that were run (and failed), are in the pull request. Just posting here so people don't get confused.

salimane commented 12 years ago

I've added another commit to pull 43, please check if it solves the issue. Thanks

chrisboulton commented 12 years ago

All,

Thanks for the eager work on this issue and I'm sorry for not being able to respond sooner.

I've just gone ahead and committed a bit of a different fix for this in ebe76658175a7a8c9f190db6747a91f0c91eb988. Same concept as @salimane's fix, just implemented slightly differently.

The testing I've completed shows the issue is resolved using my updated code, but if it doesn't work for you then let me know.
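
The thread does not show what the commit changed, so the following is not a description of it. It is only a sketch of one common way to keep a forked child from reusing its parent's socket, given the forking-and-sockets suspicion raised earlier: remember which process opened the connection and reconnect when the PID differs. It is written in Python, and the class name ConnectionPerProcess and its parameters are hypothetical.

```python
import os


class ConnectionPerProcess:
    """Hand out a connection that is never shared across a fork."""

    def __init__(self, connect, getpid=os.getpid):
        self._connect = connect      # callable that opens a new connection
        self._getpid = getpid        # injectable so the demo needs no fork
        self._pid = None
        self._conn = None

    def get(self):
        pid = self._getpid()
        if self._conn is None or pid != self._pid:
            # First use, or we are in a different process than the one
            # that opened the connection: open a fresh one.
            self._conn = self._connect()
            self._pid = pid
        return self._conn


# Demo with a fake PID and a dummy "connection" object.
fake_pid = [100]
pool = ConnectionPerProcess(connect=object, getpid=lambda: fake_pid[0])

parent_conn = pool.get()
same_conn = pool.get()
fake_pid[0] = 101                    # pretend we are now in a forked child
child_conn = pool.get()

print(parent_conn is same_conn, child_conn is parent_conn)
```

With a real client, connect would open a new socket to Redis, so each worker process reads only the replies to its own commands.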

hSATAC commented 12 years ago

I tried this patch and it seems OK. Thx for your update! I'll report if there's any new problem.

w3z315 commented 7 years ago

@chrisboulton This seems to be still an issue, we're getting this behavior too.

w3z315 commented 7 years ago

So it seems like there must be a bug within Redisent, as soon as we replace that with Predis everything works totally fine.

danhunsaker commented 7 years ago

Given we haven't actually used Redisent for quite some time, that's hardly surprising. Credis does seem to still have some issues inherited from Redisent, though, especially at the version we've been locked to of late. Predis seems likely to be even more stable than that. Which is why switching to Predis (or, better, to be Redis-library-agnostic) is such a high priority.

w3z315 commented 7 years ago

@danhunsaker I'm actually right now migrating everything to redis and PSR-4

danhunsaker commented 7 years ago

I take it by "everything" you mean this entire library. Which ... seems like a decent amount of work. Starting from master, or the latest tag?

w3z315 commented 7 years ago

The latest tag actually, and I'm making progress see my fork :-)

danhunsaker commented 7 years ago

Oh. Uh, don't do that. That's a terrible idea. The latest tag is several years - and several bugfixes, many of them major ones - old. You definitely wanna start from master.

w3z315 commented 7 years ago

@danhunsaker So basically everyone that checks out the current project via composer is using a very old version?

danhunsaker commented 7 years ago

Yes. As has been discussed in countless other issues opened in the time since then, the latest tag is quite old, and quite a lot of commits have been made since, but despite fixing a great number of bugs and adding or enhancing a few features, no new tags have been added. Momentum has picked up again of late, but the library's not quite to a new tagging point yet. It will be, soon. In the meantime, users are, unfortunately, forced to require the dev-master version, or the ID of the specific git commit they're working against. It's not ideal, but it's the situation we're in at the moment.

yccheok commented 7 years ago

@amahrt We are currently stuck on the old 1.2 tag and the old PHP 5.3.3, and we were recently hit by this bug on a high-traffic server.

We are looking for a possible quick hotfix.

May I ask whether you currently use the following code in production: https://github.com/amahrt/php-resque/commits/bugfix/predis ?

mbdwey commented 7 years ago

+1. Environment: Docker [Redis server, 2 workers]. Each worker has 50 forks, and after testing with 1000 tasks we lose between 200 and 400 tasks on each run.