agoragames / leaderboard

Leaderboards backed by Redis in Ruby
https://rubygems.org/gems/leaderboard
MIT License
480 stars 65 forks source link

Use multi for members_from_score_range_in #18

Closed jgadbois closed 12 years ago

jgadbois commented 12 years ago

I'm getting an error on Heroku when using the members from score_range in method in a loop. Here's what RedisToGo support said:

-ERR DISCARD without MULTI" error occurs when Redis is expecting a multi command that sets up a transaction block as you have implemented in other methods in the leaderboard.rb file. You may need to rewrite the members_from_score_range_in function to use the multi block that redis is expecting.

jgadbois commented 12 years ago

I implemented here, but it is not solving the problem:

https://github.com/jgadbois/leaderboard/commit/59b65ba37f39babbcc4b71b3c856da345478c7d9

I still get the -ERR DISCARD without MULTI error

jgadbois commented 12 years ago

Could calling the method in a tight loop be causing this problem? Maybe I need to wrap my whole method call in a transaction? I'm really confused as to why this is happening (doesn't happen locally, only on RedisToGo through Heroku)

jgadbois commented 12 years ago

This didn't seem to fix it - RedisToGo upgraded my Redis server to 2.4.x and that fixed the problem it seems

czarneckid commented 12 years ago

Glad that it was fixed with upgrading Redis. What was the version of Redis you were using where you could cause the issue? Was it pre-Redis 2.4.4?

jgadbois commented 12 years ago

Yes it was 2.2 I believe. I actually am still intermittently getting the error. I switched back to my branch and it seemed to start working again, but I don't know if adding the multi really helped or is just coincidental to my app restarting or something like that.

czarneckid commented 12 years ago

Is the error happening IN the members_from_score_range_in method or is it happening down in the ranked_in_list_in method? I can't see how it would be happening in the former until you drop down into the ranked_in_list_in method.

jgadbois commented 12 years ago

My call is to members_from_score_range, my stack trace doesn't give me the line number in the leaderboard gem

czarneckid commented 12 years ago

Right, members_from_score_range will drop into members_from_score_range_in. Shame that you don't have more of a stacktrace.

jgadbois commented 12 years ago

Yeah, it's frustrating.

czarneckid commented 12 years ago

You don't see a more full stacktrace when running heroku logs -t. If you get a chance and you can isolate a small test program, might be useful for all parties, Redis, Redis ruby client and us.

jgadbois commented 12 years ago

Yeah, I got my trace from tailing the heroku logs: Here's my method causing the issue that could be run standalone (although it doesn't break on my machine)

  def get_rank_for leaderboard, metric, record
    range = 50
    first = true

    if metric
      if(record.nil? || metric >= record)
        return 1
      else
        min = metric - range
        max = metric + range

        while (min > 0 || max < record || first)
          if(leaderboard.total_members_in_score_range(min, max) > 0)
            members = leaderboard.members_from_score_range(min, max) 
            members.each do |m|
              if m[:score] <= metric
                return m[:rank]
              end
            end

            return leaderboard.total_members + 1
          end

          min = min - range
          max = max + range
          first = false
        end
      end
    end
  end

Could something go wrong if there is a huge number of members returned in a given range?

czarneckid commented 12 years ago

I guess something could go wrong in terms of network connectivity breaking between Heroku and RedisToGo if there are a large number of results being returned, but that'd fall in the realm of the Ruby client. Are you able to run some range tests to see over a given range how that might be causing the issue? Does the error even happen at all if you run the test over and over for a small min and max?

jgadbois commented 12 years ago

I just tried running my method in a tight loop 1000 times with 10k members in redis and didn't have any problems. Not sure what I did would have tripped anything, but I loaded them with score = i/5.0 so a range of 100 would return 500 members.

czarneckid commented 12 years ago

Thanks for digging in further. Let me know if you see any odd behavior with a larger range. I've been able to submit 1,000,000 scores to the rank_members method, which does a transaction, but that was local testing.

jgadbois commented 12 years ago

I got some full stack traces in airbrake (unfortunately that means it's blowing up on production)

vendor/bundle/ruby/1.9.1/gems/redis-2.1.1/lib/redis.rb:565:in `multi'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:673:in `ranked_in_list_in'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:531:in `leaders_in'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:598:in `member_at_in'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:580:in `member_at'
lib/records/leaderboard_manager.rb:46:in `record'
app/models/workout_exercise.rb:84:in `sitewide_record?'
app/models/workout_exercise_set.rb:14:in `sitewide_record?'
vendor/bundle/ruby/1.9.1/gems/redis-2.1.1/lib/redis.rb:568:in `rescue in multi'
vendor/bundle/ruby/1.9.1/gems/redis-2.1.1/lib/redis.rb:565:in `multi'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:673:in `ranked_in_list_in'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:567:in `members_from_score_range_in'
vendor/bundle/ruby/1.9.1/gems/leaderboard-2.2.1/lib/leaderboard.rb:547:in `members_from_score_range'
lib/records/leaderboard_manager.rb:106:in `get_rank_for'
lib/records/leaderboard_manager.rb:85:in `workout_rank'
app/models/workout.rb:83:in `sitewide_rank'

Any ideas? Happy to do anything to help. I have pages blowing up all over now (my own fault for not testing more under load)

czarneckid commented 12 years ago

Any chance you could try using version 2.2.2 of the redis gem? It looks like there are some changes with multi/exec that may help here with this issue in that it should not raise any exceptions. I think the error here though is within Redis or the connection to Redis.

jgadbois commented 12 years ago

will do - actually just working on that as we speak, thanks

jgadbois commented 12 years ago

My app only has one REDIS connection - could that be causing a problem if there are a lot of users on?

czarneckid commented 12 years ago

We run with single connections to Redis in our production environment. But here it looks like here an error is being thrown by Redis in retrieving the data. I could be wrong.

Did updating to version 2.2.2 of the redis gem help any?

jgadbois commented 12 years ago

I deployed to production about 10 minutes ago and so far yes that seems to be helping. Not enough data yet though. Another potential cause is maybe if Rack::Timeout kicks in and kills the request while Redis is processing data? With the old client the page did seem to be loading abnormally slow - seems to be faster with the newer client version

czarneckid commented 12 years ago

What is Rack::Timeout set to? In our production environment, we use a timeout of 30 seconds for Unicorn. This could certainly be causing an issue if the Rack::Timeout value is set too small. Obviously if were pulling a large amount of data from Redis, there could be a timeout, but I'm not sure how much you're pulling via the members_from_score_range_in method.

Do you have a connection timeout set in redis.conf? By default, I believe it is set to 0 to be disabled.

jgadbois commented 12 years ago

Looks like redistogo sets timeout to 150

Our Rack::Timeout is also set to 30 seconds but it was going super slow even when I removed the members_from_score_range_in loop to get rank.

I'll keep digging, sorry this is getting somewhat off topic...I appreciate the help.

czarneckid commented 12 years ago

Sure, it's not a problem. Happy to help try and diagnose the issue. Sorry that it has to be in your production environment, but it'd be tough to replicate the environment to run any tests.

czarneckid commented 12 years ago

No news is good news? Any update?

jgadbois commented 12 years ago

I think upgrading the client fixed it. Not seeing the error anymore so far. It was still kind of slow (my algorithm?) so I added some fragment_caching to minimize hits to redis on this particular page. Overall I think it is going to work nicely. I have a leaderboard per exercise in our database (~1000) so should get some good use out of the gem :)

Thanks again and thanks for releasing this gem.

czarneckid commented 12 years ago

Good to hear. The differences between versions here would indicate that's the issue:

2.1.1 - https://github.com/ezmobius/redis-rb/blob/c60d5ab5aa2257e890cb047c69f0171db983a93b/lib/redis.rb#L565-570 2.2.2 - https://github.com/redis/redis-rb/blob/be3d7e0b4194fcb7e91bee9153daf6073dbc1268/lib/redis.rb#L1013-1017

That's my guess.

czarneckid commented 12 years ago

You might also get some performance gains here by utilizing the hiredis driver.

jgadbois commented 12 years ago

That definitely looks like the culprit although not the root issue I guess. I'll check out the hiredis driver - thanks for pointing that out.