stefanwille / crystal-redis

Full featured Redis client for Crystal
MIT License
381 stars 61 forks source link

Clear command injections (failed cast) #122

Closed mixflame closed 2 years ago

mixflame commented 2 years ago

The failed cast command injection caused Amber Crystal server fracture.

Repair is rescue with default nil type.

kostya commented 2 years ago

All this direct casts expected to be as it is. It should not raise, if it raises, so there some bugs in usage, may be in the shard code. Adding rescue here quite dirty style. Can you provide examples of failing?

mixflame commented 2 years ago

It was caused by remote execution (RCE) in a customized redis adapter for PUBUSB. I don't have an example of how this is possible to exploit but I'm very sure that the exploit exists and this is the fix.

I think the idea is passing strings and getting Redis to mix data and cast improperly, like Nil to Int32. Executing command without raise is very dangerous if the returned data is casted incorrectly.

This is a known DoS on my Crystal server. Unhandled exception is fatal for Amber framework servers, it causes the server to exit the handling loop for HTTP connections.

I would suggest looking for an exploit in Crystal/Redis stacks related to what I'm talking about. I don't think it's related to structural issues because it's not something that fails at random.

Crystal/Amber still doesn't log enough that it can easily capture these type of RCE attempts, but it became obvious. "rescue" is "dirty" indeed but it requires getting dirty weapons out of the toolbox to defend against a dirty attack.

On Sat, Aug 28, 2021 at 1:55 AM Kostya M @.***> wrote:

All this direct casts expected to be as it is. It should not raise, if it raises, so there some bugs in usage, may be in the shard code. Adding rescue here quite dirty style. Can you provide examples of failing?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-907589881, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND325W626NVXLIISKA3NLT7CI75ANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 771 0547

kostya commented 2 years ago

To find it, you can add logging with rescue this exceptions, to see what method, and what result produce exception.

mixflame commented 2 years ago

To do that I would have to revert a patch I already know to be stablizing my server, and expect someone to commit the RCE on it.

On Sat, Aug 28, 2021 at 12:26 PM Kostya M @.***> wrote:

To find it, you can add logging with rescue this exceptions, to see what method, and what result produce exception.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-907669065, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND32ZXVYT3Z2WFH4JK7LLT7ES4LANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 771 0547

robacarp commented 2 years ago

I don't think this is a good change. Silently rescuing whatever failures happen and returning dummy values may help some codepaths but it's also likely to completely obscure other failures and would likely also break integration patterns across the whole system of folks using crystal-redis.

mixflame commented 2 years ago

This is correct. The real issue is that Redis injections exist and can overflow the command system. Not rescuing these within the adapter means failures within the host systems anyway. If there is a cleaner way, I'm all for it. The issue is that crystal-redis can be forced to run commands that don't make sense due to type casting of returned values automatically by crystal-redis.

On Mon, Aug 30, 2021 at 11:28 AM robacarp @.***> wrote:

I don't think this is a good change. Silently rescuing whatever failures happen and returning dummy values may help some codepaths but it's also likely to completely obscure other failures and would likely also break integration patterns across the whole system of folks using crystal-redis.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-908535303, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND327BEJLBQ4BN5KIZMFLT7O5S7ANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 771 0547

mixflame commented 2 years ago

On second thought, I thought of another attack vector. You can possible insert Redis data using normal means by editing input parameters to cause a failed cast on return by web server and crash the server. With this patch, you cannot.

mixflame commented 2 years ago

Here was the error that my solution fixed:

Aug 22 21:22:53 localhost openloft.cr[579548]: Unhandled exception in spawn: cast from Nil to Int64 failed, at /home/gbaldraw/gbaldraw.cr/lib/redis/src/redis/command_execution/value_oriented.cr:13:9:13 (TypeCastError)

kostya commented 2 years ago

can you show more lines from stacktrace?

mixflame commented 2 years ago

Hey @kostya This is the only error reported. There may be also crashes possible in the regular HTTP::Handler loops and crashes which do not produce log lines.

As in, there is no stack trace when this happens. I looked upstream in Amber and it doesn't look like the Unhandled exception is caught there.

kostya commented 2 years ago

i mean you try grab it like this:

def integer_command(request : Request) : Int64
  command(request).as(Int64)
rescue ex : TypeCastError
  File.open("/tmp/crash.log", "a") { |f| f.puts("#{Time.local} - #{ex.message.inspect} - #{ex.backtrace.inspect}") }
  raise ex
end
mixflame commented 2 years ago

Good but I don't know how to force the problem to occur. I don't have the exploit. The only way for me to test this is to use my live server as a shield with this applied just to get the stack trace and hope someone hits it.

kostya commented 2 years ago

Why are you sure that it was attack? May be it was just unexpected call from unexpected arguments, which produce exception. I see that amber inspect exceptions with backtrace which is quite expensive call (i saw some situations, with my webservers, even on production, that getting backtrace from exception, can spend something like 0.5s).

mixflame commented 2 years ago

That's what I think but my site is highly attacked so I know it's likely unexpected arguments from user input possibly on purpose but maybe on accident.

On Sat, Sep 4, 2021 at 3:19 AM Kostya M @.***> wrote:

Why are you sure that it was attack? May be it was just expected call from unexpected arguments, which produce exception. I see that amber inspect exceptions with backtrace which is quite expensive call (i saw that some situations, even on production, it can spend something like 0.5s, just to get backrace).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-912940831, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND32ZXGGHEPGIVI4QJFODUAHQCHANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 771 0547

robacarp commented 2 years ago

I'm sorry to hear about your site crashing from unsanitized user input. That's a rough game to be in the middle of, no question. There are a few things that can be done to mitigate this, but I think the first pass would be to sanitize the user input. It's difficult to say what a great approach for this would be without knowing what you're doing with user input and redis.

The relational model for this is the prepared statement, but redis doesn't have anything like that. A crystal-level abstraction level on top of the redis driver seems like it would provide most of the protections you're after, and it might provide a place for a nicer DSL.

I don't think rescuing invalid responses is a good way to solve the issue of invalid user input. I do think the crystal-redis driver should be capable of receiving whatever responses a valid redis server sends back without a runtime crash, but that should be accurately reflected in the type grammar rather than just rescuing the cast with default values. If the redis server can validly return a nil then the proper way for the library to be written is for it to return a nilable type and let the implementer deal with the ambiguity.

kostya commented 2 years ago

Is i understand the problem correct? There is possible 2 kind of bugs: 1) wrong casting results from redis for some methods called with some arguments (just bug in crystal-redis), we can fix it if we can see backtrace. 2) someone store into redis key list data, but try to execute set, zset operation on it (purposly, attack). But why someone from internet can access to you redis keys? Your application should aware of all operations, and prevent it.

So if it is an attack(2) it should be bug in Amber? Also raising exception is for sure slow operation, but not so much. I think problem that amber take backtrace from exception for logging, and thats huge operation (from my experience, with custom tcp server), and can create high cpu load.

mixflame commented 2 years ago

@Kostya I was experiencing the issue through my site https://openloft.org open source BSD at https://github.com/mixflame/openloft.cr The crash happens because the Amber request loop is broken by exception. For in framework code this causes execution of backtrace handler, but for library code I think it breaks request loop.

@Robacarp Probably I can change this from your comments to be more Redistic and return a type that is more similar to what Redis outputs, so as to avoid disrupting test cases.

On Tue, Sep 7, 2021 at 1:50 PM Kostya M @.***> wrote:

Is i understand the problem correct? There is possible 2 kind of bugs:

  1. wrong casting results from redis for some methods called with some arguments (just bug in crystal-redis), we can fix it if we can see backtrace.
  2. someone store into redis key list data, but try to execute set operation on it (purposly, attack). But why someone from internet can access to you redis keys? Your application should aware of all operations, and prevent it.

So if it is an attack it should be bug in Amber? Also raising exception is for sure slow operation, but not so much. I think problem that amber take backtrace from exception for logging, and thats huge operation (from my experience, with custom tcp server), and can create high cpu load.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-914580339, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND327XODDOSJJOBVESXULUAZUILANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 771 0547

mixflame commented 2 years ago

@Robacarp I added a commit where each command can return nil if the result is nil. This opens up the way for nils to appear in returned objects outside or_nil functions.

kostya commented 2 years ago

In your branch this code return 0, but should raise error. In master this is raised, in Ruby its also raised. Correct behaviour here is raising, i think.

require "./src/redis"

r = Redis.new
r.lpush("a", "1")
p r.sadd("a", "1")
mixflame commented 2 years ago

@kostya Affirmative, you are correct. I will give it a second go around with fresh crystal-redis and isolate the error in my side of the code, and then we can learn what the vulnerability really means.

Edit: I have downloaded production database to local machine and will try to trigger and isolate the issue.

mixflame commented 2 years ago

I believe I solved it... I had a vulnerable HINCRBY that could be exploited by injecting data.

robacarp commented 2 years ago

@mixflame great to hear! It might be helpful to post the code before and after, or a diff, so that anyone who happens to trigger the same behavior accidentally and lands here through extensive google searching might be tipped off about the possible solution.

mixflame commented 2 years ago

I went through the Redis accessing code lines and this looked most suspicious.

# redis.hincrby("all_time", data["name"], 1) if data.has_key?("name") # exploitable by changing values

Someone got names instead of numbers inserted into the all_time table somehow incrementing them would break the server.

mixflame commented 2 years ago

Solution was:

On Fri, Sep 10, 2021 at 1:18 PM robacarp @.***> wrote:

@mixflame https://github.com/mixflame great to hear! It might be helpful to post the code before and after, or a diff, so that anyone who happens to trigger the same behavior accidentally and lands here through extensive google searching might be tipped off about the possible solution.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stefanwille/crystal-redis/pull/122#issuecomment-917151038, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAND323Y7PACATU744OXDELUBJKZ5ANCNFSM5CVEEHNQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

-- Jonathan Silverman tel: (720) 670 2246