colinmarc / impala-ruby

an impala client for ruby
MIT License
34 stars 22 forks source link

Query sanitization removes newlines #23

Closed OlegSmelov closed 7 years ago

OlegSmelov commented 7 years ago

For example, queries like this

select id
from users -- some random comment
where id > 1000

are transformed into

select id from users -- some random comment where id > 1000

removing the where part of the query.

The problem seems to be in the sanitize_query method: https://github.com/colinmarc/impala-ruby/blob/e52e50d1bbdb882403f5bbf058dea6065817a3f6/lib/impala/connection.rb#L90-L96

> "test\r\ntest".split.join(' ')
=> "test test"
colinmarc commented 7 years ago

Out of curiosity, how does the impala shell treat comments? I'm not sure if there's a simple way to do this without introducing an SQL parser, but this does have some security implications.

OlegSmelov commented 7 years ago

I haven't used Impala shell, but single line comments in Hue are terminated by a newline.

colinmarc commented 7 years ago

I don't have an impala instance handy to test with - what happens if you remove the newline-cleaning? Does it handle multiline queries correctly? Maybe that was a mistake to begin with.

OlegSmelov commented 7 years ago

Removing newline cleaning makes multiline queries work as expected.

However, I'm not sure what kind of issues query sanitization was supposed to prevent. Maybe there's now some kind of exploit possible, I can't tell. 😄

OlegSmelov commented 7 years ago

I just realized there's a yet another side effect: if you have whitespace in strings, it removes it too:

> sanitize_query("select * from users where name = '   test    '")
=> "select * from users where name = ' test '"

So I guess we either need to rewrite sanitization, or remove it altogether.

colinmarc commented 7 years ago

I think the sanitize thing was just misguided. I'll remove it.

colinmarc commented 7 years ago

Ok, I pushed a new release - want to make sure it works for you?

OlegSmelov commented 7 years ago

Works for me 👍

Thank you for taking the time to fix this! 👏