The Sequelize query builder could accept SQL injections

j-a-m-l commented 6 years ago

Since it's using raw queries, but they aren't escaped properly: https://github.com/fix/ark-core/blob/575ae88505a1a92a648825f9069c2ff381e79280/packages/core-database-sequelize/lib/query-builder.js

It could be fixed using replacements, probably: http://docs.sequelizejs.com/manual/tutorial/raw-queries.html

faustbrian commented 6 years ago

Yeah but that is not a priority for devnet launch as it doesn't break anything so this is an easy fix. Feel free to send a PR for it, I am not going to spend any time on it until devnet as there are more important issues to be solved.

faustbrian commented 6 years ago

Quickly tested the Sequelize replacements for raw queries and they don't seem to work for PostgreSQL as the escaping character is hardcoded to ' for replacements as opposed to " what is needed.

https://github.com/sequelize/sequelize/blob/master/lib/sql-string.js#L91

Expected SELECT "blockId","serialized" FROM "transactions" LIMIT 5

Actual SELECT 'blockId','serialized' FROM 'transactions' LIMIT 5

spkjp commented 6 years ago

I created a branch where replacements are used for everything except for column and table names: https://github.com/supaiku0/core/commit/97c9ce26ef6ddd6093db9b7cc057fa3661a2e6fb

The result is a mix of double and single quotes: SELECT "serialized" FROM "transactions" WHERE "blockId" = '13149578060728881902'

The root issue is that Postgres does not accept single quotes on column and table names.

Wouldn't it be the easiest to ship a custom sql-string.js which can escape column and table names and use it instead of sequelize replacements?

faustbrian commented 6 years ago

The quoting issue is why I didn't end up using replacements as Postgres will be the primary database driver.

A custom sql-string.js would be nice if we could include it without forking or some hacky tricks.

zillionn commented 6 years ago

And why you're using quotes for table and column names?

faustbrian commented 6 years ago

Core sadly doesn't follow any standard naming conventions like database columns should be snake case - block_id instead of blockId which causes problems with Postgres as it will treat blockId as blockid without using quotes for "blockId".

zillionn commented 6 years ago

I see. There is this "quoteIdentifiers" option but not sure if it's a good thing to use it.

options.quoteIdentifiers | Boolean | optional default: true Set to false to make table names and attributes case-insensitive on Postgres and skip double quoting of them. WARNING: Setting this to false may expose vulnerabilities and is not recommended!

spkjp commented 6 years ago

Good point. I tried running with quoteIdentifiers off:

ALTER TABLE rounds ADD CONSTRAINT rounds_unique UNIQUE (publicKey, round);
[2018-06-23 11:34:09][ERROR]   : Unable to connect to the database 
SequelizeDatabaseError: column "publickey" named in key does not exist

Unfortunately, it looks like it triggers the issue @faustbrian mentioned. publickey instead of publicKey.

Sequelize's sql-string is based on mysqljs/sqlstring. See comments here https://github.com/sequelize/sequelize/issues/1132

Using the Sequelize version is not feasible without a lot of work, since they added a lot of dependencies.

Therefore I used the original sqlstring and modified it to allow double quoting, you can checkout my branch here: https://github.com/supaiku0/core/commit/046c2ec2790998203467205d38043d7437a35b8b

Please look if anything is missing or wrong. My relay is running without problems so far.

Btw. I'm not sure if the attribution should be moved into LICENSE or stay there. The file is MIT licensed.

zillionn commented 6 years ago

Oh, I misunderstand him. Yes, the quoteIdentifiers: false won't work if the column name is not lowercase -> publicKey.

And I guess renaming all tables and columns to lowercase is not an option?

sleepdefic1t commented 6 years ago

http://blog.lerner.co.il/quoting-postgresql/

Single quotes return text strings. Double quotes return (if you can really think of them as “returning” anything) identifiers, but with the case preserved.

If I understand, double is basically returning a label; and postgresql apparently doesn’t play nice w/upper case without using escaping and/or block-quote(as in @supaiku0’s fix).

Renaming might break backwards compatibility.

In Cpp, we might use some sort of preprocessor defines like:

#if defined(USE_X_FORMAT)
  #define \’blockid\’ “blockId”
  #define \’publickey\’ “publicKey”
#endif

No idea how that might work in js; but if possible, something similar may be an easier fix than an additional dependency. Basically just create constants for the labels in the format we need.

faustbrian commented 6 years ago

The problem with transformers is that they add a lot of overhead for thousands of records, that is why I didn't implement them as a solution for this.

Renaming the database columns is touchy because you would have to find every single instance in the code that uses transaction.blockId and replace it with transaction.blockId.

What is touchy about that is that not every transaction variable is a model, it can be either a raw object or a model from the packages/crypto/lib/models folder. A single mistake and the whole application could break since javascript doesn't exactly have the best type handling system or handling of empty variables which might result in no error being thrown but no data or wrong data being outputted.

If anyone wants to do the huge refactor of changing the columns to snake_case I am all for it as that is the right thing to do.

spkjp commented 6 years ago

@zillionn @faustbrian Anyone working on it (the renaming)? If not I'll try my luck.

faustbrian commented 6 years ago

Feel free to work on it.

spkjp commented 6 years ago

Ok, my first attempt looks promising. When one provides the query method with a fieldMap (taken from a model) the fields will be automatically matched (snake_case => camelCase).

Field map from Block model: screenshot_20180625_211748

DB result: screenshot_20180625_211724

Final result: screenshot_20180625_211637

There are more ways to achieve the field mapping via other options, like providing an instance or the model instead of a field map. Will see how that goes. Once I think I ironed out most of the bugs I'll open a PR for feedback and testing.

Related: Right now it's necessary to explicitily define an alias field for the column name in the model:

blockId: {type:..., field: 'block_id'}.

With Sequelize 5 a new underscore option becomes available which respects the Postgres naming convention and turns all camelCase to snake_case behind the scenes. Meaning that the explicit field annotations become obsolete.

Cheers.

faustbrian commented 6 years ago

I saw that feature coming in Sequelize 5 a few months ago but didn't give the beta a go because there were still inconsistencies with the postgres packages and syntax.

The problem with the models is that they are not really returned anymore because we replaced most or everything at this point with raw queries so that Sequelize does not process the data through models because that had a major impact on performance of the queries.

The queries would execute relatively fast while testing but the process of turning the objects into Sequelize models would take longer then the queries itself.

spkjp commented 6 years ago

Just to make sure, but the raw queries are still intact. I only added a new parameter: screenshot_20180626_060757

The fieldMap approach shouldn't impact the performance much if at all:

(From sequelize/lib/dialects/abstract.js) screenshot_20180626_055509

Returned objects are still 'raw' objects in the sense of not being constructed by a model constructor. Definitely interesting to see how it impacts performance once it runs properly.

zillionn commented 6 years ago

Does this also solve the SQL injection issue?

And doesn't fieldMap work only with SELECT?

ArkEcosystem / core

The Sequelize query builder could accept SQL injections #560