Closed j-a-m-l closed 6 years ago
Yeah but that is not a priority for devnet launch as it doesn't break anything so this is an easy fix. Feel free to send a PR for it, I am not going to spend any time on it until devnet as there are more important issues to be solved.
Quickly tested the Sequelize replacements for raw queries and they don't seem to work for PostgreSQL as the escaping character is hardcoded to '
for replacements as opposed to "
what is needed.
https://github.com/sequelize/sequelize/blob/master/lib/sql-string.js#L91
Expected
SELECT "blockId","serialized" FROM "transactions" LIMIT 5
Actual
SELECT 'blockId','serialized' FROM 'transactions' LIMIT 5
I created a branch where replacements are used for everything except for column and table names: https://github.com/supaiku0/core/commit/97c9ce26ef6ddd6093db9b7cc057fa3661a2e6fb
The result is a mix of double and single quotes:
SELECT "serialized" FROM "transactions" WHERE "blockId" = '13149578060728881902'
The root issue is that Postgres does not accept single quotes on column and table names.
Wouldn't it be the easiest to ship a custom sql-string.js which can escape column and table names and use it instead of sequelize replacements?
The quoting issue is why I didn't end up using replacements as Postgres will be the primary database driver.
A custom sql-string.js would be nice if we could include it without forking or some hacky tricks.
And why you're using quotes for table and column names?
Core sadly doesn't follow any standard naming conventions like database columns should be snake case - block_id
instead of blockId
which causes problems with Postgres as it will treat blockId
as blockid
without using quotes for "blockId"
.
I see. There is this "quoteIdentifiers" option but not sure if it's a good thing to use it.
options.quoteIdentifiers | Boolean | optional default: true Set to false to make table names and attributes case-insensitive on Postgres and skip double quoting of them. WARNING: Setting this to false may expose vulnerabilities and is not recommended!
Good point. I tried running with quoteIdentifiers
off:
ALTER TABLE rounds ADD CONSTRAINT rounds_unique UNIQUE (publicKey, round);
[2018-06-23 11:34:09][ERROR] : Unable to connect to the database
SequelizeDatabaseError: column "publickey" named in key does not exist
Unfortunately, it looks like it triggers the issue @faustbrian mentioned. publickey
instead of publicKey
.
Sequelize's sql-string is based on mysqljs/sqlstring. See comments here https://github.com/sequelize/sequelize/issues/1132
Using the Sequelize version is not feasible without a lot of work, since they added a lot of dependencies.
Therefore I used the original sqlstring and modified it to allow double quoting, you can checkout my branch here: https://github.com/supaiku0/core/commit/046c2ec2790998203467205d38043d7437a35b8b
Please look if anything is missing or wrong. My relay is running without problems so far.
Btw. I'm not sure if the attribution should be moved into LICENSE or stay there. The file is MIT licensed.
Oh, I misunderstand him. Yes, the quoteIdentifiers: false
won't work if the column name is not lowercase -> publicKey
.
And I guess renaming all tables and columns to lowercase is not an option?
http://blog.lerner.co.il/quoting-postgresql/
Single quotes return text strings. Double quotes return (if you can really think of them as “returning” anything) identifiers, but with the case preserved.
If I understand, double is basically returning a label; and postgresql apparently doesn’t play nice w/upper case without using escaping and/or block-quote(as in @supaiku0’s fix).
Renaming might break backwards compatibility.
In Cpp, we might use some sort of preprocessor defines like:
#if defined(USE_X_FORMAT)
#define \’blockid\’ “blockId”
#define \’publickey\’ “publicKey”
#endif
No idea how that might work in js; but if possible, something similar may be an easier fix than an additional dependency. Basically just create constants for the labels in the format we need.
The problem with transformers is that they add a lot of overhead for thousands of records, that is why I didn't implement them as a solution for this.
Renaming the database columns is touchy because you would have to find every single instance in the code that uses transaction.blockId
and replace it with transaction.blockId
.
What is touchy about that is that not every transaction
variable is a model, it can be either a raw object or a model from the packages/crypto/lib/models
folder. A single mistake and the whole application could break since javascript doesn't exactly have the best type handling system or handling of empty variables which might result in no error being thrown but no data or wrong data being outputted.
If anyone wants to do the huge refactor of changing the columns to snake_case
I am all for it as that is the right thing to do.
@zillionn @faustbrian Anyone working on it (the renaming)? If not I'll try my luck.
Feel free to work on it.
Ok, my first attempt looks promising. When one provides the query method with a fieldMap (taken from a model) the fields will be automatically matched (snake_case => camelCase).
Field map from Block model:
DB result:
Final result:
There are more ways to achieve the field mapping via other options, like providing an instance or the model instead of a field map. Will see how that goes. Once I think I ironed out most of the bugs I'll open a PR for feedback and testing.
Related: Right now it's necessary to explicitily define an alias field for the column name in the model:
blockId: {type:..., field: 'block_id'}.
With Sequelize 5 a new underscore option becomes available which respects the Postgres naming convention and turns all camelCase to snake_case behind the scenes. Meaning that the explicit field annotations become obsolete.
Cheers.
I saw that feature coming in Sequelize 5 a few months ago but didn't give the beta a go because there were still inconsistencies with the postgres packages and syntax.
The problem with the models is that they are not really returned anymore because we replaced most or everything at this point with raw queries so that Sequelize does not process the data through models because that had a major impact on performance of the queries.
The queries would execute relatively fast while testing but the process of turning the objects into Sequelize models would take longer then the queries itself.
Just to make sure, but the raw queries are still intact. I only added a new parameter:
The fieldMap approach shouldn't impact the performance much if at all:
(From sequelize/lib/dialects/abstract.js)
Returned objects are still 'raw' objects in the sense of not being constructed by a model constructor. Definitely interesting to see how it impacts performance once it runs properly.
Does this also solve the SQL injection issue?
And doesn't fieldMap
work only with SELECT
?
Since it's using raw queries, but they aren't escaped properly: https://github.com/fix/ark-core/blob/575ae88505a1a92a648825f9069c2ff381e79280/packages/core-database-sequelize/lib/query-builder.js
It could be fixed using replacements, probably: http://docs.sequelizejs.com/manual/tutorial/raw-queries.html