JetBrains / Exposed

Kotlin SQL Framework
http://jetbrains.github.io/Exposed/
Apache License 2.0
8.29k stars 690 forks source link

Postgres column names are case sensitive (uppercase / lowercase) #683

Open bodiam opened 4 years ago

bodiam commented 4 years ago

Hi all,

I'm upgrading from exposed 0.12.1 to 0.17.7, and suddenly my tests are failing. This is because of the casing of the columns. I've reverted back to our old Exposed.

In our Exposed classes, our columns are mapped in upper case:

    val G: Column<Int> = 
            integer("G")

but when executing the query, Exposed is doing an insert using a lower case g:

SQL: [INSERT INTO raw_data.raw ("g") VALUES (?) 

Which results in an error:

Caused by: org.postgresql.util.PSQLException: ERROR: column "g" of relation "raw" does not exist

This doesn't happen for all our columns, we also have columns with spaces in it, which seem to be unaffected.

MarkusKramer commented 4 years ago

I ran into a related issue with exposed 0.17.7 on H2 (mysql mode). Column name "something" is converted to "SOMETHING" but "someThing" is left as "someThing". There is some weird conversion logic in Database.kt#quoteIdentifierWhenWrongCaseOrNecessary

bodiam commented 4 years ago

Hi Markus,

I'm sorry to hear you're affected by a similar issues. It seems however that there's not a lot of feedback happening on Exposed. I hope you have more luck.

Cheers,

Erik

On 3 Jan 2020, at 20:46, Markus Kramer notifications@github.com wrote:

 I ran into a related issue with exposed 0.17.7 on H2 (mysql mode). Column name "something" is converted to "SOMETHING" but "someThing" is left as "someThing". There is some weird conversion logic in Database.kt#quoteIdentifierWhenWrongCaseOrNecessary

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

latyika94 commented 4 years ago

I have the same problem with exposed 0.21.1, with both table and column name also.

Our table definition is:

object ServiceDeskTable : IntIdTable("AO_54307E_SERVICEDESK") {
    val projectId = long("PROJECT_ID").nullable()
    val disabled = bool("DISABLED").nullable()
    override val id: Column<EntityID<Int>> = integer("ID").entityId()
}

And when i run query it's generate this:

[Exposed] Transaction attempt #1 failed: org.postgresql.util.PSQLException: ERROR: relation "ao_54307e_servicedesk" does not exist
      Position: 130. Statement(s): SELECT ao_54307e_servicedesk.id, ao_54307e_servicedesk.project_id, ao_54307e_servicedesk.disabled, ao_54307e_servicedesk.id FROM ao_54307e_servicedesk WHERE ao_54307e_servicedesk.project_id = ?
org.jetbrains.exposed.exceptions.ExposedSQLException: org.postgresql.util.PSQLException: ERROR: relation "ao_54307e_servicedesk" does not exist
  Position: 130
SQL: [SELECT ao_54307e_servicedesk.id, ao_54307e_servicedesk.project_id, ao_54307e_servicedesk.disabled, ao_54307e_servicedesk.id FROM ao_54307e_servicedesk WHERE ao_54307e_servicedesk.project_id = ?]
    at org.jetbrains.exposed.sql.statements.Statement.executeIn$exposed_core(Statement.kt:63)
ib-jamesp commented 4 years ago

@Tapac, I see this is assigned to you. Is there a conversation about a fix happening, or what needs to be done to fix? Not sure if I can be of any assistance, but would help where I can.

A quick, and uneducated look at the source, and I thought of adding an optional parameter to Table() to take the name provided, such as:

open class Table(name: String = "", useExplicitName: Boolean = false)

Then changing

fun nameInDatabaseCase(): String = if(useExplicitName) {tableName} else {tableName.inProperCase()}

I see that Exposed uses the database's metadata in inProperCase, so not sure of the original strategy.

Tapac commented 3 years ago

Hello everyone, sorry for the long silence but I miss the issue somehow.

Can someone describe your expected behavior for different cases of table/colums naming in Postgres? As I can see from documentation:

Key words and unquoted identifiers are case insensitive.

So it's doesn't matter how you map them in Exposed: 'MyTable", "myTABLE", or "MyTaBlE".

The opposite thing is when you have already defined a table with quoted name (that makes identifiers case sensitive), in that case you have to quote the table name in Exposed mapping too.

Maybe I miss something but checking a few combinations with both tables and columns (with and without quotes ) against Postgresql are finished with success.

I'm not sure but maybe the behavior was affected by other fixes, can you please check against the latest version?

jorpic commented 3 years ago

You can explicitly put quotes around case sensitive names:

object MyTable :  Table("\"MyTable\"") {
    val someField = text("\"someField\"")
}

Generated queries preserve those quotes:

INSERT INTO "MyTable" ("someField") VALUES ($1)
maxwell-carbon commented 3 years ago

Be careful if you put quotes in your tablenames -- identifiers that Exposed generates, like constraint names for foreign keys, will try to include those quotes verbatim, which breaks horribly, e.g. fk_"MyTable"_id.

AlexeySoshin commented 1 year ago

Just to clarify, for those that read this thread in 2023+ The issue @maxwell-carbon refers to was fixed in #1562

utybo commented 1 year ago

Is there any way to fix the root issue here? I'd like to use my table objects with both MySQL and PostgreSQL, but as it stands with the quoting workaround, quoted names (e.g. db."MyTable") do not work with MySQL

bog-walk commented 1 year ago

Hi @utybo I'm attempting to narrow down the goal of this issue and would appreciate any clarification regarding your comment about using a table object across both PostgreSQL and MySQL.

If, for example, I assume that you're using a basic table like so to retain identifier casing and quotes:

object Tester : Table("\"MyTable\"") {
    val col = integer("\"mYcOlUmN\"")
}

// Stored internally in:
// PostgreSQL -> MyTable, mYcOlUmN
// MySQL -> "MyTable", "mYcOlUmN"

In PostgreSQL, generated SQL doesn't alter anything and produces an identifier like "MyTable"."mYcOlUmN". In MySQL, the backtick is required as the identifier quote character, so this should be produced `"MyTable"`.`"mYcOlUmN"`.

Is it Exposed or a migration script that is producing db."MyTable" without backticks in MySQL?

Is the expectation that MySQL should treat escaped quotes as an indicator to replace with backticks? To produce this instead `MyTable`.`mYcOlUmN` ? And, if so, what if the user actually does want to include the double-quotes character?

As an alternative, if the quoting workaround is not used:

object Tester : Table("MyTable") {
    val col = integer("mYcOlUmN")
}

// Stored internally in:
// PostgreSQL -> mytable, mYcOlUmN
// MySQL -> MyTable, mYcOlUmN

In PostgreSQL, the identifier produced is mytable."mYcOlUmN". In MySQL, this produces an identifier MyTable.mYcOlUmN, which is equivalently accepted as `MyTable`.`mYcOlUmN`.

Is the expectation here that Exposed should not auto-quote the column name to retain user-casing in PostgreSQL?

But if both identifiers were sent to the databases as is, without any processing by Exposed, MySQL would store them unchanged, while PostgreSQL would auto-fold to lower case (losing case sensitivity on the column name). So the only way to keep an equivalent identifier across both databases would be to use a single lower case or to quote it.

Or is the point that users should have the option to send the identifiers to the database unchanged, exactly as defined in the table object, and any issues that arise from case-processing within the database will be resolved by the user?

If I've missed the point or the desired expectation, I'd appreciate any example that shows a table mapping, what the expected vs actual behavior is, and, if applicable, what migration tool is used. All tests running on the older examples above are passing with no exception (on version 0.41.1+).

utybo commented 1 year ago

Hi @bog-walk, the main idea here was to generate Exposed table objects that work with both Postgres and MySQL. My apologies, I really did not word that reply correctly.

The situation is that a table, within the database, is named MyTable (in this case, Prisma is creating this table while forcing the casing to match the desired schema, which is why we end up with uppercase letters in the database).

I'd like to create an Exposed table object that would be able to access this, but I'd like said table to work with both Postgres and MySQL.

Is the expectation that MySQL should treat escaped quotes as an indicator to replace with backticks? To produce this instead MyTable.mYcOlUmN ? And, if so, what if the user actually does want to include the double-quotes character?

This was more or less my expectation, but there is a pretty easy workaround for the MySQL driver to work like this. Simply add ?sessionVariables=sql_mode=ANSI_QUOTES to force MySQL to interpret " as the escaping character instead of backticks.

Or is the point that users should have the option to send the identifiers to the database unchanged, exactly as defined in the table object, and any issues that arise from case-processing within the database will be resolved by the user?

I think I'd prefer an option somewhere to tell Exposed to escape the table name. Ideally, I'd like some option so when I say : Table("MyTable") in my Exposed definitions, I really mean the table named MyTable with that casing. This would probably involve some escaping based on the configured escape character for the connection -- I do not know anything about JDBC internals so I can't say how to do that.

Here is a table of what works and what doesn't at the moment, considering a table literally named MyTable in the database:

Database \ Table object definition : Table("MyTable") : Table("\"MyTable\"")
MySQL
Postgres
MySQL with sqlMode=ANSI_QUOTES

Here's a repro sample using Testcontainers for creating dbs on the fly:

Repro sample
Not the best code I've ever written, but this is just a hack to test this quickly ```kotlin object MyTableUnquoted : Table("MyTable") { val someId = integer("someId") } object MyTableQuoted : Table("\"MyTable\"") { val someId = integer("someId") } @Testcontainers class MysqlReproTest { @Container val mysql = GenericContainer("mysql:8.0").apply { withExposedPorts(3306) withEnv("MYSQL_ROOT_PASSWORD", "root") withEnv("MYSQL_DATABASE", "db") } lateinit var mysqlDb: Database lateinit var mysqlDbAnsiQuotes: Database @BeforeEach fun initDb() { mysqlDb = Database.connect( "jdbc:mysql://${mysql.host}:${mysql.firstMappedPort}/db", driver = "com.mysql.cj.jdbc.Driver", user = "root", password = "root" ) mysqlDbAnsiQuotes = Database.connect( "jdbc:mysql://${mysql.host}:${mysql.firstMappedPort}/db?sessionVariables=sql_mode=ANSI_QUOTES", driver = "com.mysql.cj.jdbc.Driver", user = "root", password = "root" ) transaction(db = mysqlDb) { TransactionManager.current().exec("CREATE TABLE `MyTable` (someId INT PRIMARY KEY)") } } @Test fun `MySQL unquoted`() { transaction(db = mysqlDb) { MyTableUnquoted.insert { it[someId] = 123 } val value = MyTableUnquoted.select(MyTableUnquoted.someId.eq(123)).single() assertEquals(123, value[MyTableUnquoted.someId]) } } @Test fun `MySQL quoted`() { transaction(db = mysqlDb) { MyTableQuoted.insert { it[someId] = 123 } val value = MyTableQuoted.select(MyTableQuoted.someId.eq(123)).single() assertEquals(123, value[MyTableQuoted.someId]) } } @Test fun `MySQL ANSI_QUOTES unquoted`() { transaction(db = mysqlDbAnsiQuotes) { MyTableUnquoted.insert { it[someId] = 123 } val value = MyTableUnquoted.select(MyTableUnquoted.someId.eq(123)).single() assertEquals(123, value[MyTableUnquoted.someId]) } } @Test fun `MySQL ANSI_QUOTES quoted`() { transaction(db = mysqlDbAnsiQuotes) { MyTableQuoted.insert { it[someId] = 123 } val value = MyTableQuoted.select(MyTableQuoted.someId.eq(123)).single() assertEquals(123, value[MyTableQuoted.someId]) } } } @Testcontainers class PostgresReproTest { @Container val pgsql = GenericContainer("postgres:15.2-alpine").apply { withExposedPorts(5432) withEnv("POSTGRES_PASSWORD", "postgres") } lateinit var postgresDb: Database @BeforeEach fun initDb() { postgresDb = Database.connect( "jdbc:postgresql://${pgsql.host}:${pgsql.firstMappedPort}/postgres", driver = "org.postgresql.Driver", user = "postgres", password = "postgres" ) transaction(db = postgresDb) { TransactionManager.current().exec("CREATE TABLE \"MyTable\" (\"someId\" INT PRIMARY KEY)") } } @Test fun `Postgres unquoted`() { transaction(db = postgresDb) { MyTableUnquoted.insert { it[someId] = 123 } val value = MyTableUnquoted.select(MyTableUnquoted.someId.eq(123)).single() assertEquals(123, value[MyTableUnquoted.someId]) } } @Test fun `Postgres quoted`() { transaction(db = postgresDb) { MyTableQuoted.insert { it[someId] = 123 } val value = MyTableQuoted.select(MyTableQuoted.someId.eq(123)).single() assertEquals(123, value[MyTableQuoted.someId]) } } } ```

TL;DR: Due to different quoting behavior, if you want your tables defined as : Table("\"MyTable\"") to work with both MySQL and PostgreSQL, add ?sessionVariables=sql_mode=ANSI_QUOTES at the end of your connection string for MySQL.

bog-walk commented 1 year ago

Thanks very much for the detailed response and the MySQL workaround @utybo .

Regarding the feature flag for auto-escaping identifiers, I've requested some feedback in another issue (issue #1658), but would appreciate your thoughts here too:

Proposal:

Questions:

utybo commented 1 year ago

Questions:

  • Would the expectation be that this flag is global, with a blanket effect on all table objects?
  • If so, what if the user then wants to exclude a single table from the flag? Is it expected that opt-in/opt-out would also be configurable on a table-by-table basis?

Hello @bog-walk , apologies for the delay on my end on this. I think having both options would be good. Having both a global flag and a table-level flag would help to fix these trickier cases (see the table below).

For example, here's an annotation at the table level:

annotation class PreserveTableNameCasing(val enabled: Boolean = true)

@PreserveTableNameCasing
class SomeTable : Table("SoMETable") {
    // ...
}

And a global flag (not sure if this is the best place for this):

Database.connect(
    // ...
    databaseConfig = DatabaseConfig {
         preserveTableNameCasing = true
    }
)

The logic here would be that users can either:

Where A is do not preserve casing (the current behavior) and B is preserve casing:

Global flag \ Table annotation Absent enabled = true enabled = false
false or absent A B A
true B B A

All of this would be more work than just having one or the other option, but is the better way to go in my opinion. Or, if having both options is not doable, have only the table annotation, that would still be the more flexible option.

valeriyo commented 1 month ago

Table names are still mangled in 0.52.0.

In Postgres, actual table name is "MyTable", using Table("MyTable"), but when selecting the generated SQL refers to "mytable" which is not found :(

PSQLException: ERROR: relation "mytable" does not exist
bog-walk commented 1 month ago

@valeriyo This is the expected behavior based on PostgreSQL's case insensitive identifier syntax and following PostgreSQL's own logic of folding unquoted identifiers to lower case. Even if the generated SQL didn't show what was going to happen and the following plain SQL was sent to the database untouched by Exposed, it would still create a new table called "mytable" not "MyTable" and querying metadata for all existing tables would return public.mytable: exec("CREATE TABLE IF NOT EXISTS MyTable (my_number INT NOT NULL)")

If there is already an existing table in the database that is actually named "MyTable", that could only have been accomplished by quoting the identifier to make it case sensitive: exec("CREATE TABLE IF NOT EXISTS \"MyTable\" (my_number INT NOT NULL)") And all future statements/queries would need to then also quote the table name to not throw exceptions. Currently you can inform Exposed that the name is case sensitive by manually quoting it Table("\"MyTable\"").

So I don't see how it was possible for Exposed to create an actual table name "MyTable" using Table("MyTable") as you mentioned above. Please clarify whether or not Exposed was used to create the table or if it already existed.

And if we do choose to implement any of the flags proposed above, it will mean Exposed purposefully quoting every flagged identifier so that the user doesn't have to, in order to purposefully go against the normal lower case folding, not the other way around.

valeriyo commented 1 month ago

Yes, I create tables using actual SQL files with flyway to manage schema migrations, a la:

CREATE TABLE "MyTable" (...)

The question is -- when I specify Table("MyTable") -- why is that not sufficient for Exposed to treat it as such? I didn't write Table("mytable") - I specifically chose "MyTable" because that's what I wanted. So, why would it let it be mangled by a freaking old SQL naming convention that's as out of date as MS-DOS with its 8.3 FAT file system?

Me: Table("MyTable") please! 🙂 Exposed: Nah, man, you really didn't mean it.... "mytable" it is! 🤪 Me: Why are you serving me "mytable"? 😰

Me: Table("\"MyTable\"") WTF?! 🤬 Exposed: Okay, okay, I'll do what you want, but don't be mad if things go wrong at some point. 😬 Me: I don't want to sign up for that, doesn't look right 🤕

Either way, stuff like this doesn't instill confidence.