oltpbenchmark / oltpbench

Database Benchmarking Framework
http://oltpbenchmark.com/
Other
402 stars 267 forks source link

Case sensitivity issue DB2 LUW #15

Open mgudmund opened 11 years ago

mgudmund commented 11 years ago

When I try run the TPCC benchmark against a DB2 Linux database i get problems. I run with the following command: ./oltpbenchmark -b tpcc -c config/tpcc_config_db2.xml --create=true --load=true --execute=true -s 5 -o outputfile

The problem arises during load, and what happens is that the inserts are using tablenamse surrounded by ". Since the tables are created without, DB2 stores them as uppercase. Whe you the try to do inserts and name them with lowercase in within "-chars, DB2 searches for the tables in lowercase, because that's how the are named in your TPCCConstants.

So, the load failes with SQLCODE -204(table not found.)

I managed to get by this by making sure the method getEscapedName never is used in SQLUtil.java, method getInsertStatement. Instead I replaced it with getName instead.

This needs to be sorted out to make sure you are being consistent on case in tablenames. either create them as lowercase in db2, or user uppercase all they way through.

Best regards, Martin

mgudmund commented 11 years ago

Hi! Look like the same issue exists in clearDatabase() when runinng with --clear=true

Br, Martin

apavlo commented 11 years ago

I remember that it was a problem to try to keep everything consistent between the various systems. Some DBMSs will automatically uppercase everything and others would automatically lowercase everything. I think the solution was to just uppercase everything (in the queries + DDL) and then use the getEscapedName() like you did . Can you try that and see if that works for DB2? Then you won't have to hack up the core API code.

We also had a problem where the keyword "user" was reserved in some systems but not others.

apavlo commented 11 years ago

In looking through the code real quickly, unfortunately we're inconsistent with our capitalization in the queries and DDLS. This is partly because Carlo and I mashed our two benchmarking frameworks together.

The way that you're really suppose to workaround this problem is to use the dialects.xml file for the benchmark to correct any capitalization issues for specific queries.

mgudmund commented 11 years ago

Ok, I will try this. But as far as I can tell, in the dialects.xml only selects are specified. This only happens during the load phase, i.e Inserts. Or do i add a new statemnent name and specify the insert statemnt name?

apavlo commented 11 years ago

Sorry, I missed your earlier point.

It looks like that there a DB2-specific DDL file for TPC-C. The framework will automatically load that if you set the <dbtype> to DB2 in your config XML file.

If you change the case for the columns in that file, does that fix your problem? If so, can you send me a pull request with your fixes?

I'm sorry that I can't look further into a more elegant solution to fixing this. I'm scrambling to finish a bunch of deadlines this month. We were going to try to use a third-party library to automatically translate the SQL for each DBMS, but we ran out of time last year.

mgudmund commented 11 years ago

OK, think I fixed it. I tried it before but named my dbtype db2 instead of DB2. That's probably why it didn't work.

Anyway, to fix the problem, i just added "" around the tablenames in tpcc-db2-ddl.sql. This tells the database to actually use lowercase. Without them they are translated to uppercase.

DB2 is following ANSI standards so, I guess other databases might run into the same issues. I will try Oracle as well in a couple of days when DB2 is OK for me.

Not sure how to do a pull request, but i'll try to figure it out.

Thanks for you support.

Br, Martin

mgudmund commented 11 years ago

OK, changing the ddl fixed the load issue.

But then when the tpcc test starts it fails because that part looks for the tables in uppercase. So, it's a catch 22. I think we need to fix the loader to not quote the table names.

And remove the quoted names from the ddl file again.

I'll dig deeper into the code and try to find what's causing the quoted tablenames in the load and clear parts.

Br, Martin

davidlday commented 10 years ago

I'm still testing, but I may have the fix. Unfortunately, I don't have a DB2 instance set up, and probably won't for a few more months. Here's what I did in case you want to test.

1) Convert the generic DDL file (tpcc-ddl.sql) to uppercase. 2) Convert the string values (table names) in TPCCConstants.java to uppercase.

So far I'm able to run for Oracle, SQL Server, and MySQL. However, for MySQL, I also had to convert tpcc-mysql-ddl.sql to uppercase.

I suggest the convention be to always use uppercase in DDL and DML. I'll commit to my branch and request merge by end of week if everything pans out.

mgudmund commented 10 years ago

ons 2013-12-04 klockan 06:31 -0800 skrev David L. Day:

I'm still testing, but I may have the fix. Unfortunately, I don't have a DB2 instance set up, and probably won't for a few more months. Here's what I did in case you want to test. Yes I will try! 1) Convert the generic DDL file (tpcc-ddl.sql) to uppercase. 2) Convert the string values (table names) in TPCCConstants.java to uppercase. Thanks. Seems easy enough. So far I'm able to run for Oracle, SQL Server, and MySQL. However, for MySQL, I also had to convert tpcc-mysql-ddl.sql to uppercase.

I suggest the convention be to always use uppercase in DDL and DML. I'll commit to my branch and request merge by end of week if everything pans out. Agreed. Uppercase is better than lowercase since no database will convert upper to lower. BUt some will convert lower to upper. Either way it would be beneficial to do it in an explicit and a common way for all supported databases. Thanks for looking into this.

I I have a chance to test it, i will get back to you with results.

Br, Martin

— Reply to this email directly or view it on GitHub.