Open cberner opened 9 years ago
+1
I don't think the last one is correct. Case insensitive matching should only happen if the target is the correct case.
You can have multiple identifiers differing only in case. Case insensitive matching should match at most one.
@electrum correct. I updated the description.
Is there any plan to get this fixed so I could use Upper case in table name?
I took a look at this a few weeks back. The problem is the change is spread throughout the codebase and in parts that are actively being changed. After @martint's planner changes are in and @haozhun's changes to field dereferencing, the change should be doable.
Is there any other way to use upper case in database name and table name while querying from presto
@ashish6976 This issues is about changing the current Presto behavior to support full case sensitive identifiers.
From the users perspective, the current behavior of only supporting case insensitive identifiers should just work, unless you happen to have an existing system containing identifiers that only differ in case. If you are seeing failures when there is only one identifier, please file a new issue since that is a bug.
@dain Here are the scenarios in which I am facing isssues
My MYSQL Server is running on Centos. I am using presto to query my MYSQL database using MYSQL connector where my catalog name is mysql.
Database Name - TestDB Table Names - EmployeeDetails, EmployeeTable Query 1 - show schemas from mysql; Output - Schema
information_schema
performance_schema
testdb
(3 rows)
Query 20150818_064410_00003_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [3 rows, 61B] [25 rows/s, 524B/s]
Query 2 - show tables from mysql.testdb; Output - Table
(0 rows)
Query 20150818_064532_00004_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [0 rows, 0B] [0 rows/s, 0B/s]
In this case presto is not able to Fetch the table names which are present in database TestDB.
The Mysql output mysql> show tables from TestDB; +------------------+ | Tables_in_TestDB | +------------------+ | EmployeeDetails | | EmployeeTable | +------------------+ 2 rows in set (0.00 sec)
Database Name - lowercasedb Table Names - TableOne, TableTwo Query 1 - show schemas from mysql; Output - Schema
information_schema
lowercasedb
performance_schema
testdb
(4 rows)
Query 20150818_065347_00005_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [4 rows, 77B] [27 rows/s, 522B/s]
Query 2 - show tables from mysql.lowercasedb; Output - Table
tableone tabletwo (2 rows)
Query 20150818_065432_00006_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [2 rows, 66B] [15 rows/s, 505B/s]
Query 3 - select * from mysql.lowercasedb.tableone; Output - Query 20150818_065535_00007_837eu failed: Table mysql.lowercasedb.tableone does not exist
In this scenario presto is able to fetch the table names but when I am accessing the table the its giving me an error as shown above.
The Mysql output mysql> select * from lowercasedb.TableOne; +-----------+-----------+ | ColumnOne | ColumnTwo | +-----------+-----------+ | 1 | Row 1 | | 2 | Row 2 | +-----------+-----------+ 2 rows in set (0.00 sec)
Database Name - lowercasedb Table Names - tableone, tabletwo Query 1 - show schemas from mysql; Output - Schema
information_schema
lowercasedb
lowercasetabledb
performance_schema
testdb
(5 rows)
Query 20150818_070234_00008_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [5 rows, 98B] [30 rows/s, 597B/s]
Query 2 - show tables from mysql.lowercasetabledb; Output - Table
tableone tabletwo (2 rows)
Query 20150818_070253_00009_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [2 rows, 76B] [17 rows/s, 652B/s]
Query 3 - select * from mysql.lowercasetabledb.tableone;
Output -
columnone | columntwo
-----------+-----------
1 | Row 1
2 | Row 2
(2 rows)
Query 20150818_070319_00010_837eu, FINISHED, 1 node Splits: 2 total, 2 done (100.00%) 0:00 [2 rows, 0B] [8 rows/s, 0B/s]
In this scenario I am able to access the tables in the database.
@ashish6976 This is a bug in the mysql connector. Please open a new issue for the mysql connector.
Any progress on this?
I have the same issue as @ashish6976 but in my case the database is MongoDB. Some of the database names and collections are in title case per company standards. When Presto loads the catalog, it converts everything to lowercase and results in zero records being processed. The only thing that works is, when a collection name or database name is lowercase.
Are there any alternative to overcome this?
Any update about this ?
I've read the source a bit. There are places in the Presto code like this https://github.com/prestodb/presto/blob/master/presto-spi/src/main/java/com/facebook/presto/spi/ColumnMetadata.java#L49
I think we have to remove those toLowerCase().
Is there a fix for this? I'm not able to query upper case tables
Any update about the fix?
I am using 0.180 version of presto and issue is still present. We have all our mysql table names in UPPERCASE and any alternative or patches available..?
Hi there, Is the issue about tables in upper case in MySQL and a Presto connection fixed? :) Cheers
I'm testing out Presto with MySQL, running into this exact issue. Is there any will, at all, to fix this rather big issue? Clearly there's more than 1 or 2 people, and this would prevent us from using Presto as our tables are managed by Hibernate, which creates them. I have seen https://dev.mysql.com/doc/refman/5.7/en/identifier-case-sensitivity.html, which setting it to 2 might work. I'm having trouble getting my MySQL instance to get that set properly though, but I'll report back if I manage to and if that works.
So, sadly I can't get it set to '2', I suppose it's because I'm running on Linux, and so mysql really doesn't like it so it keeps reverting to 0. I can get it set to 1, which simply lowercases all table names and stores them on disk as lowercased, but that's not so useful for lots of existing tables.
For reference, here are the relevant parts from the SQL spec:
<delimited identifier> ::=
<double quote> <delimited identifier body> <double quote>
<delimited identifier body> ::= <delimited identifier part>...
<delimited identifier part> ::=
<nondoublequote character>
| <doublequote symbol>
<Unicode delimited identifier> ::=
U <ampersand> <double quote> <Unicode delimiter body> <double quote>
<Unicode escape specifier>
<Unicode escape specifier> ::=
[ UESCAPE <quote> <Unicode escape character> <quote> ]
<Unicode delimiter body> ::=
<Unicode identifier part>...
<Unicode identifier part> ::=
<delimited identifier part>
| <Unicode escape value>
24) For every <identifier body> IB there is exactly one corresponding case-normal form CNF. CNF is an <identifier body> derived from IB as follows:
Let n be the number of characters in IB. For i ranging from 1 (one) to n, the i-th character Mi of IB is transliterated into the corresponding character
or characters of CNF as follows:
Case:
a) If Mi is a lower case character or a title case character for which an equivalent upper case sequence U is de ned by Unicode, then let j be th
e number of characters in U; the next j characters of CNF are U.
b) Otherwise, the next character of CNF is Mi.
25) The case-normal form of the <identifier body> of a <regular identifier> is used for purposes such as and including determination of identifier
equivalence, representation in the Definition and Information Schemas, and representation in diagnostics areas.
...
27) Two <regular identifier>s are equivalent if the case-normal forms of their <identifier body>s, considered as the repetition of a <character string literal>
that specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation IDC that is sensitive to case, compare equally
according to the comparison rules in Subclause 8.2, “<comparison predicate>”.
28) A <regular identifier> and a <delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier> and the
<delimited identifier body> of the <delimited identifier> (with all occurrences of <quote> replaced by <quote symbol> and all occurrences of
<doublequote symbol> replaced by <double quote>), considered as the repetition of a <character string literal> that specifies a <character set specification>
of SQL_IDENTIFIER and IDC, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.
29) Two<delimited identifier>s are equivalent if their <delimited identifierbody>s,considered as the repetition of a <character string literal> that specifies
a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, “<comparison predicate>”.
30) Two <Unicode delimited identifier>s are equivalent if their <Unicode delimiter body>s, considered as the repetition of a <character string literal> that
specifies a <character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according
to the comparison rules in Subclause 8.2, “<comparison predicate>”.
31) A <Unicode delimited identifier> and a <delimited identifier> are equivalent if their <Unicode delimiter body> and <delimited identifier body>,
respectively, each considered as the repetition of a <character string literal> that specifies a <character set specification> of SQL_IDENTIFIER and
an implementation-defined collation that is sensitive to case, compare equally according to the comparison rules in Subclause 8.2, “<comparison predicate>”.
32) A <regular identifier> and a <Unicode delimited identifier> are equivalent if the case-normal form of the <identifier body> of the <regular identifier>
and the <Unicode delimiter body> of the <Unicode delimited identifier> considered as the repetition of a <character string literal>, each specifying a
<character set specification> of SQL_IDENTIFIER and an implementation-defined collation that is sensitive to case, compare equally according to the
comparison rules in Subclause 8.2, “<comparison predicate>”.
+1 Is there any progess on this? We are useing version 0.192 and the problem with Scenario 2 is still there.
I have an open PR https://github.com/prestodb/presto/pull/8674 that got stalled waiting for some replies from maintainers, and then I got busy with work and unable to continue it. I don't see any time in the foreseeable future for me to pick this back up.
This is still an issue currently. Is there any alternative available?
Any update ?
Any update ?
Unfortunately, no ;(
@hgschmie told me the other day he was looking at this
Any update on this? All the tables in my project are by default in upper case. Or can any pointers be given in the code as to what can be changed so that a custom build for the same can be made?
any update? hive metastore tables in mysql like DBS, PARTITIONS can't be queried.
presto> select * from mysql.metastore.DBS; Query 20180913_015448_00010_774as failed: line 1:15: Table mysql.metastore.dbs does not exist
Any updae on this issue? My collections in mongodb have names with upper case. Presto can't read these collections because it always change lower case.
I'm getting the same issue with Sqlserver connectors and tables defined in uppercase... ( only be able to query tables where names are defined in lower case.... :-( (presto 0.203)
I no longer am actively working on Presto
Any update on this? Thanks in advance.
Fixed at Release 0.225.
Match schema and table names case insensitively. This behavior can be enabled by setting the case-insensitive-name-matching
catalog configuration option to true.
I have a table that that only differ by case,for example ,columns ‘toPkgVErsName ' and ‘ toPkGVersName‘ . I can execute with 'select topkgversname from ......' correctly when i use hive.But the same sql in presto will be wrong (I use hue to connect with hive and presto). and the error message is :
Multiple entries with same key: [topkgversname] required binary topkgversname (STRING)=min: , max: , num_nulls: 0 and [topkgversname] required binary topkgversname (STRING)=min: , max: walleve-1.0.269.171227.eaddc8f, num_nulls: 0 at io.prestosql.jdbc.AbstractPrestoResultSet.resultsException(AbstractPrestoResultSet.java:1731) at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:216) at io.prestosql.jdbc.PrestoResultSet$ResultsPageIterator.computeNext(PrestoResultSet.java:176) at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141) at io.prestosql.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:136) at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811) at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:295) at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:207) at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:162) at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:301) at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681) at io.prestosql.jdbc.PrestoResultSet$AsyncIterator.lambda$new$0(PrestoResultSet.java:122) at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
Is there any setting that could fix this problem?
@yys12138 It seems you are using Trino formerly PrestoSQL. The right repository is https://github.com/trinodb/trino and you can ask questions in the community Slack. https://trino.io/slack.html