orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0
4.73k stars 871 forks source link

SELECT value behaviour #5992

Closed Severogor closed 3 years ago

Severogor commented 8 years ago

Hello, Is this SELECT behaviour correct?

orientdb {db=data}> select 'x'

----+------+----
#   |@CLASS|x   
----+------+----
0   |null  |x   
----+------+----

1 item(s) found. Query executed in 0.001 sec(s).

orientdb {db=data}> select '' 

Error: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: sql.select '' 

Error: java.lang.IllegalArgumentException: Field name is empty 

orientdb {db=data}> select ' ' 

Error: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: sql.select ' ' 

Error: java.lang.IllegalArgumentException: Name is empty 

orientdb {db=data}> select ' x ' 

----+------+---- 
#   |@CLASS| x  
----+------+---- 
0   |null  | x  
----+------+---- 

1 item(s) found. Query executed in 0.001 sec(s). 

I don't think so... PostgreSQL too.

test=# select '';
 ?column? 
----------

(1 row)

What's the problem with selecting an empty (though a space isn't an empty, actually) value?

smolinari commented 8 years ago

What use would there be for having a space as a field name?

Scott

luigidellaquila commented 8 years ago

Hi guys,

I don't like this behavior, I think I can change it and have a default fallback to "value" as field name in cases like this when you cannot infer an alias from the projection

WDYT?

Thanks

Luigi

Severogor commented 8 years ago

Hello,

I think this behaviour is irrational. If space is not usable as a field name, why space + non-space is supposed to be usable?

I wonder that if variable is supposed to have a value in "SELECT variable;"

Here is a simple example:

values = ['0', 'x', ' ', '']  # suppose that the array is a set of
arbitrary string values
for x in values:
    query = "select '" + escape(x) + "';"
    db.query(query)

This code successfully iterates through values[0] and values[1], but fails on values[2] and values[3], though the query is syntactically (and semantically) correct.

Could you tell me please, from what assumption and for what practical purpose OrientDB field name relies on (defaults to) its value and breaks straightforward logic therefore? For naming purposes we do ALWAYS have the AS ALIAS construct, don't we?

Best regards, Sergey

On Sat, Apr 16, 2016 at 9:09 PM, luigidellaquila notifications@github.com wrote:

Hi guys,

I don't like this behavior, I think I can change it and have a default fallback to "value" as field name in cases like this when you cannot infer an alias from the projection

WDYT?

Thanks

Luigi

— You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub https://github.com/orientechnologies/orientdb/issues/5992#issuecomment-210860349

smolinari commented 8 years ago

@luigidellaquila - You lost me to be honest. I would most definitely bow to your experience, wisdom and knowledge about databases and in particular ODB, however, I don't understand how defaulting to a possibly totally unknown value could be intuitive for the user.

I am also completely lost on how aliases are involved in all this. Are you considering just piping out a empty value, if the alias is there, but the field is a space or nothing? Like

SELECT field_1 AS Alias , ' ' AS Nothing FROM SomeClass

and as long as the rest of the query is fine, you would return "Nothing" as an empty value? How could that be useful in a programming scenario? Why would I want to waste network traffic on that (even if it is just a few bytes)?

@Severogor - just because a query may be syntactically correct, it doesn't mean it is logical. I still don't understand why you would expect to have field names with no value or a space as the value. I think ODB failing on something so illogical is perfectly fine. It is telling you, your request is invalid, because, in fact, it is. The Postgre result is also an error, just not as obvious as ODB's, right?

As for extra spaces in the projections, it seems also fine to me for ODB to trim them off. So, ODB correctly assumes "x" is the field name.

Sorry, maybe this is all totally above my competence level. I am just trying to understand the real issue at hand and why it is an issue at all.

Scott

luigidellaquila commented 8 years ago

Hi guys,

@Severogor objection makes sense. The only problem here is that white spaces are not supported in property names, also the following will fail

select "foo bar" as foobar

At this stage there is no practical reason for this at parsing level (there was before, but with the new parser these limits are solved), but in the executor it still has to be supported. I'll work on this in next releases, so you can consider this an enhancement.

Thanks

Luigi

smolinari commented 8 years ago

Ok. I'll bow out still not understanding the problem. Maybe when the solution is made available, I'll understand the problem.

Scott

luigidellaquila commented 8 years ago

Hi Scott,

I think the only real problem here is the exception that is being thrown.

Having a field with an alias that is an empty string (or only blank characters) is not a very common use case, but having a field name with blank spaces in the middle (eg "first name") is definitely more common, so I think it's worth supporting it ;-)

Thanks

Luigi

smolinari commented 8 years ago

Oh, I can understand wanting spaces in the field name (between words). If that is the problem, I'd agree it would be nice to support. I didn't understand wanting to support a blank or space as a field name alone. Still, spaces in field names sounds to me like something exotic to want to have.

I would assume, if there are to be spaces in the names, then those identifiers would have put inside double quotes?

SELECT "A field with spaces", "Another field with spaces" FROM someClass

???

Scott

nagarajasr commented 8 years ago

:+1: for implementing field names and aliases with an embedded space. but wouldn't it break other APIs (e.g., JDBC ?) to have only an empty value as field name or alias?

luigidellaquila commented 8 years ago

More or less, they should be back-tick quoted (like in MySQL ;-) )

SELECT `A field with spaces`, `Another field with spaces` FROM someClass
luigidellaquila commented 8 years ago

@nagarajasr

I don't think so, I as I said before, also MySQL supports back-tick quoted field names with spaces. For sure we have to add some small things to ODocument to accept field names with special characters, but I'm already working on it for other use cases

Thanks

Luigi

Severogor commented 8 years ago

Hi,

If we are to talk about enhancements - here are some another shortcomings from my point of development:

1) Inconvenient default in & out aliases.

orientdb {db=data}> SELECT out.field1, out.field2, out.fieldN FROM something;

We get the following inconvenient "nothing-speaking" aliases out, out2, out3 etc:

----+-------------+----+-------------------+--------
#   |@CLASS|out |out2              |out3 
----+-------------+----+-------------------+--------

Why can't they be the after-dot (instead of before-dot) field names - field1, field2, fieldN etc? Like this

----+-------------+----+-------------------+--------
#   |@CLASS|field1 |field2              |fieldN 
----+-------------+----+-------------------+--------

2) MANDATORY Semicolon at the end of the string. The problem is that - for readability and aesthetics purposes I have all queries newline-formatted, for example:

SELECT
  column,
  column as smth
 FROM
  something
 WHERE
  column = value;

It works in binary protocol, but not in console. So when I need just to debug it - I need to back-format it - its a pain!!! Why not to bring back mandatory semicolon as in traditional SQL? Or at least make it work like it does in binary mode.

Thanks. S.

luigidellaquila commented 8 years ago

Hi @Severogor

+1 for both. For the second one I'm half way to a solution, there is a chance that it will be available as an option in a late 2.2 hotfix (probably not the first GA)

Thanks

Luigi

rdelangh commented 7 years ago

hi @luigidellaquila , in v2.2.18, this seems still to be an issue:

orientdb {db=mobile}> select UNIQ, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5
+----+-----------------------------------------+----+----+
|#   |UNIQ                                     |i   |l   |
+----+-----------------------------------------+----+----+
|0   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |
|1   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |
|2   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |
|3   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |
|4   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |
+----+-----------------------------------------+----+----+

5 item(s) found. Query executed in 0.039 sec(s).

orientdb {db=mobile}> SELECT x, i, l, (l-i-1) FROM (select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5)
Error: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: sql.select x, i, l, (l-i-1) FROM (select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5)
        DB name="mobile"
        DB name="mobile"

Error: java.lang.IllegalArgumentException: Field name is empty

orientdb {db=mobile}> SELECT x, i, l, l-i-1 AS k FROM (select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5)
+----+-----------------------------------------+----+----+----+
|#   |x                                        |i   |l   |l2  |
+----+-----------------------------------------+----+----+----+
|0   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |41  |
|1   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |41  |
|2   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |41  |
|3   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |41  |
|4   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |41  |
+----+-----------------------------------------+----+----+----+

5 item(s) found. Query executed in 0.031 sec(s).

orientdb {db=mobile}> SELECT x, i, l, (l-i-1) AS k FROM (select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE  QC='mobi' AND DATEHOUR='2017030712' LIMIT 5)
+----+-----------------------------------------+----+----+-------------+
|#   |x                                        |i   |l   |k            |
+----+-----------------------------------------+----+----+-------------+
|0   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |sql.l - i - 1|
|1   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |sql.l - i - 1|
|2   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |sql.l - i - 1|
|3   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |sql.l - i - 1|
|4   |nat|internet.bmbpartner.be|213.181.61.104|3   |41  |sql.l - i - 1|
+----+-----------------------------------------+----+----+-------------+

5 item(s) found. Query executed in 0.03 sec(s).

orientdb {db=mobile}> SELECT x, i, i.type(), l, l.type(), l-i-1 FROM (select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5)
+----+-----------------------------------------+----+-------+----+-------+----+
|#   |x                                        |i   |i2     |l   |l2     |l23 |
+----+-----------------------------------------+----+-------+----+-------+----+
|0   |nat|internet.bmbpartner.be|213.181.61.104|3   |INTEGER|41  |INTEGER|41  |
|1   |nat|internet.bmbpartner.be|213.181.61.104|3   |INTEGER|41  |INTEGER|41  |
|2   |nat|internet.bmbpartner.be|213.181.61.104|3   |INTEGER|41  |INTEGER|41  |
|3   |nat|internet.bmbpartner.be|213.181.61.104|3   |INTEGER|41  |INTEGER|41  |
|4   |nat|internet.bmbpartner.be|213.181.61.104|3   |INTEGER|41  |INTEGER|41  |
+----+-----------------------------------------+----+-------+----+-------+----+

The methods "indexOf" and "lenght" are documented as returning integers (position and length), so what is the problem with returning their subtraction ? The result (see above: "sql.l - i - 1") is totally faulty, should be an integer result of that subtraction. The parentheses around "l-i-1" seem to trigger something totally weird.

But even when I leave out these parentheses, or assign that subtraction explicitly an alias (se above: "k"), then still the subtraction is not computed!

luigidellaquila commented 7 years ago

Hi @rdelangh

Maths operations are not supported in v 2.2.x, you have to use eval() function for that. I didn't try it, but I think you have to use a sub-query to make it work, eg.

SELECT *, eval(" l - i - 1 ") from (
   SELECT x, i, i.type(), l, l.type(),  
   FROM (
         select UNIQ AS x, UNIQ.indexOf('|') AS i, UNIQ.length() AS l 
         FROM results_hr_imsi_20170305 WHERE QC='mobi' AND DATEHOUR='2017030712' LIMIT 5
   )
)

Good news for the future is that maths operations are supported in v 3.0

Thanks

Luigi

rdelangh commented 7 years ago

@luigidellaquila ok, the "eval" method works for the arithmetic. How about the exception thrown if an output column cannot clearly be assigned a name ?

luigidellaquila commented 7 years ago

That should never happen for valid expressions, they always have valid names, but if you find a case where it happens please let me know, I just have to add a default alias. The problem with default aliases is that they can collide with schemaless document properties...

In v. 3.0 I managed this problem in a natural way, the default alias for an expression is just the expression itself as a string, eg. the default alias for l-i-1 is just `l - i - 1`

Thanks

Luigi