br1ghtyang / asterixdb

Automatically exported from code.google.com/p/asterixdb

Order by on undefined field in closed type internal dataset does not throw Exception #444

GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

Start Asterix using managix on NC/CC and run the following statements from the 
Web UI.

drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type TestType as closed {
id:int32
}

create dataset t1(TestType) primary key id;

insert into dataset t1
for $l in range(1,100) return {"id":$l};

for $l in dataset t1
order by $l.name
return $l

The above query returns all results in order. However, please note that the 
type of dataset t1 is defined as CLOSED, so we should not allow any 
undefined/additional open fields to be used in the query.

$l.name is undefined in internal dataset t1, so the system should catch this 
and throw an exception. Currently it does not; instead, the query returns 
results.
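
To make the requested behavior concrete, here is a minimal sketch in plain 
Python. It is not AsterixDB code and does not describe how the system works 
today: ClosedType, UndefinedFieldError, and order_by are hypothetical names 
invented for illustration, and the check is only one possible way to reject a 
field that a closed type does not declare.

```python
# Hypothetical model of the behavior requested in this issue.
# None of these names exist in AsterixDB; they are assumptions for illustration.

class UndefinedFieldError(Exception):
    """Raised when a query mentions a field that a closed type does not declare."""


class ClosedType:
    def __init__(self, name, fields):
        self.name = name
        self.fields = frozenset(fields)

    def check_access(self, field_name):
        # A closed type allows only its declared fields.
        if field_name not in self.fields:
            raise UndefinedFieldError(
                f"field '{field_name}' is not defined in closed type {self.name}"
            )


def order_by(records, record_type, field_name):
    # Validate the field before sorting, so a bad field fails up front.
    record_type.check_access(field_name)
    return sorted(records, key=lambda r: r[field_name])


test_type = ClosedType("TestType", ["id"])
records = [{"id": i} for i in (3, 1, 2)]

by_id = order_by(records, test_type, "id")  # allowed: "id" is declared

try:
    order_by(records, test_type, "name")    # rejected: "name" is not declared
    rejected = False
except UndefinedFieldError:
    rejected = True
```

Under this model, ordering by id succeeds and ordering by name is refused 
before any data is read, which is the outcome this report asks for.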

Original issue reported on code.google.com by khfaraaz82 on 14 May 2013 at 7:12

GoogleCodeExporter commented 8 years ago
Huh - this is actually kind of a fascinating issue!  As in, it seems to me it 
is working as designed.  But - it leads to the question - are we happy with 
what we have designed?  :-)

Our design says that missing information should be treated like null values.  
If you do the following query, the result is the empty set:

for $l in dataset t1
where not (is-null($l.name))
order by $l.name
return $l

What the query in the issue is doing is treating $l.name as being a 
legitimately evaluable expression that yields null - and then when you do an 
order-by, it sorts all the nulls together - everything is null in this case - 
so we are seeing all the data because it's all in the large group of instances 
with null $l.name values.
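
Here is a minimal sketch of that reasoning in plain Python, not AsterixDB 
code. The helper names field and sort_key, the sample records, and the use of 
None as a stand-in for null are all assumptions made for illustration; placing 
nulls first in the sort is likewise just a choice for the sketch.

```python
# Illustrative model only: None stands in for null, and a missing field
# is read as None. This is an assumption, not AsterixDB's implementation.

def field(record, name):
    """Return the field value, or None if the record has no such field."""
    return record.get(name)


def sort_key(record, name):
    # Put nulls in one group ahead of real values. The tuple avoids
    # comparing None with None, which Python 3 does not allow.
    value = field(record, name)
    return (value is not None, value if value is not None else "")


records = [{"id": i} for i in range(1, 6)]  # records with only an id field

# Like "order by $l.name": every key is null, so nothing is dropped.
ordered = sorted(records, key=lambda r: sort_key(r, "name"))

# Like "where not (is-null($l.name))": no record passes the filter.
non_null = [r for r in records if field(r, "name") is not None]
```

In this model the sort keeps every record because they all share the same 
null key, while the filtered variant keeps none.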

I think this is in some sense "right" as is - I'm not sure we want to declare 
it as illegal to mention a field that is known not to exist - which is the only 
way that we could declare the current behavior as wrong.  @Vinayak, @Till, any 
thoughts based on XQuery experiences?

Interesting!!!

Original comment by dtab...@gmail.com on 15 May 2013 at 12:32

GoogleCodeExporter commented 8 years ago
In the query used in this issue, shouldn't $l.name be disallowed, since 
internal dataset t1 is defined to be of a closed type?

Original comment by khfaraaz82 on 15 May 2013 at 12:41

GoogleCodeExporter commented 8 years ago
@Vinayak, @Till, what's your opinion?
Can we close this issue, or do we need to fix it?

Original comment by buyingyi@gmail.com on 22 May 2013 at 7:46

GoogleCodeExporter commented 8 years ago

Original comment by buyingyi@gmail.com on 22 May 2013 at 11:32

GoogleCodeExporter commented 8 years ago
This version of the query shows why the result is indeed "as designed":

drop dataverse test if exists;
create dataverse test;
use dataverse test;

create type TestType as closed {
id:int32
}

create dataset t1(TestType) primary key id;

insert into dataset t1
for $l in range(1,100) return {"id":$l};

for $l in dataset t1
order by $l.name
return {"name": $l.name, "l-itself": $l};

Original comment by dtab...@gmail.com on 23 May 2013 at 1:21

GoogleCodeExporter commented 8 years ago
Yeah, agree!

Original comment by buyingyi@gmail.com on 23 May 2013 at 2:40

GoogleCodeExporter commented 8 years ago
I also agree. The XQuery equivalent would be a path step on a validated element 
for a child element that doesn't exist and where the child also is not allowed 
by the schema. And that'd result in the empty sequence, so that's the 
equivalent to null for us, right?

Original comment by westm...@gmail.com on 23 May 2013 at 5:42