Closed: GoogleCodeExporter closed this issue 9 years ago
I would like to alter the proposed syntax slightly:-
create dataset <IDENTIFIER> ( <IDENTIFIER> )
partitioned by key <IDENTIFIER> (,<IDENTIFIER>)*
hints (<IDENTIFIER>=<STRING_LITERAL> (,<IDENTIFIER>=<STRING_LITERAL>)*);
so that we have:-
create dataset X(TypeY)
partitioned by key id
hints (TUPLE_SIZE="250", NUM_TUPLES="250000") ;
This supports hints with arbitrary values that do not necessarily comply
with the definition of an IDENTIFIER.
Original comment by RamanGro...@gmail.com
on 29 Jan 2013 at 4:28
Looks okay. Let's be careful though - I want to REGULATE these hints HIGHLY.
The only one I am okay with initially is expected cardinality. Period. Users
have no way of estimating bytes in tuples. And we don't even have tuples.
CARDINALITY should be the name of our only initial hint, IMO. We *MUST* avoid
physical hints - that is a very nasty/slippery slope. And hints are generally
evil and hard to use.
Original comment by dtab...@gmail.com
on 29 Jan 2013 at 8:44
Where is the size of tuples used? I know the Bloom filter doesn't require this
number. If the user doesn't provide an estimate for this size, what can the
engine do?
Original comment by che...@gmail.com
on 29 Jan 2013 at 8:51
Agreed with Mike. No TUPLE_SIZE and change NUM_TUPLES to CARDINALITY.
Original comment by vinay...@gmail.com
on 29 Jan 2013 at 8:51
The hint TUPLE_SIZE was originally proposed for helping in determining the
cardinality of a node group for a dataset (when used in conjunction with the
CARDINALITY_HINT).
We want to hide the concept of a node group from the end-user. Asterix needs to
form a node group and it cannot be the default (ALL_NODES) as it may not be
optimal. In the upcoming release, we may not address this issue and rely on the
default node group. However we do need to revisit this and have a better way of
forming a node group for a dataset.
TUPLE_SIZE does not play any role in the current Bloom filter issue.
Original comment by ram...@uci.edu
on 29 Jan 2013 at 10:23
Committed into private branch: asterix_stabilization_issue_251
ASTERIX allows the user to give additional information in the form of hints.
These hints come in handy when the system has to determine other parameters,
such as the size of the Bloom filter needed to hold the data.
To begin with, the only hint supported by Asterix is the 'CARDINALITY' hint.
CARDINALITY gives the expected number of tuples in the dataset.
An example create dataset statement that provides hints is given below:-
create dataset X(TypeY)
partitioned by key id
hints (CARDINALITY=2500);
Please note that hints are case-insensitive.
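To illustrate how a cardinality hint could feed into Bloom filter sizing, here is a minimal sketch using the standard textbook formulas. The thread does not say how Asterix actually sizes its filters, so the class name, method names, and the false-positive rate parameter are all assumptions made for illustration.

```java
// Hypothetical helper, not taken from the Asterix code base.
// Uses the standard Bloom filter sizing formulas:
//   bits   m = ceil(-n * ln(p) / (ln 2)^2)
//   hashes k = round((m / n) * ln 2), at least 1
// where n is the expected cardinality and p the target false-positive rate.
final class BloomFilterSizingSketch {

    static long numBits(long cardinality, double falsePositiveRate) {
        // This sketch only handles a positive cardinality.
        if (cardinality <= 0) {
            throw new IllegalArgumentException("cardinality must be positive");
        }
        if (!(falsePositiveRate > 0.0 && falsePositiveRate < 1.0)) {
            throw new IllegalArgumentException("false-positive rate must be in (0, 1)");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-cardinality * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    static int numHashes(long numBits, long cardinality) {
        if (cardinality <= 0) {
            throw new IllegalArgumentException("cardinality must be positive");
        }
        long k = Math.round(((double) numBits / cardinality) * Math.log(2.0));
        return (int) Math.max(1L, k);
    }

    public static void main(String[] args) {
        long n = 2500;       // value of the CARDINALITY hint from the example statement
        double p = 0.01;     // assumed target false-positive rate
        long m = numBits(n, p);
        System.out.println("bits=" + m + ", hashes=" + numHashes(m, n));
    }
}
```

With an assumed 1% false-positive rate, the filter needs a little under 10 bits per expected tuple, which is why an expected cardinality alone is enough to size it and no tuple size is required.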
Test Cases:-
Positive
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_1.aql
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_2.aql
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_3.aql
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_4.aql
Negative
asterix-app/src/test/resources/metadata/queries/exception/issue_251_dataset_hint_error_1.aql
asterix-app/src/test/resources/metadata/queries/exception/issue_251_dataset_hint_error_2.aql
Original comment by RamanGro...@gmail.com
on 30 Jan 2013 at 5:35
From the tests in the commit I see that the hint name is in lower case in some
tests, as in the following two.
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_1.aql
create dataset Book(LineType)
partitioned by key id
hints( cardinality = 2000);
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_2.aql
create dataset Book(LineType)
partitioned by key id
hints(cardinality=2000);
Whereas in this test the hint name is in upper case:
asterix-app/src/test/resources/metadata/queries/basic/issue_251_dataset_hint_3.aql
Which is the correct (supported) behavior: upper case or lower case?
Original comment by khfaraaz82
on 30 Jan 2013 at 7:08
What is the maximum value that a user can specify for cardinality in the
hint? And what is the expected behavior if the user gives a very high
cardinality, for example MAX_VALUE of the long type in Java? I assume a
negative cardinality will be handled appropriately.
Original comment by khfaraaz82
on 30 Jan 2013 at 7:13
As stated in the check-in log, the hint name is case-insensitive.
For the "cardinality" hint, the value is an integer in the allowed range of
0 to INTEGER.MAX.
A negative value is not permitted.
The syntax for providing hints supports a comma-separated set of key-value
pairs, with the key being the hint name and the value being the actual value.
The value can also be a string, in which case it must be surrounded by double
quotes.
To begin with, we support only one kind of hint (cardinality). If any other
hint is given, an appropriate error message is provided.
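The rules above can be sketched as a small validator. This is only an illustration of the described behavior, not the code that was committed: the class and method names are hypothetical, the error messages are invented, and "INTEGER.MAX" is assumed to mean Java's `Integer.MAX_VALUE`.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the hint rules described in this comment;
// none of these names come from the Asterix code base.
final class DatasetHintValidatorSketch {

    static final String CARDINALITY = "CARDINALITY";

    // Takes hint name/value pairs as written in the statement and returns
    // them with upper-cased names, or throws IllegalArgumentException.
    static Map<String, Integer> validate(Map<String, String> rawHints) {
        Map<String, Integer> validated = new LinkedHashMap<>();
        for (Map.Entry<String, String> hint : rawHints.entrySet()) {
            // Hint names are case-insensitive, so normalise before comparing.
            String name = hint.getKey().toUpperCase(Locale.ROOT);
            if (!CARDINALITY.equals(name)) {
                throw new IllegalArgumentException("Unsupported hint: " + hint.getKey());
            }
            long value;
            try {
                // Parse as long first so that values just above the int range
                // are reported as out of range rather than as malformed.
                value = Long.parseLong(hint.getValue().trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "Hint " + name + " requires an integer value: " + hint.getValue(), e);
            }
            // Assumption: the allowed range is 0 to Integer.MAX_VALUE.
            if (value < 0 || value > Integer.MAX_VALUE) {
                throw new IllegalArgumentException(
                        "Hint " + name + " is out of range: " + hint.getValue());
            }
            validated.put(name, (int) value);
        }
        return validated;
    }

    public static void main(String[] args) {
        Map<String, String> hints = new LinkedHashMap<>();
        hints.put("cardinality", "2000");
        System.out.println(validate(hints));
    }
}
```

Under these assumptions, `cardinality=2000` and `CARDINALITY=2000` are treated the same, while a negative value, a value such as the MAX_VALUE of Java's long type, or a hint such as TUPLE_SIZE is rejected with an error.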
Original comment by RamanGro...@gmail.com
on 30 Jan 2013 at 7:59
Khurram, the use of different case (upper and lower) in the test cases is
intentional.
Original comment by RamanGro...@gmail.com
on 30 Jan 2013 at 8:02
Released for code review by Sattam.
Original comment by RamanGro...@gmail.com
on 30 Jan 2013 at 7:02
Fixed in asterix_stabilization: r1175
Original comment by ram...@uci.edu
on 11 Feb 2013 at 3:40
Original comment by RamanGro...@gmail.com
on 11 Feb 2013 at 3:40
Original issue reported on code.google.com by
RamanGro...@gmail.com
on 29 Jan 2013 at 3:33