Automatically exported from code.google.com/p/asterixdb

Query metadata broken on asterix_stabilization #192

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

NOTE: This issue is seen only when run on a cluster (the ASTERIX cluster).
Running on a local machine (laptop), whether with one NC or with ten NCs, does
not reproduce the problem.

The problem was not exposed by existing tests because they query the metadata
within a single JVM; we do not have tests that query the metadata on a cluster
of nodes. This is a TODO, and such tests will be added to the framework to
cover this issue.

1. Start the CC on asterix-master.
2. Start ten NCs, one each on asterix01 through asterix10.
3. On asterix-master, connect to asterix-master using hyrackscli.
4. Create the ASTERIX application.
5. From the Web UI, query the metadata, for example (a shell sketch of this repro follows the query below):

use dataverse Metadata;

for $l in dataset('Dataset')
return $l
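
For convenience, here is a minimal shell sketch of the repro loop from step 5. It assumes a hypothetical run_aql.sh helper that submits an AQL file to the running instance and prints the result to stdout; in the steps above the query is actually submitted through the Web UI.

#!/bin/sh
# Sketch only: run_aql.sh is a hypothetical helper that submits an AQL file
# to the ASTERIX instance and writes the query result to stdout.
cat > /tmp/query_metadata.aql <<'EOF'
use dataverse Metadata;

for $l in dataset('Dataset')
return $l
EOF

./run_aql.sh /tmp/query_metadata.aql > /tmp/run1.out   # first attempt: correct results
./run_aql.sh /tmp/query_metadata.aql > /tmp/run2.out   # second attempt: incorrect results

diff /tmp/run1.out /tmp/run2.out || echo "results differ -- bug reproduced"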

On the first attempt the system returns correct results; however, every
subsequent attempt to query the metadata fails and returns incorrect results.

Here are the results from the Web UI for the second attempt to query the metadata.

use dataverse Metadata;

for $l in dataset('Dataset')
return $l

Duration of all jobs: 0.0

Logical plan:

write [%0->$$0] -- |UNPARTITIONED|
  project ([$$0]) -- |UNPARTITIONED|
    unnest $$0 <- function-call: asterix:dataset, Args:[AString: {Dataset}] -- |UNPARTITIONED|
      empty-tuple-source -- |UNPARTITIONED|

Optimized logical plan:

write [%0->$$0]
-- SINK_WRITE  |PARTITIONED|
  exchange 
  -- RANDOM_MERGE_EXCHANGE  |PARTITIONED|
    project ([$$0])
    -- STREAM_PROJECT  |PARTITIONED|
      exchange 
      -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
        data-scan []<-[$$2, $$3, $$0] <- Metadata:Dataset
        -- DATASOURCE_SCAN  |PARTITIONED|
          exchange 
          -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
            empty-tuple-source
            -- EMPTY_TUPLE_SOURCE  |PARTITIONED|

Hyracks job:

{
 "connectors": [
  {
   "connector": {
    "display-name": "edu.uci.ics.hyracks.dataflow.std.connectors.OneToOneConnectorDescriptor[CDID:0]",
    "id": "CDID:0",
    "java-class": "edu.uci.ics.hyracks.dataflow.std.connectors.OneToOneConnectorDescriptor"
   },
   "in-operator-id": "ODID:2",
   "in-operator-port": 0,
   "out-operator-id": "ODID:0",
   "out-operator-port": 0
  },
  {
   "connector": {
    "display-name": "edu.uci.ics.hyracks.dataflow.std.connectors.OneToOneConnectorDescriptor[CDID:1]",
    "id": "CDID:1",
    "java-class": "edu.uci.ics.hyracks.dataflow.std.connectors.OneToOneConnectorDescriptor"
   },
   "in-operator-id": "ODID:0",
   "in-operator-port": 0,
   "out-operator-id": "ODID:3",
   "out-operator-port": 0
  },
  {
   "connector": {
    "display-name": "edu.uci.ics.hyracks.dataflow.std.connectors.MToNReplicatingConnectorDescriptor[CDID:2]",
    "id": "CDID:2",
    "java-class": "edu.uci.ics.hyracks.dataflow.std.connectors.MToNReplicatingConnectorDescriptor"
   },
   "in-operator-id": "ODID:3",
   "in-operator-port": 0,
   "out-operator-id": "ODID:1",
   "out-operator-port": 0
  }
 ],
 "operators": [
  {
   "display-name": "edu.uci.ics.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor[ODID:0]",
   "id": "ODID:0",
   "in-arity": 1,
   "java-class": "edu.uci.ics.hyracks.storage.am.btree.dataflow.BTreeSearchOperatorDescriptor",
   "out-arity": 1
  },
  {
   "display-name": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor[ODID:1]",
   "id": "ODID:1",
   "in-arity": 1,
   "java-class": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor",
   "micro-operators": ["sink-write [0] outputFile"],
   "out-arity": 0
  },
  {
   "display-name": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor[ODID:2]",
   "id": "ODID:2",
   "in-arity": 0,
   "java-class": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor",
   "micro-operators": ["ets"],
   "out-arity": 1
  },
  {
   "display-name": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor[ODID:3]",
   "id": "ODID:3",
   "in-arity": 1,
   "java-class": "edu.uci.ics.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor",
   "micro-operators": ["stream-project [2]"],
   "out-arity": 1
  }
 ]
}
[PARTITION_LOCATION(ODID:1, 0) in CONSTANT[nc1:java.lang.String], 
PARTITION_COUNT(ODID:1) in CONSTANT[1:java.lang.Integer], 
PARTITION_COUNT(ODID:0) in CONSTANT[1:java.lang.Integer], 
PARTITION_LOCATION(ODID:0, 0) in CONSTANT[nc1:java.lang.String], 
PARTITION_COUNT(ODID:3) in CONSTANT[1:java.lang.Integer], 
PARTITION_LOCATION(ODID:3, 0) in CONSTANT[nc1:java.lang.String], 
PARTITION_COUNT(ODID:2) in CONSTANT[1:java.lang.Integer], 
PARTITION_LOCATION(ODID:2, 0) in CONSTANT[nc1:java.lang.String]]

Duration: 0.054

Result:

nc1:/tmp/asterix_output/OUTPUT_2
Duration: 0.074

{ "NodeName": "nc1", "NumberOfCores": 0, "WorkingMemorySize": 0 }

What is the expected output? What do you see instead?

Expected: the query returns the Dataset records of the Metadata dataverse. Instead, after the first attempt, a stale result (the single Node record shown above) is returned.

Please use labels and text to provide additional information.

hyracks_asterix_stabilization revision 1869
asterix_stabilization revision 691

Test platform details:

kfmohamm@asterix-master:~/khurram/google-code/asterix_stabilization$ uname -a
Linux asterix-master 2.6.31-19-generic-pae #56-Ubuntu SMP Thu Jan 28 02:29:51 
UTC 2010 i686 GNU/Linux

Original issue reported on code.google.com by khfaraaz82 on 30 Aug 2012 at 8:46

GoogleCodeExporter commented 8 years ago
Pouria has fixed this; the change needs code review/discussion and a merge into
the asterix_stabilization branch.

Original comment by khfaraaz82 on 21 Oct 2012 at 5:11

GoogleCodeExporter commented 8 years ago
I have a fix for this one, which has been checked (by Alex) and is ready to be
checked in. But there is a question here:

This issue is fixed as a side effect of changes that were made in the VLDB-demo
branch, so once (if) that branch gets merged, this issue will also be gone. The
vldb-demo changes are more general, and my fix is a *subset* of them (my fix
will be removed during that merge, as vldb-demo changes the assumption(s) about
the metadata node). So the question is: since this issue does not block
everyone, should we apply my quick fix now, or should we wait for the vldb-demo
merge into stabilization?

Original comment by pouria.p...@gmail.com on 23 Oct 2012 at 5:55

GoogleCodeExporter commented 8 years ago

Original comment by khfaraaz82 on 1 Nov 2012 at 6:26

GoogleCodeExporter commented 8 years ago
This issue is fixed in the VLDB demo branch.
Once that branch gets merged into stabilization, this issue will be gone as well.

Original comment by pouria.p...@gmail.com on 19 Nov 2012 at 7:29

GoogleCodeExporter commented 8 years ago

Original comment by vinay...@gmail.com on 7 Dec 2012 at 8:25

GoogleCodeExporter commented 8 years ago
This bug report might be misleading.
Here is what most likely happened:

1) The OutputDir flag in ASTERIX's test.properties is the output path for depositing query results.
   On a cluster this path should be on NFS, so that results can be read/shown at the node running the CC even though they are produced by any of the NCs.

2) An incorrect value for OutputDir was used, one that is not on NFS.
   This is evident from the output:

   Result:
   nc1:/tmp/asterix_output/OUTPUT_2   <=== look here!
   Duration: 0.074

3) The results are therefore on the local file system of nc1.

4) When the CC tries to look up the results, it inadvertently reads a previously generated output file at the path /tmp/asterix_output/OUTPUT_2 (which it takes to be on its own local file system).
   In the original bug report, this file must have been produced by the query

   for $x in dataset('Metadata.Node')
   return $x

   that was run at some earlier point in time.

This is the result of an incorrect ASTERIX configuration. Ideally, the output
directory should be cleared when ASTERIX starts up (that deserves to be logged
as a bug!), but the reported behavior would manifest independently of that.
With Madhusudhan's upcoming changes for result management, writing to NFS will
no longer be required.
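
A quick way to check for this misconfiguration is to read OutputDir from test.properties and confirm that it resolves to the same shared (NFS) mount on the CC and on every NC. Below is a minimal shell sketch; the property format (OutputDir=<path>), password-less ssh to the hosts, and the host names taken from the repro steps are assumptions, not part of the original report.

#!/bin/sh
# Sketch only: verify that OutputDir points to a shared (NFS) path that is
# visible under the same name on the CC and on all NCs.
OUTPUT_DIR=$(grep '^OutputDir' test.properties | cut -d= -f2)
echo "OutputDir = $OUTPUT_DIR"

# A node-local path such as /tmp/asterix_output produces the stale-read
# behavior described above; the filesystem type reported below should be nfs.
for host in asterix-master asterix01 asterix02; do   # extend to all ten NCs
    echo "== $host =="
    ssh "$host" "df -T '$OUTPUT_DIR' 2>/dev/null | tail -1"
done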

I propose marking this bug as 'invalid'.
If you believe this is not the case, please correct me and update the bug
accordingly.

Original comment by RamanGro...@gmail.com on 29 Jan 2013 at 8:35