lwhay / asterixdb

Automatically exported from code.google.com/p/asterixdb

AdmLexerException returned when loading data ending with a backslash character from ADM file #744

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Reproduction:
==============
Load a string ending with a backslash from an ADM file.
Example: {"user" : "Li`mSnMAwkkWH\DYq[dlqAf\"}

Expectation:
============
Data should load successfully - instead you get an error:
Error while parsing.  Line: 25 Row: 87 Expecting: <CIRCLE_CONS> [AdmLexerException]

Note that when you evaluate just the following expression:
{"user":"toto\"}
the result is:
{ "user": "toto\\" } // <== why the double backslash?

Stack trace from the cc.log:
============================
SEVERE: Job failed on account of:
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: exception during 
reading from external data source

edu.uci.ics.hyracks.api.exceptions.HyracksException: Job failed on account of:
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: exception during 
reading from external data source

    at edu.uci.ics.hyracks.control.cc.job.JobRun.waitForCompletion(JobRun.java:207)
    at edu.uci.ics.hyracks.control.cc.work.WaitForJobCompletionWork$1.run(WaitForJobCompletionWork.java:44)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: 
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: exception during 
reading from external data source
    at edu.uci.ics.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:41)
    at edu.uci.ics.hyracks.control.nc.Task.run(Task.java:291)
    ... 3 more
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: exception 
during reading from external data source
    at edu.uci.ics.asterix.metadata.feeds.ExternalDataScanOperatorDescriptor$1.initialize(ExternalDataScanOperatorDescriptor.java:57)
    at edu.uci.ics.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:81)
    at edu.uci.ics.hyracks.control.nc.Task.run(Task.java:234)
    ... 3 more
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: 
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: 
edu.uci.ics.asterix.common.exceptions.AsterixException: 
edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException: Error while 
parsing.  Line: 25 Row: 87 Expecting: <CIRCLE_CONS>
    at edu.uci.ics.asterix.runtime.operators.file.AbstractTupleParser.parse(AbstractTupleParser.java:81)
    at edu.uci.ics.asterix.external.dataset.adapter.FileSystemBasedAdapter.start(FileSystemBasedAdapter.java:50)
    at edu.uci.ics.asterix.metadata.feeds.ExternalDataScanOperatorDescriptor$1.initialize(ExternalDataScanOperatorDescriptor.java:55)
    ... 5 more
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: 
edu.uci.ics.asterix.common.exceptions.AsterixException: 
edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException: Error while 
parsing.  Line: 25 Row: 87 Expecting: <CIRCLE_CONS>
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parse(ADMDataParser.java:82)
    at edu.uci.ics.asterix.runtime.operators.file.AbstractTupleParser.parse(AbstractTupleParser.java:68)
    ... 7 more
Caused by: edu.uci.ics.asterix.common.exceptions.AsterixException: 
edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException: Error while 
parsing.  Line: 25 Row: 87 Expecting: <CIRCLE_CONS>
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.nextToken(ADMDataParser.java:675)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parseRecord(ADMDataParser.java:463)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.admFromLexerStream(ADMDataParser.java:346)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parseUnorderedList(ADMDataParser.java:661)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.admFromLexerStream(ADMDataParser.java:329)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parseRecord(ADMDataParser.java:512)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.admFromLexerStream(ADMDataParser.java:346)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parseAdmInstance(ADMDataParser.java:108)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.parse(ADMDataParser.java:80)
    ... 8 more
Caused by: edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException: 
Error while parsing.  Line: 25 Row: 87 Expecting: <CIRCLE_CONS>
    at edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexer.parseError(AdmLexer.java:941)
    at edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexer.next(AdmLexer.java:486)
    at edu.uci.ics.asterix.runtime.operators.file.ADMDataParser.nextToken(ADMDataParser.java:673)
    ... 16 more

edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException

Original issue reported on code.google.com by ker...@gmail.com on 26 Mar 2014 at 3:01

GoogleCodeExporter commented 9 years ago
Hmm, in JSON (http://www.json.org/) '\"' is an escaped quote. I'm not sure 
right now if that's true for ADM as well, but I think that it probably should 
be. So I think that this should be an error.
What is the origin of the file? 
Is it supposed to be JSON?
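
To illustrate the JSON behaviour referred to above, here is a minimal sketch that uses Python's standard json module as a stand-in for the JSON grammar. This is an assumption for illustration only: the ADM lexer is a separate implementation and may behave differently. The helper name try_parse is hypothetical.

```python
import json

# The record from the report, exactly as it appears in the file:
# the value ends with a backslash immediately before the closing quote.
raw = r'{"user" : "Li`mSnMAwkkWH\DYq[dlqAf\"}'


def try_parse(text):
    """Hypothetical helper: return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# A strict JSON parser rejects the unescaped form: \D is not a valid escape,
# and the trailing \" is read as an escaped quote, so the string never ends.
rejected = try_parse(raw)

# The same record parses once every backslash is doubled.
escaped = raw.replace("\\", "\\\\")
accepted = try_parse(escaped)

# Serializing a value that ends with one backslash doubles it on output,
# which matches the "toto\\" shown in the query result.
round_trip = json.dumps({"user": "toto\\"})
```

Under JSON rules the doubled backslash in the output is therefore the escaped form of a single backslash, not extra data.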

Original comment by westm...@gmail.com on 29 Mar 2014 at 12:25

GoogleCodeExporter commented 9 years ago
Following an offline discussion, Till and I decided to split this issue into 
separate ADM and AQL bugs. Additionally, the generator producing the ADM should 
escape all special characters from the char-production in the JSON grammar: 
\"
\\
\/
\b
\f
\n
\r
\t
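
A minimal sketch of what such escaping could look like in a generator, assuming string values are written one character at a time. The names JSON_ESCAPES, escape_string and emit_record are hypothetical and are not taken from the actual data generator; Python's json module is used only to check that the output reads back correctly.

```python
import json

# The escapes listed above, keyed by the raw character they replace.
JSON_ESCAPES = {
    '"': '\\"',
    '\\': '\\\\',
    '/': '\\/',
    '\b': '\\b',
    '\f': '\\f',
    '\n': '\\n',
    '\r': '\\r',
    '\t': '\\t',
}


def escape_string(value):
    """Hypothetical helper: escape one string value character by character."""
    return "".join(JSON_ESCAPES.get(ch, ch) for ch in value)


def emit_record(user):
    """Hypothetical helper: write a one-field record with an escaped value."""
    return '{"user" : "' + escape_string(user) + '"}'


# The value from the report, which contains two backslashes (one trailing).
original = "Li`mSnMAwkkWH\\DYq[dlqAf\\"
line = emit_record(original)

# Reading the emitted line back recovers the original value.
recovered = json.loads(line)["user"]
```

This sketch covers only the eight escapes in the list; other control characters would need the \uXXXX form from the same grammar.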

Original comment by ker...@gmail.com on 29 Mar 2014 at 1:56

GoogleCodeExporter commented 9 years ago
The problem of escaping in AQL is covered by issue 748.

Original comment by westm...@gmail.com on 29 Mar 2014 at 3:47

GoogleCodeExporter commented 9 years ago
This issue covers the fixing of Keren's data generator.

The error in the AQL parser is covered by issue 748.
The extension of the ADM parser to support all JSON escapes is covered by issue 
752.
The extension of the AQL parser to support all JSON escapes is covered by issue 
753.

Original comment by westm...@gmail.com on 4 Apr 2014 at 7:10

GoogleCodeExporter commented 9 years ago
The parsing of a backslash in ADM is covered by issue 754.

Original comment by ker...@gmail.com on 4 Apr 2014 at 9:01

GoogleCodeExporter commented 9 years ago
The data generator is fixed.

Original comment by ker...@gmail.com on 4 Apr 2014 at 9:11