weso / CWR-DataApi

MIT License

Getting Parse Exception while reading CWR file #175

Open avanishp2 opened 7 years ago

avanishp2 commented 7 years ago

Hi,

I am reading a CWR file with this Data-api library and getting the following exception:

File "C:\Python34\lib\site-packages\pyparsing.py", line 2794, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected sd_type (at char 114), (line:2, col:27)

I have tested on Python 2.7, Python 3.4 and Python 3.6 and get the same exception on each. Any help would be appreciated.

Here's the full stack trace of the program.

D:\python_web_crawler>python cwr-convertor.py
File to JSON test
Please enter the full path to a CWR file (e.g. c:/documents/file.cwr): D:/MusicWorksDB/CW160035UN_DIG.V21
Please enter the full path to the file where the results will be stored: D:/MusicWorksDB

Reading file D:/MusicWorksDB/CW160035UN_DIG.V21
Storing output on D:/MusicWorksDB

Traceback (most recent call last):
  File "cwr-convertor.py", line 24, in <module>
    data = decoder.decode(data)
  File "C:\Python34\lib\site-packages\cwr\parser\decoder\file.py", line 305, in decode
    transmission = self._file_decoder.decode(data['contents'])[0]
  File "C:\Python34\lib\site-packages\cwr\parser\decoder\common.py", line 90, in decode
    return self._grammar.parseString(text)
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1632, in parseString
    raise exc
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1622, in parseString
    loc, tokens = self._parse( instring, 0 )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3378, in parseImpl
    loc, resultlist = self.exprs[0]._parse( instring, loc, doActions, callPreParse=False )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3395, in parseImpl
    loc, exprtokens = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1379, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3545, in parseImpl
    raise maxException
  File "C:\Python34\lib\site-packages\pyparsing.py", line 3530, in parseImpl
    ret = e._parse( instring, loc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 1383, in _parseNoCache
    loc,tokens = self.parseImpl( instring, preloc, doActions )
  File "C:\Python34\lib\site-packages\pyparsing.py", line 2794, in parseImpl
    raise ParseException(instring, loc, self.errmsg, self)
pyparsing.ParseException: Expected sd_type (at char 114), (line:2, col:27)
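
The exception message already points at a position in the file (line:2, col:27). As a rough aid for looking at that spot, here is a minimal sketch that prints the offending line with a caret under the reported column. The function name show_error_position is hypothetical and not part of CWR-DataApi or pyparsing; it only assumes that both the line and the column are 1-based, as in the message above.

```python
def show_error_position(text, line, col):
    """Return the given 1-based line of `text` with a caret under column `col`."""
    lines = text.splitlines()
    if not 1 <= line <= len(lines):
        raise ValueError("line %d is outside the file (1..%d)" % (line, len(lines)))
    target = lines[line - 1]
    # The caret may land past the end of the line, which is itself a hint
    # that the parser expected more characters than the row contains.
    caret = " " * (col - 1) + "^"
    return target + "\n" + caret


if __name__ == "__main__":
    # Made-up two-line sample, not taken from the reporter's file.
    sample = "HDR\n" + "GRH" + "NWR" + "00001" + "02.10" + "0000000000"
    print(show_error_position(sample, 2, 27))
```

In the made-up sample the second row is only 26 characters long, so the caret falls just past its end.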

Bernardo-MG commented 7 years ago

Thanks for reporting it. I'll take a look as soon as I can, which may take a few days as this is a side project for me.

Bernardo-MG commented 7 years ago

Sorry I couldn't take a look sooner.

From what I can gather, a line in the file is missing the SD Type, which is composed of two alphanumeric characters at the end of a group header (a GRH row).

Could you please verify that?

The parser is very strict, so something like that can break the parsing.
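
One way to verify this without the parser is to scan the file for GRH rows and look at the two characters where the SD Type would sit. The sketch below is only an illustration: the helper name and constants are hypothetical, and the column offset is an assumption based on the usual CWR 2.1 group header layout (record type 3, transaction type 3, group ID 5, version 5, batch request 10, then SD Type at columns 27-28), which matches the col:27 in the error message.

```python
GRH_PREFIX = "GRH"
SD_TYPE_START = 26   # 0-based index of column 27 (assumed layout, see above)
SD_TYPE_LENGTH = 2


def find_grh_rows_missing_sd_type(text):
    """Return the 1-based line numbers of GRH rows without a full SD Type."""
    missing = []
    for number, row in enumerate(text.splitlines(), start=1):
        if not row.startswith(GRH_PREFIX):
            continue
        sd_type = row[SD_TYPE_START:SD_TYPE_START + SD_TYPE_LENGTH]
        # A short row or a blank-padded field both count as "missing".
        if len(sd_type.strip()) < SD_TYPE_LENGTH:
            missing.append(number)
    return missing


if __name__ == "__main__":
    # Made-up rows for illustration only.
    base = "GRH" + "NWR" + "00001" + "02.10" + "0000000000"
    sample = "\n".join(["HDR", base, base + "AB"])
    print(find_grh_rows_missing_sd_type(sample))
```

If the returned list is non-empty, those rows are the ones a strict grammar would likely reject.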

avanishp2 commented 7 years ago

I have tried different CWR files as input, including the one you have included in the tests/example folder, and I still get the same error. There is a note in the CWR functional document that states "Submission / Distribution Type is used only in the case of audio-visual transactions. This field will be ignored for CWR transactions".

SD_Type is a non-mandatory field according to the CWR functional document.

Bernardo-MG commented 7 years ago

I've set the SD type as optional and uploaded a new version to PyPI with the latest changes. Could you try it now?
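
To illustrate what "optional" means for this field, here is a small standalone sketch using the standard re module. It is not the library's pyparsing grammar and does not reflect the actual change; the pattern, the field names and the field widths are assumptions based on the usual CWR 2.1 GRH layout. The only point is that the trailing two-character SD Type group may be absent or blank without the row being rejected.

```python
import re

# Assumed GRH layout: GRH, transaction type (3), group ID (5),
# version (5), batch request (10), optional SD Type (2).
GRH_ROW = re.compile(
    r"^GRH"
    r"(?P<transaction_type>.{3})"
    r"(?P<group_id>\d{5})"
    r"(?P<version>.{5})"
    r"(?P<batch_request>.{10})"
    r"(?P<sd_type>[A-Za-z0-9]{2})?"   # optional: may be absent
    r"\s*$"                           # tolerate blank padding at the end
)


def parse_grh(row):
    """Parse a GRH row into a dict; sd_type is None when it is not present."""
    match = GRH_ROW.match(row)
    if match is None:
        raise ValueError("not a valid GRH row: %r" % row)
    return match.groupdict()


if __name__ == "__main__":
    # Made-up rows for illustration only.
    base = "GRH" + "NWR" + "00001" + "02.10" + "0000000000"
    print(parse_grh(base)["sd_type"])
    print(parse_grh(base + "AB")["sd_type"])
```

A row that has been right-trimmed to fewer than 26 characters would still fail here, since the sketch treats the batch request as fixed width.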

Sorry it is taking so long, but I do this in my spare time.

Bernardo-MG commented 7 years ago

After taking a better look, the problem won't be solved in the short term. There are some problems with the grammar used by the parser and acknowledgement files, which are related to this issue.