dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io

ERROR :WAITING-FOR-NEXT fell through ECASE expression #313

Closed dominicmauro closed 8 years ago

dominicmauro commented 8 years ago

The specific error I get is:

2015-11-24T16:56:16.654000-05:00 ERROR :WAITING-FOR-NEXT fell through ECASE expression. Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).

Here's a log of the operation, with the first 84,000 rows of my CSV snipped out, and with --verbose and --debug enabled. And here's a sample of the CSV itself, in case this is a problem with malformed data, again. (I have ugly, ugly data.)

Running this on OS X 10.11, with pgloader version 3.2.0 compiled with SBCL 1.2.15.

jqnatividad commented 8 years ago

Have you tried checking your CSV with http://csvlint.io/?

dominicmauro commented 8 years ago

The full CSV is too large to validate with that service (2GB), but the half-dozen rows where the parser stopped (from above) were successfully validated.

dimitri commented 8 years ago

Also, 3.2.0 is from January, so please try either 3.2.2 or compiling pgloader from the most recent sources. We have been fixing some CSV parsing infelicities this year, so picking up a more recent version might fix your problem here.

dominicmauro commented 8 years ago

Sorry, that was a rookie mistake. I downloaded from GitHub and compiled from source this morning. pgloader describes itself as "3.2.1~devel" and was still compiled with SBCL 1.2.16.

I'm still seeing the same error on the same line, unfortunately.

dimitri commented 8 years ago

I guess you still got a GitHub zip file from an old release. Can you please confirm that you did a git clone? If not, please do that and then compile from the current git sources.

dominicmauro commented 8 years ago

Okay, that was a rookie mistake. But I still get the same error in the same spot.

2015-11-25T14:51:46.213000-05:00 ERROR :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.017s                     
      before load          2          2          0          0.041s                     
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
       campaign15      84643      84643          0          4.018s     3.804s    7.789s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          0          4.428s     3.804s    7.789s

➜  data  pgloader --version
pgloader version "3.2.533a49a"
compiled with SBCL 1.2.16

dimitri commented 8 years ago

Can you reduce it to the couple of lines that introduce the failure, and also double check the CSV options available to drive the parsing? To help here I will need a file to reproduce the bug and the LOAD command you are using.
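One way to narrow a large file down to a handful of candidate lines is to run each line through a strict CSV parser and keep only the ones it rejects. The following is a minimal sketch, not part of pgloader: the `suspicious_lines` helper is a hypothetical name, it relies on Python's `csv` module with `strict=True`, and it assumes one record per physical line (no newlines embedded in fields). The sample rows are shortened versions of lines from this data set.

```python
import csv


def suspicious_lines(lines):
    """Yield (line_number, line) for each line a strict CSV parser rejects.

    Assumption: every record fits on one physical line.
    """
    for number, line in enumerate(lines, start=1):
        try:
            # strict=True makes the csv module raise instead of guessing
            for _ in csv.reader([line], strict=True):
                pass
        except csv.Error:
            yield number, line.rstrip("\r\n")


# Shortened sample rows; the second has unescaped quotes inside a field.
rows = [
    '"A00157","J","PO BOX 9108"\n',
    '"A00157","J","TRIBUTE TO JAMES "JIM" O\'HARE"\n',
]

for number, line in suspicious_lines(rows):
    print(number, line)
```

For a real file, the same generator can be fed an open file object, since iterating over a text file yields its lines. Python's strictness rules are not the same as pgloader's parser, so the flagged lines are only candidates.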

dominicmauro commented 8 years ago

I can't seem to reproduce the error with a few lines; not sure why that's the case. However, here's a zip file with the LOAD command I'm using and the first hundred thousand rows of the CSV I'm working with. pgloader LOADs the first 84643 rows before crashing with the "WAITING-FOR-NEXT fell through ECASE expression" error.

dimitri commented 8 years ago

Here's what I get with your test case:

CL-USER> (pgloader:run-commands "/Users/dim/dev/temp/313/contributions.load"
                                :client-min-messages :warning)
2015-11-27T11:41:25.000000+01:00 LOG Parsing commands from file #P"/Users/dim/dev/temp/313/contributions.load"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 180
"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2000","1595","03/03/1900","","","",""BEST SALINA GOVERNMENT"","","","","118 HARDING AVE. SOUTH","LIVERPOOL","NY","13088","1231","","25","","","","CNTRB","","","","","CF","07/10/1900 09:51:29"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 72
"A00157","J","F","2006","1780","12/06/2005","","","","TRIBUTE TO JAMES "JIM" O'HARE","","","","PO BOX 9108","ALBANY","NY","12209","1492","","65","","","","CNTRB","","","","","CF","01/06/2006 11:33:20"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2009","2083","07/01/2009","","","",""JENNINGS 2009"","","","","PO BOX 7103","ALBANY","NY","12224","1630","","2500","","","","CNTRB","","","","","CF","07/13/2009 14:50:01"
2015-11-27T11:41:25.818000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 76
"A00166","K","G","1999","1553","01/12/1999","","","","OPEIU LOCAL 153 VOTE "VOICE OF THE ELECTORATE"","","","","265 WEST 14TH STREET","NEW YORK","NY","10011","","","5880.4","","","","","","","2","","GLB","07/12/1999 17:18:13"
2015-11-27T11:41:26.823000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 188
"A00191","F","F","2008","14","10/21/2008","","","","PATHFINDER COMMUNICATIONS","","","","603 SWEDESFORD ROAD","MALVERN","PA","19355","4615","","11100","","","","CNTRB","","08-NC-002_AD17 "MCKEVITT2"","","","","10/21/2008 13:25:21"
2015-11-27T11:41:28.026000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 79
"A00193","K","B","2007","6450","06/28/2007","","CORP","","NY PAWNBROKERS INC. "A"","","","","C/O 350 NORTHERN BLVD.","ALBANY","NY","12204","1882","","2500","0","","","","","","","","","07/13/2007 00:00:00"
2015-11-27T11:41:29.028000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 185
"A00243","K","F","1999","1587","04/16/1999","","","","D.A.C.C","","","","107 WASHINGTON AVE-SUITE 1LL","ALBANY","NY","12210","2877","","125","","","","CNTRB","","'SPRING FLING"","","","A00243","08/25/1999 20:25:03"
2015-11-27T11:41:29.231000+01:00 ERROR :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.003s                     
      before load          2          2          0          0.037s                     
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
          sample2      84643      84643          0          4.032s     3.885s    7.941s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          0          4.232s     3.885s    7.941s

Given the line in context of the first error being:

"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"

This strikes me as the best we can do, because the field "LUNCH "MEET & GREET" COUNTY LEGISLAT" does not properly escape its embedded double quotes.
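If the source data cannot be fixed upstream, one option is to pre-process the file so that embedded quotes are doubled before loading. Here is a rough sketch in Python under strong assumptions: every field is wrapped in double quotes, the separator is always `","`, no field contains that three-character sequence, and no quote in the input is already doubled. The `repair_line` and `parse` helpers are hypothetical names, and the sample is a shortened version of the line above.

```python
import csv
import io

SEP = '","'  # assumption: every field is quoted and comma-separated


def repair_line(line: str) -> str:
    """Double any bare double quote found inside a fully quoted field."""
    body = line.rstrip("\r\n")
    if not (body.startswith('"') and body.endswith('"')) or len(body) < 2:
        return body  # not in the expected shape; leave it untouched
    fields = body[1:-1].split(SEP)
    # Assumption: no quote in the input is already doubled.
    escaped = [field.replace('"', '""') for field in fields]
    return '"' + SEP.join(escaped) + '"'


def parse(line: str) -> list[str]:
    """Parse one CSV record with Python's standard csv module."""
    return next(csv.reader(io.StringIO(line)))


# Shortened version of the failing line from the log.
sample = '"A00151","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","TH"'
fixed = repair_line(sample)
print(fixed)
print(parse(fixed))
```

This is a heuristic, not a general CSV repair tool: a field whose content legitimately contains `","` would be split in the wrong place, so it is worth checking the field count of each repaired line against the expected number of columns.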

dominicmauro commented 8 years ago

I tried the new version on my sample data (from above) and got the following new error:

➜  data  pgloader cc.load
2015-12-01T11:26:36.022000-05:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2015-12-01T11:26:36.025000-05:00 LOG Data errors in '/private/tmp/pgloader/'
2015-12-01T11:26:36.025000-05:00 LOG Parsing commands from file #P"/Users/dom/data/cc.load"

debugger invoked on a UNBOUND-SLOT in thread
#<THREAD "lparallel" RUNNING {10091240B3}>:
  The slot PGLOADER.CONNECTION:PATH is unbound in the object
  #<CSV-CONNECTION csv://FILENAME: {1008440223}>.
^CAn unhandled error condition has been signalled:
   Interactive interrupt at #x7FFF9A6322B2.

pgloader appears to hang at the #<CSV-CONNECTION csv://FILENAME: {1008440223}>. line, and I eventually have to Ctrl-C to move on.

dimitri commented 8 years ago

So I just tried again here and it went just fine. Can you provide both --version and --debug level output please?

./build/bin/pgloader /Users/dim/dev/temp/313/contributions.load
2015-12-02T12:50:41.047000+01:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2015-12-02T12:50:41.052000+01:00 LOG Data errors in '/private/tmp/pgloader/'
2015-12-02T12:50:41.052000+01:00 LOG Parsing commands from file #P"/Users/dim/dev/temp/313/contributions.load"
2015-12-02T12:50:41.463000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 180
"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"
2015-12-02T12:50:41.463000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2000","1595","03/03/1900","","","",""BEST SALINA GOVERNMENT"","","","","118 HARDING AVE. SOUTH","LIVERPOOL","NY","13088","1231","","25","","","","CNTRB","","","","","CF","07/10/1900 09:51:29"
2015-12-02T12:50:41.464000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 72
"A00157","J","F","2006","1780","12/06/2005","","","","TRIBUTE TO JAMES "JIM" O'HARE","","","","PO BOX 9108","ALBANY","NY","12209","1492","","65","","","","CNTRB","","","","","CF","01/06/2006 11:33:20"
2015-12-02T12:50:41.464000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2009","2083","07/01/2009","","","",""JENNINGS 2009"","","","","PO BOX 7103","ALBANY","NY","12224","1630","","2500","","","","CNTRB","","","","","CF","07/13/2009 14:50:01"
2015-12-02T12:50:41.664000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 76
"A00166","K","G","1999","1553","01/12/1999","","","","OPEIU LOCAL 153 VOTE "VOICE OF THE ELECTORATE"","","","","265 WEST 14TH STREET","NEW YORK","NY","10011","","","5880.4","","","","","","","2","","GLB","07/12/1999 17:18:13"
2015-12-02T12:50:42.487000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 188
"A00191","F","F","2008","14","10/21/2008","","","","PATHFINDER COMMUNICATIONS","","","","603 SWEDESFORD ROAD","MALVERN","PA","19355","4615","","11100","","","","CNTRB","","08-NC-002_AD17 "MCKEVITT2"","","","","10/21/2008 13:25:21"
2015-12-02T12:50:43.289000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 79
"A00193","K","B","2007","6450","06/28/2007","","CORP","","NY PAWNBROKERS INC. "A"","","","","C/O 350 NORTHERN BLVD.","ALBANY","NY","12204","1882","","2500","0","","","","","","","","","07/13/2007 00:00:00"
2015-12-02T12:50:43.892000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 185
"A00243","K","F","1999","1587","04/16/1999","","","","D.A.C.C","","","","107 WASHINGTON AVE-SUITE 1LL","ALBANY","NY","12210","2877","","125","","","","CNTRB","","'SPRING FLING"","","","A00243","08/25/1999 20:25:03"
2015-12-02T12:50:43.893000+01:00 FATAL :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.010s                     
      before load          2          2          0          0.023s                     
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
          sample2      84643      84643          9          3.077s     2.702s    0.672s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          9          3.303s     2.702s    0.672s

dimitri commented 8 years ago

Closing the issue to clean up. Consider re-opening if it's still a problem for you.