Closed dominicmauro closed 8 years ago
Have you tried checking your CSV with http://csvlint.io/?
The full CSV is too large to validate with that service (2GB), but the half-dozen rows where the parser stopped (from above) were successfully validated.
Also, 3.2.0 is from January, so please try either 3.2.2 or compiling pgloader from the most recent sources. We have been fixing some CSV parsing infelicities this year, so just picking up a more recent version might fix your problem here.
Sorry, that was a rookie mistake. Downloaded from Github and compiled from source this morning. pgloader describes itself as "3.2.1~devel" and was still compiled with SBCL 1.2.16.
I'm still seeing the same error on the same line, unfortunately.
I guess you still got a GitHub zip file from an old release. Can you please confirm you did a git clone, or make sure you actually do that and then compile from the current git sources?
Okay, that was a rookie mistake. But I still get the same error in the same spot.
2015-11-25T14:51:46.213000-05:00 ERROR :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.017s
      before load          2          2          0          0.041s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
       campaign15      84643      84643          0          4.018s     3.804s     7.789s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          0          4.428s     3.804s     7.789s
➜ data pgloader --version
pgloader version "3.2.533a49a"
compiled with SBCL 1.2.16
Can you reduce it to the couple of lines that introduce the failure, and also double check the CSV options available to drive the parsing? To help here I will need a file to reproduce the bug and the LOAD command you are using.
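One possible way to narrow a large file down to candidate lines is to run each line through a strict CSV parser and keep only the ones it rejects. The sketch below uses Python's standard csv module, not pgloader's own parser, so the two may disagree on edge cases. The helper name find_bad_lines is made up for illustration, and the code assumes one record per physical line (no newlines inside quoted fields).

```python
import csv


def find_bad_lines(lines):
    """Yield (line_number, text) for each line that strict CSV parsing rejects.

    Assumption: every record sits on a single physical line.
    """
    for number, text in enumerate(lines, start=1):
        try:
            # strict=True makes the reader raise csv.Error on malformed
            # quoting instead of silently guessing at the field contents.
            next(csv.reader([text], strict=True))
        except csv.Error:
            yield number, text.rstrip("\n")
```

Passing an open file object (opened with newline="") as lines streams the file one line at a time, so a 2GB input should not need to fit in memory.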
I can't seem to reproduce the error with a few lines; not sure why that's the case. However, here's a zip file with the LOAD command I'm using and the first hundred thousand rows of the CSV I'm working with. pgloader loads the first 84643 rows before crashing with the "WAITING-FOR-NEXT fell through ECASE expression" error.
Here's what I get with your test case:
CL-USER> (pgloader:run-commands "/Users/dim/dev/temp/313/contributions.load"
:client-min-messages :warning)
2015-11-27T11:41:25.000000+01:00 LOG Parsing commands from file #P"/Users/dim/dev/temp/313/contributions.load"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 180
"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2000","1595","03/03/1900","","","",""BEST SALINA GOVERNMENT"","","","","118 HARDING AVE. SOUTH","LIVERPOOL","NY","13088","1231","","25","","","","CNTRB","","","","","CF","07/10/1900 09:51:29"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 72
"A00157","J","F","2006","1780","12/06/2005","","","","TRIBUTE TO JAMES "JIM" O'HARE","","","","PO BOX 9108","ALBANY","NY","12209","1492","","65","","","","CNTRB","","","","","CF","01/06/2006 11:33:20"
2015-11-27T11:41:25.618000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2009","2083","07/01/2009","","","",""JENNINGS 2009"","","","","PO BOX 7103","ALBANY","NY","12224","1630","","2500","","","","CNTRB","","","","","CF","07/13/2009 14:50:01"
2015-11-27T11:41:25.818000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 76
"A00166","K","G","1999","1553","01/12/1999","","","","OPEIU LOCAL 153 VOTE "VOICE OF THE ELECTORATE"","","","","265 WEST 14TH STREET","NEW YORK","NY","10011","","","5880.4","","","","","","","2","","GLB","07/12/1999 17:18:13"
2015-11-27T11:41:26.823000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 188
"A00191","F","F","2008","14","10/21/2008","","","","PATHFINDER COMMUNICATIONS","","","","603 SWEDESFORD ROAD","MALVERN","PA","19355","4615","","11100","","","","CNTRB","","08-NC-002_AD17 "MCKEVITT2"","","","","10/21/2008 13:25:21"
2015-11-27T11:41:28.026000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 79
"A00193","K","B","2007","6450","06/28/2007","","CORP","","NY PAWNBROKERS INC. "A"","","","","C/O 350 NORTHERN BLVD.","ALBANY","NY","12204","1882","","2500","0","","","","","","","","","07/13/2007 00:00:00"
2015-11-27T11:41:29.028000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 185
"A00243","K","F","1999","1587","04/16/1999","","","","D.A.C.C","","","","107 WASHINGTON AVE-SUITE 1LL","ALBANY","NY","12210","2877","","125","","","","CNTRB","","'SPRING FLING"","","","A00243","08/25/1999 20:25:03"
2015-11-27T11:41:29.231000+01:00 ERROR :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.003s
      before load          2          2          0          0.037s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
          sample2      84643      84643          0          4.032s     3.885s     7.941s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          0          4.232s     3.885s     7.941s
Given the line in context of the first error being:
"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"
It strikes me as the best we can do, because the field "LUNCH "MEET & GREET" COUNTY LEGISLAT" does not properly escape its inner double quotes.
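To illustrate the escaping problem, here is a small sketch using Python's standard csv module (not pgloader, whose parser and options differ). It shortens the first rejected row to four fields, then compares the field as it appears in the file with what a conforming writer emits, where each inner double quote is doubled. The helper name parse_strict is hypothetical.

```python
import csv
import io

# Field value taken from the first rejected row in the log above.
value = 'LUNCH "MEET & GREET" COUNTY LEGISLAT'

# As found in the source file: the inner quotes are left unescaped.
malformed = '"OTHER","","' + value + '","TH"\n'

# What a conforming CSV writer produces: inner quotes are doubled.
buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerow(["OTHER", "", value, "TH"])
well_formed = buf.getvalue()


def parse_strict(line):
    """Return the parsed fields, or None if strict parsing rejects the line."""
    try:
        return next(csv.reader([line], strict=True))
    except csv.Error:
        return None


print(well_formed, end="")       # the row with doubled inner quotes
print(parse_strict(well_formed))  # four fields, original value restored
print(parse_strict(malformed))    # None: text follows a closing quote
```

The malformed row fails for the same reason the log reports: the parser sees the quote before MEET as the end of the field, then finds more characters before a separator.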
I tried the new version on my sample data (from above) and get the following new error:
➜ data pgloader cc.load
2015-12-01T11:26:36.022000-05:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2015-12-01T11:26:36.025000-05:00 LOG Data errors in '/private/tmp/pgloader/'
2015-12-01T11:26:36.025000-05:00 LOG Parsing commands from file #P"/Users/dom/data/cc.load"
debugger invoked on a UNBOUND-SLOT in thread
#<THREAD "lparallel" RUNNING {10091240B3}>:
The slot PGLOADER.CONNECTION:PATH is unbound in the object
#<CSV-CONNECTION csv://FILENAME: {1008440223}>.
^CAn unhandled error condition has been signalled:
Interactive interrupt at #x7FFF9A6322B2.
pgloader appears to hang at the #<CSV-CONNECTION csv://FILENAME: {1008440223}>. line, and I eventually have to Ctrl-C to move on.
So I just tried again here and it went just fine. Can you provide both --version and --debug level output please?
./build/bin/pgloader /Users/dim/dev/temp/313/contributions.load
2015-12-02T12:50:41.047000+01:00 LOG Main logs in '/private/tmp/pgloader/pgloader.log'
2015-12-02T12:50:41.052000+01:00 LOG Data errors in '/private/tmp/pgloader/'
2015-12-02T12:50:41.052000+01:00 LOG Parsing commands from file #P"/Users/dim/dev/temp/313/contributions.load"
2015-12-02T12:50:41.463000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 180
"A00151","E","F","2007","1573","10/17/2007","","","","AMERICAN FOOD & VENDING CORP","","","","197 FRANKLIN ST","AUBURN","NY","13021","CASH","","198.75","","","","OTHER","","LUNCH "MEET & GREET" COUNTY LEGISLAT","","","TH","10/23/2007 09:15:24"
2015-12-02T12:50:41.463000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2000","1595","03/03/1900","","","",""BEST SALINA GOVERNMENT"","","","","118 HARDING AVE. SOUTH","LIVERPOOL","NY","13088","1231","","25","","","","CNTRB","","","","","CF","07/10/1900 09:51:29"
2015-12-02T12:50:41.464000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 72
"A00157","J","F","2006","1780","12/06/2005","","","","TRIBUTE TO JAMES "JIM" O'HARE","","","","PO BOX 9108","ALBANY","NY","12209","1492","","65","","","","CNTRB","","","","","CF","01/06/2006 11:33:20"
2015-12-02T12:50:41.464000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 55
"A00157","K","F","2009","2083","07/01/2009","","","",""JENNINGS 2009"","","","","PO BOX 7103","ALBANY","NY","12224","1630","","2500","","","","CNTRB","","","","","CF","07/13/2009 14:50:01"
2015-12-02T12:50:41.664000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 76
"A00166","K","G","1999","1553","01/12/1999","","","","OPEIU LOCAL 153 VOTE "VOICE OF THE ELECTORATE"","","","","265 WEST 14TH STREET","NEW YORK","NY","10011","","","5880.4","","","","","","","2","","GLB","07/12/1999 17:18:13"
2015-12-02T12:50:42.487000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 188
"A00191","F","F","2008","14","10/21/2008","","","","PATHFINDER COMMUNICATIONS","","","","603 SWEDESFORD ROAD","MALVERN","PA","19355","4615","","11100","","","","CNTRB","","08-NC-002_AD17 "MCKEVITT2"","","","","10/21/2008 13:25:21"
2015-12-02T12:50:43.289000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 79
"A00193","K","B","2007","6450","06/28/2007","","CORP","","NY PAWNBROKERS INC. "A"","","","","C/O 350 NORTHERN BLVD.","ALBANY","NY","12204","1882","","2500","0","","","","","","","","","07/13/2007 00:00:00"
2015-12-02T12:50:43.892000+01:00 ERROR We finished reading a quoted value and got more characters before a separator or EOL 185
"A00243","K","F","1999","1587","04/16/1999","","","","D.A.C.C","","","","107 WASHINGTON AVE-SUITE 1LL","ALBANY","NY","12210","2877","","125","","","","CNTRB","","'SPRING FLING"","","","A00243","08/25/1999 20:25:03"
2015-12-02T12:50:43.893000+01:00 FATAL :WAITING-FOR-NEXT fell through ECASE expression.
Wanted one of (:WAITING :COLLECTING-QUOTED :COLLECTING).
       table name       read   imported     errors      total time       read      write
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
            fetch          0          0          0          0.010s
      before load          2          2          0          0.023s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
          sample2      84643      84643          9          3.077s     2.702s     0.672s
-----------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time      84643      84643          9          3.303s     2.702s     0.672s
Closing the issue to clean-up. Consider re-opening if it's still a problem for you.
The specific error I get is:
Here's a log of the operation, with the first 84,000 rows of my CSV snipped out, and with --verbose and --debug enabled. And here's a sample of the CSV itself, in case this is a problem with malformed data, again. (I have ugly, ugly data.)
Running this on OS X 10.11, with pgloader version 3.2.0 compiled with SBCL 1.2.15.