usbrandon closed this issue 5 years ago
We should try with double quotes around the offending data. There is no formal CSV file standard, but if there were, the Commons CSV code would be very close to it.
Working with the last patch, give it a shot.
2525114547,N/A335,335,N/A
2525114547,N/A1035,1035,N/A
2525114547,N/A1062,1062,N/A
2525114547,N/A1227,1227,N/A
2525114547,"N/A1560
N/A1634
N/A5074
N/A5023",1560,N/A
2525114547,N/A1634,1634,N/A
2525114547,N/A5074,5074,N/A
2525114547,N/A5023,5023,N/A
2525114547,N/A5030,5030,N/A
2525114547,N/A5037,5037,N/A
2525114547,N/A5043,5043,N/A
2525114547,N/A5045,5045,N/A
2525114547,N/A5047,5047,N/A
2525114547,N/A5013,5013,N/A
2525114547,N/A5014,5014,N/A
2525114547,N/A5015,5015,N/A
2525114547,N/A5017,5017,N/A
Not fixed yet. Please try with this CSV. I had to rename it to .txt to make GitHub happy: PDI-17034_checksum_case-Input.txt
-- I get an exception with the message "2" (an ArrayIndexOutOfBoundsException, per the trace below) when trying to view or run this dataset.
org.pentaho.di.core.exception.KettleException: Unable to get all rows for CSV data set 'PDI-17034_checksum_case-input' 2
at org.pentaho.di.dataset.DataSetCsvGroup.getAllRows(DataSetCsvGroup.java:131)
at org.pentaho.di.dataset.DataSetGroup.getAllRows(DataSetGroup.java:113)
at org.pentaho.di.dataset.DataSet.getAllRows(DataSet.java:144)
at org.pentaho.di.dataset.spoon.dialog.DataSetDialog.viewData(DataSetDialog.java:566)
at org.pentaho.di.dataset.spoon.dialog.DataSetDialog$7.handleEvent(DataSetDialog.java:337)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.dataset.spoon.dialog.DataSetDialog.open(DataSetDialog.java:369)
at org.pentaho.di.dataset.spoon.DataSetHelper.editDataSet(DataSetHelper.java:376)
at org.pentaho.di.dataset.spoon.DataSetHelper.editDataSet(DataSetHelper.java:365)
at org.pentaho.di.dataset.spoon.xtpoint.ShowUnitTestMenuExtensionPoint.lambda$callExtensionPoint$5(ShowUnitTestMenuExtensionPoint.java:101)
at org.eclipse.swt.widgets.EventTable.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Widget.sendEvent(Unknown Source)
at org.eclipse.swt.widgets.Display.runDeferredEvents(Unknown Source)
at org.eclipse.swt.widgets.Display.readAndDispatch(Unknown Source)
at org.pentaho.di.ui.spoon.Spoon.readAndDispatch(Spoon.java:1381)
at org.pentaho.di.ui.spoon.Spoon.waitForDispose(Spoon.java:7817)
at org.pentaho.di.ui.spoon.Spoon.start(Spoon.java:9179)
at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:707)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.commons.csv.CSVRecord.get(CSVRecord.java:79)
at org.pentaho.di.dataset.DataSetCsvGroup.getAllRows(DataSetCsvGroup.java:121)
... 25 more
Sorry, Brandon, why would you think that simply copying any CSV file into the datasets would work? Please only create datasets with the plugin. Alternatively, apply proper quoting in the file.
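For anyone preparing such a file by hand, "proper quoting" can be sketched roughly as follows. This is a minimal illustration only: `CsvQuoting`, `quote` and `toRecord` are hypothetical names, not part of the plugin or of Commons CSV, and the rules assumed here are the common convention (wrap a field in double quotes if it contains a comma, a double quote or a line break, and double any embedded quotes).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not the plugin's or Commons CSV's code.
final class CsvQuoting {

    // Quote a field only when it contains a comma, a double quote, or a line break.
    static String quote(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are escaped by doubling them.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Quote each field as needed and join them into one CSV record.
    static String toRecord(List<String> fields) {
        return fields.stream().map(CsvQuoting::quote).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The multi-line value stays inside one quoted field.
        System.out.println(toRecord(Arrays.asList(
                "2525114547", "N/A1560\nN/A1634", "1560", "N/A")));
    }
}
```

With the multi-line value quoted this way, a quote-aware reader should see a single four-column record rather than several short ones.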
Plugin Version 3.4.2
In the ktr attached to the JIRA below, there is a Data Grid step where one input row has carriage returns within one of its columns. This appears to confuse the CSV reader used for the unit test, which gives up and refuses to import the data. Removing the offending row solves the issue.
https://jira.pentaho.com/browse/PDI-17034
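To illustrate why embedded line breaks are a problem, here is a deliberately naive sketch that counts comma-separated columns per physical line when the multi-line value is written without quotes. It is an assumption about the failure mode, not a reproduction of how Commons CSV or the plugin parses the file, and `RaggedRows` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical illustration; a naive model, not the plugin's actual parsing logic.
final class RaggedRows {

    // Count comma-separated columns on each physical line of the input.
    static int[] columnsPerLine(String csv) {
        String[] lines = csv.split("\r?\n");
        int[] counts = new int[lines.length];
        for (int i = 0; i < lines.length; i++) {
            // Limit -1 keeps trailing empty columns in the count.
            counts[i] = lines[i].split(",", -1).length;
        }
        return counts;
    }

    public static void main(String[] args) {
        // One logical row whose second column contains unquoted line breaks.
        String unquoted = "2525114547,N/A1560\nN/A1634\nN/A5074\nN/A5023,1560,N/A";
        System.out.println(Arrays.toString(columnsPerLine(unquoted))); // prints [2, 1, 1, 3]
    }
}
```

Under this model the first physical line has only two columns, so reading the column at index 2 would fail. That would be consistent with the ArrayIndexOutOfBoundsException: 2 reported in the trace, although the trace alone does not confirm which row triggered it.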
Console output of the CSV file used for input.