Closed omerlh closed 5 months ago
Any update on this issue?
Please look at this commit. I tried to reproduce your error, but it runs fine on my side: the error is caught and passed to the "skip" event.
Here is code that reproduces the issue on my side.
import path from 'path';
import { pipeline } from 'stream/promises';
import { parse as parseCSV } from 'csv-parse';
import { Writable } from 'stream';
import { createReadStream } from 'fs';

async function testRecordsSkip() {
  const errors: any = [];
  const records: any = [];
  const sink = new Writable({
    objectMode: true,
    write: (chunk, _encoding, callback) => {
      records.push(chunk);
      callback();
    },
  });
  const csvSource = createReadStream(path.join(__dirname, 'test_with_bom.csv'));
  const parser = parseCSV({
    skip_records_with_error: true,
    bom: true,
  });
  parser.on('skip', function (err) {
    errors.push(err);
  });
  await pipeline(csvSource, parser, sink);
  console.log({
    records,
    errors,
  });
}

testRecordsSkip().catch(console.error);
The output is:
{
  records: [
    [ 'id', 'first_name', 'last_name', 'email', 'modified_at' ],
    [ '1', 'Ring', 'Grinyov', 'rgrinyov0@weebly.com', '2022-02-14' ],
    [ '3', 'Cammi', 'Bendix', 'cbendix2@tuttocitta.it', '2022-02-14' ]
  ],
  errors: []
}
The test file is here: test_with_bom.csv
I tried again with your code and the output is still what we expect:
{
  records: [
    [ 'id', 'first_name', 'last_name', 'email', 'modified_at' ],
    [ '1', 'Ring', 'Grinyov', 'rgrinyov0@weebly.com', '2022-02-14' ],
    [ '3', 'Cammi', 'Bendix', 'cbendix2@tuttocitta.it', '2022-02-14' ]
  ],
  errors: [
    CsvError: Invalid Record Length: expect 5, got 6 on line 3
        at Object.__onRecord (file:///Users/david/projects/open-source/csv/csv/packages/csv-parse/lib/api/index.js:364:11)
        at Object.parse (file:///Users/david/projects/open-source/csv/csv/packages/csv-parse/lib/api/index.js:247:40)
        at Parser._transform (file:///Users/david/projects/open-source/csv/csv/packages/csv-parse/lib/index.js:31:26)
        at Transform._write (node:internal/streams/transform:175:8)
        at writeOrBuffer (node:internal/streams/writable:392:12)
        at _write (node:internal/streams/writable:333:10)
        at Writable.write (node:internal/streams/writable:337:10)
        at ReadStream.ondata (node:internal/streams/readable:766:22)
        at ReadStream.emit (node:events:514:28)
        at addChunk (node:internal/streams/readable:324:12) {
      code: 'CSV_RECORD_INCONSISTENT_FIELDS_LENGTH',
      bytes: 141,
      comment_lines: 0,
      empty_lines: 0,
      invalid_field_length: 0,
      lines: 3,
      records: 2,
      columns: false,
      error: undefined,
      header: false,
      index: 6,
      raw: undefined,
      column: 6,
      quoting: false,
      record: [Array]
    }
  ]
}
Note, tested with Node.js versions "v16.13.0", "v18.17.1", "v20.5.1" and "v21.5.0".
I replaced your file with the one I shared and it doesn't work, but when I test with the one from the link you shared, it works.
When I check the files, I see the following, which suggests that your file doesn't contain a BOM?
alex@undefined:~/WebstormProjects/test$ cat src/411.csv | file -
/dev/stdin: ASCII text
alex@undefined:~/WebstormProjects/test$ cat src/test_with_bom.csv | file -
/dev/stdin: Unicode text, UTF-8 (with BOM) text
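The same check `file` performs above can be done in code. A minimal sketch, assuming a Node.js environment (`hasUtf8Bom` is a hypothetical helper, not part of csv-parse), that tests whether a buffer starts with the UTF-8 BOM bytes EF BB BF:

```typescript
// Hypothetical helper: detect the UTF-8 byte-order mark at the start of a buffer.
function hasUtf8Bom(buf: Buffer): boolean {
  return buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf;
}

// '\ufeff' encodes to the three bytes EF BB BF in UTF-8.
console.log(hasUtf8Bom(Buffer.from('\ufeffkey,value', 'utf8'))); // true
console.log(hasUtf8Bom(Buffer.from('key,value', 'utf8'))); // false
```

Reading the first few bytes of each file with such a helper would confirm which of the two test files actually carries a BOM.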
So here is a script that should reproduce the issue on your side:
import assert from 'node:assert';
import { Writable } from 'node:stream';
import { parse } from 'csv-parse';
import { finished } from 'stream/promises';
import { Readable } from 'stream';

async function a() {
  const errors: any = [];
  const parser = parse({
    bom: true,
    skipRecordsWithError: true,
  });
  // Create a stream and consume its source
  const sink = new Writable({
    objectMode: true,
    write: (_, __, callback) => callback(),
  });
  const csv = `\ufeffkey,value
1,value-1
2,value-2
3,"value-3
one more line
and one more line"`;
  const csvSource = Readable.from([csv]);
  const outStream = csvSource.pipe(parser).pipe(sink);
  // Catch records with errors
  parser.on('skip', (e) => {
    errors.push(e);
  });
  // Wait for the stream to be consumed
  await finished(outStream);
  // Expect one error from the skip event
  assert.equal(errors.length, 1);
}

a().catch(console.error);
My bad, I forgot that the issue was associated with the `bom` option. Version "5.5.6" of csv-parse fixes the issue.
Describe the bug
Setting both options results in the CSV file being parsed correctly, but no event is fired for the skipped lines.
To Reproduce
I would expect the console error handler to be called, but it isn't. When removing the `bom` option and using something like strip-bom-stream instead, the event is fired correctly.
Additional context
data-with-bom.csv
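The strip-bom-stream workaround mentioned above can also be hand-rolled with a plain Transform. A minimal sketch, assuming a binary (non-objectMode) stream of Buffers (`stripBom` is my own stand-in, not the package's API):

```typescript
import { Transform } from 'node:stream';

// Hand-rolled stand-in for strip-bom-stream: drops a leading UTF-8 BOM
// (bytes EF BB BF) from the first chunk, then passes everything through.
function stripBom(): Transform {
  let first = true;
  return new Transform({
    transform(chunk: Buffer, _encoding, callback) {
      if (first) {
        first = false;
        if (chunk.length >= 3 && chunk[0] === 0xef && chunk[1] === 0xbb && chunk[2] === 0xbf) {
          chunk = chunk.subarray(3);
        }
      }
      callback(null, chunk);
    },
  });
}
```

It would be piped in front of the parser, e.g. `csvSource.pipe(stripBom()).pipe(parser)`, so the `bom` option can be left off entirely.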