leeoniya / uDSV

A faster CSV parser in 5KB (min)
MIT License

Invalid parsing of empty numeric columns at end of line #13

Status: Closed (closed by mattosborn 2 weeks ago)

mattosborn commented 2 weeks ago

If the last column in a line is numeric and empty, and the line break is \r\n, the column is parsed as 0 instead of null / empty.

import { inferSchema, initParser } from "udsv";

const csvStr = `
id,col1,col2
a,1,2
b,2,
c,3,\r\nd,4,\ne,5,
`.trim();

const schema = inferSchema(csvStr);
const parser = initParser(schema);

const data = parser.typedDeep(csvStr);

This will parse to:

[
  { id: 'a', col1: 1, col2: 2 },
  { id: 'b', col1: 2, col2: null },
  { id: 'c', col1: 3, col2: 0 },
  { id: 'd', col1: 4, col2: null },
  { id: 'e', col1: 5, col2: null }
]

leeoniya commented 2 weeks ago

replying from phone, so cannot test, but i see you're inter-mixing line delimiters between \n and \r\n. this is currently not supported, so could be causing the issue. can you check if this works properly with uniform line delimiters?

leeoniya commented 2 weeks ago

yes, this appears to be an issue with mixed line endings. i don't plan to support this.

with uniform line endings it works correctly: https://jsfiddle.net/ew5xcb0s/
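
If the input can't be fixed at the source, one possible workaround is to normalize the line endings before handing the string to the parser. The sketch below is an assumption, not something uDSV provides: `normalizeLineEndings` is a hypothetical helper name, and it has not been tested against uDSV here. It rewrites every \r\n (and any lone \r) to \n so that only one row delimiter remains.

```javascript
// Hypothetical helper (not part of uDSV): rewrite every \r\n or lone \r to \n
// so the input uses a single, uniform row delimiter.
function normalizeLineEndings(str) {
  return str.replace(/\r\n?/g, "\n");
}

// Same data as in the report above, with mixed \n and \r\n line endings.
const mixed = "id,col1,col2\na,1,2\nb,2,\nc,3,\r\nd,4,\ne,5,";

const uniform = normalizeLineEndings(mixed);

// JSON.stringify makes the remaining line-break characters visible.
console.log(JSON.stringify(uniform));
```

The normalized string can then be passed to inferSchema and parser.typedDeep in place of the original csvStr. One caveat: this is a blunt string replace, so it also rewrites any \r\n that sits inside a quoted field, and it copies the whole input once, which may matter for very large files.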