Without the fix, the new tests fail with invalid output:
This is because of a hack in my earlier implementation: since we can't know the number of lines in a CSV before reading it, we can't know the size of the output unless we read the input twice or hold the entire thing in memory, which I think is undesirable.
Previously I just reported the size as 0, because the size was only used to optimize runtime performance, but #160 expects the value it's given to be accurate (which is reasonable) in order to produce valid output.
Another approach might be to make `size` some kind of enum that explicitly distinguishes between known and unknown sizes, but that makes the API less ergonomic for all the other processors, which do know their size.
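
For reference, here is a rough sketch of what that enum alternative might look like. It assumes Rust, and every name in it (`OutputSize`, `Processor`, `FixedProcessor`, `CsvProcessor`) is made up for illustration rather than taken from the actual codebase.

```rust
/// Hypothetical size report for a processor's output (not the project's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputSize {
    /// The processor knows exactly how many items it will emit.
    Known(usize),
    /// The count cannot be determined without consuming the input,
    /// e.g. a CSV reader that has not reached the end of the file yet.
    Unknown,
}

impl OutputSize {
    /// Capacity to preallocate: the exact count when known, otherwise 0.
    /// Only suitable as a performance hint.
    pub fn capacity_hint(self) -> usize {
        match self {
            OutputSize::Known(n) => n,
            OutputSize::Unknown => 0,
        }
    }

    /// Exact count, for consumers that need an accurate value to produce valid output.
    pub fn exact(self) -> Option<usize> {
        match self {
            OutputSize::Known(n) => Some(n),
            OutputSize::Unknown => None,
        }
    }
}

/// Hypothetical trait standing in for the processor interface.
pub trait Processor {
    fn size(&self) -> OutputSize;
}

/// A processor that knows its output size up front.
pub struct FixedProcessor {
    pub len: usize,
}

/// A processor that cannot know its size before reading the input.
pub struct CsvProcessor;

impl Processor for FixedProcessor {
    fn size(&self) -> OutputSize {
        OutputSize::Known(self.len)
    }
}

impl Processor for CsvProcessor {
    fn size(&self) -> OutputSize {
        OutputSize::Unknown
    }
}
```

The ergonomic cost shows up on both sides: every processor that does know its size has to wrap it in `Known`, and every consumer has to decide what to do with `Unknown` instead of just reading a number.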