The following little DaphneDSL script reads a frame from a CSV file:
example.daphne:
data.csv:
data.csv.meta:
The script can be executed by:
bin/daphne example.daphne
Expected output:
Actual output:
Possible reason:
The problem seems to be related to the return value pos of the function setCString() in src/runtime/local/io/utils.h. While the return value is used as the position of the next column in ReadCsvFile<Frame>::apply(), it's actually a position relative to the current column. The existing test cases don't seem to trigger this case, because they have the string column either as the first column (where relative and absolute positions are the same) or as the last column (where there is no next column anyway) in a CSV file. The case is complicated by the fact that the position must start from 0 again if the string cell actually spans multiple lines.

This bug is currently preventing us from reading the lineorder table of the Star Schema Benchmark.
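To illustrate the suspected mix-up of relative and absolute positions, here is a minimal, self-contained sketch. It is not the DAPHNE code: parseCellRelative() and splitLine() are hypothetical stand-ins, every cell is treated as an unquoted string, and multi-line cells are not handled. It only shows how using a column-relative offset as an absolute line position goes wrong once a string cell sits in the middle of a row.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a cell parser (NOT DAPHNE's setCString()):
// reads one unquoted cell starting at `cell` and returns the number of
// characters consumed, including the trailing delimiter if there is one.
// The returned value is RELATIVE to `cell`, not to the start of the line.
static size_t parseCellRelative(const char *cell, char delim, std::string &out) {
    size_t n = 0;
    while (cell[n] != '\0' && cell[n] != delim)
        n++;
    out.assign(cell, n);
    return cell[n] == delim ? n + 1 : n;
}

// Hypothetical row reader: extracts `numCols` cells from `line`.
// With treatAsAbsolute == true, the relative return value is used directly
// as the next position (the suspected bug). With false, it is added to the
// current position (one possible fix for single-line cells).
static std::vector<std::string> splitLine(const std::string &line, size_t numCols,
                                          bool treatAsAbsolute, char delim = ',') {
    std::vector<std::string> cells;
    size_t pos = 0;
    for (size_t col = 0; col < numCols; col++) {
        assert(pos <= line.size());
        std::string cell;
        size_t rel = parseCellRelative(line.c_str() + pos, delim, cell);
        cells.push_back(cell);
        pos = treatAsAbsolute ? rel : pos + rel;
    }
    return cells;
}
```

In this sketch, splitLine("1,abc,2", 3, false) yields the cells 1, abc, 2, while splitLine("1,abc,2", 3, true) yields 1, abc, c, because the third cell is read from offset 4 instead of offset 6. For the rows "abc,2" and "1,abc", both modes return the same cells, which matches the observation that a string in the first or last column does not expose the problem.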