Closed by thareh 1 year ago
Interesting, what results do you get?
Did you remember to set the options delimiter to ; ? The default is a comma.
Otherwise, the answer is always 1 column.
The result I get is only 1 column, and yes, I did set the delimiter correctly. If I add any kind of text to the second column, it works as expected.
Thanks!
Ah, I see. Does the file end at the end of the second line (i.e. there's no trailing newline)? E.g.
foo;bar
1;<EOF>
as opposed to
foo;bar
1;
<EOF>
That does appear to give a different result for me too.
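For anyone who wants to reproduce the two cases above, here is a minimal sketch. It only uses the Text.CSV calls that appear in the program posted in this thread (TCsvParser.Parse, NextRow, GetRow, ColumnCount). The helper names WriteSample and PrintColumnCounts and the two file names are made up for illustration, and stream cleanup after parsing is left out because I'm not sure who owns the stream.

```blitzmax
SuperStrict

Framework BRL.Blitz
Import BRL.FileSystem
Import BRL.StandardIO
Import Text.CSV

' Hypothetical helper: write the given text to a file exactly as given.
Function WriteSample(path:String, text:String)
    Local out:TStream = WriteFile(path)
    If Not out Then RuntimeError("Cannot write " + path)
    out.WriteString(text)
    out.Close()
End Function

' Hypothetical helper: parse a file with ";" as delimiter and
' print the column count reported for every data row.
Function PrintColumnCounts(path:String)
    Local opts:TCsvOptions = New TCsvOptions()
    opts.delimiter = ";"

    Local file:TStream = ReadFile(path)
    If Not file Then RuntimeError("Cannot read " + path)

    Local csv:TCsvParser = TCsvParser.Parse(file, opts)
    While csv.NextRow() = ECsvStatus.row
        Local row:TCsvRow = csv.GetRow()
        If row Then Print path + ": " + row.ColumnCount() + " column(s)"
    Wend
End Function

' Case 1: the file ends directly after the delimiter (no trailing newline).
WriteSample("no_newline.csv", "foo;bar~n1;")
' Case 2: the last row is terminated by a newline.
WriteSample("with_newline.csv", "foo;bar~n1;~n")

PrintColumnCounts("no_newline.csv")
PrintColumnCounts("with_newline.csv")
```

If the two files print different counts, that confirms the trailing-newline difference described above.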
Ah yes it does, but the data provided is for demonstration purposes. In the real world data where I stumbled upon this issue the row ends with a line ending and not EOF.
I can dig a bit deeper tomorrow and get back with something more.
Thanks!
I can make it always return "at least" header.ColumnCount(), if that will help.
Anyway, you can always ask for the value for column "bar". It will simply return Null if there isn't a column for that row.
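As a rough illustration of "ask for the column by name and get Null if the row is too short", here is a sketch built only on the index-based calls used in the program posted in this thread (GetHeader, ColumnCount, GetColumn, GetValue). ValueByHeader is a hypothetical helper, not part of Text.CSV, and it assumes header.GetHeader(i) yields the header name as a String.

```blitzmax
SuperStrict

Framework BRL.Blitz
Import Text.CSV

' Hypothetical helper: return the value of the column whose header is `name`,
' or Null when the header is unknown or the row has no column at that index.
Function ValueByHeader:String(row:TCsvRow, name:String)
    Local header:TCsvHeader = row.GetHeader()
    If Not header Then Return Null

    For Local i:Int = 0 Until header.ColumnCount()
        ' Assumption: GetHeader(i) returns the header name as a String.
        If header.GetHeader(i) = name
            ' The row may be shorter than the header: treat that as "no value".
            If i >= row.ColumnCount() Then Return Null
            Return row.GetColumn(i).GetValue()
        End If
    Next

    Return Null
End Function
```

With the sample data, ValueByHeader(row, "bar") would return Null for a row that only contains "1". Note that in BlitzMax a Null String compares equal to an empty String, so this helper alone cannot tell a missing column from an empty one.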
Each TCsvCol has Method ColumnCount:Int(), which should equal the number of columns actually filled in for that row. Missing columns should be seen as NaN/null.
I would not expect the CSV module to "fill in" information which the original file did not contain. (What I mean is that ColumnCount() should not return a higher count than was actually found in the CSV data.)
Maybe @thareh should provide a complete example and what he expects bmx/the module to spit out.
The thing is, I'm comparing the header column count to the row column count to check if the file is valid. But I can't seem to reproduce the issue now - perhaps I was mistaken so I do beg your pardon.
However, I've stumbled upon another issue:
Using the real-world data from data.7z, the reader seems to parse the last row of the file "full.csv" incorrectly. I copied the header and the same line to another file, "stripped.csv", and then it works fine. So I'm guessing it's an issue with ZSV, in that it doesn't reset properly between rows or similar?
Framework BRL.Blitz
Import BRL.FileSystem
Import BRL.StandardIO
Import BRL.StringBuilder
Import Text.CSV

' Dump every column of a row as "header: 'value'" for debugging.
Function Debug:String(row:TCsvRow)
    Local sb:TStringBuilder = New TStringBuilder()
    Local header:TCsvHeader = row.GetHeader()
    For Local i:Int = 0 Until row.ColumnCount()
        Local col:SCsvColumn = row.GetColumn(i)
        sb.AppendLine(header.GetHeader(i) + ": '" + col.GetValue() + "'")
    Next
    Return sb.ToString()
EndFunction

Local opts:TCsvOptions = New TCsvOptions()
opts.delimiter = ";"

Local file:TStream = ReadFile("full.csv")
Local csv:TCsvParser = TCsvParser.Parse(file, opts)

Repeat
    ' Stop as soon as the parser reports anything other than a row.
    Local status:ECsvStatus = csv.NextRow()
    If status <> ECsvStatus.row
        Exit
    EndIf

    Local row:TCsvRow = csv.GetRow()
    If Not row
        Continue
    EndIf

    ' Validity check: every row must have as many columns as the header.
    Local header:TCsvHeader = row.GetHeader()
    If header And header.ColumnCount() <> row.ColumnCount()
        Local sb:TStringBuilder = New TStringBuilder()
        sb.AppendLine("ERROR: Header column count mismatch")
        sb.AppendLine(header.ColumnCount() + " <> " + row.ColumnCount())
        sb.AppendNewLine()
        For Local i:Int = 0 Until row.ColumnCount()
            sb.AppendLine(row.GetColumn(i).GetValue())
        Next
        Print sb.ToString()
        End
    EndIf

    ' Print Debug(row)
Forever
Thanks!
Change the line endings of full.csv to "CR/LF" (Windows-style) and you won't see the message. Keep the current "LF" (Unix-style) endings and the message is there.
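If you'd rather not depend on an editor for the conversion, a byte-wise copy along these lines should do it. This is only a sketch using plain BRL stream calls. The output name full_crlf.csv is made up, and it assumes the input uses LF or CR/LF endings only.

```blitzmax
SuperStrict

Framework BRL.Blitz
Import BRL.FileSystem

Local src:TStream = ReadFile("full.csv")
If Not src Then RuntimeError("Cannot read full.csv")

' Hypothetical output file name.
Local dst:TStream = WriteFile("full_crlf.csv")
If Not dst Then RuntimeError("Cannot write full_crlf.csv")

' Copy byte by byte and put a CR (13) in front of every LF (10)
' that is not already preceded by one. All other bytes, including
' multi-byte UTF-8 sequences, are passed through untouched.
Local prev:Int = 0
While Not src.Eof()
    Local b:Int = src.ReadByte()
    If b = 10 And prev <> 13 Then dst.WriteByte(13)
    dst.WriteByte(b)
    prev = b
Wend

src.Close()
dst.Close()
```

Working on raw bytes keeps the text encoding out of the picture, so any change in behaviour between the two files can only come from the line endings.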
Playing a bit with "full.csv" (removing single characters from entries until it suddenly "works") shows that no specific character is tripping up the handling.
This sounds to me as if the "line length" is somehow an issue.
Changing the CSV-File from "UTF-8" to "UTF-7" makes it run when it normally would spit out the warning. So maybe it stumbles over the utf8-encoding and assumes wrong string lengths?
Tried to enforce UTF-8 reading with Local file:TStream = ReadFile("utf8::stripped.csv"), but this errors out with "Malformed line terminator".
Thank you for your thorough investigation @GWRon!
I tried changing the line-endings to CR/LF and it did get past the previous error, but later in the file the same thing happens. (I truncated the full.csv file you have for convenience)
How did you go about converting to UTF-7? I'd like to try it out myself.
Thanks!
I simply changed it after opening the file in Geany (the text editor I use on my Linux Mint Xfce).
Yet I think this might all be a red herring: these changes just happen to make a faulty piece of code fail or not fail, without properly indicating which piece of code it is.
Hey,
Using the following CSV-data:
In my view, this should produce a row with 2 columns, foo=1 and bar=NULL, but that does not seem to be the case. Is this an issue with ZSV?
Thanks!