Parse CSV files! Parse TSV files! Parse PSV files?!
Yes!! Parse it all! All the DSV files!
A simple utility for reliably parsing delimiter-separated values (DSV) in AutoHotkey v2 scripts, whether comma-separated (CSV), tab-separated (TSV), or something else, possibly even something more exotic.
For AutoHotkey v1, check out https://github.com/jasonsparc/DSVParser-AHK
`"hello" world "foo bar"` will be parsed as `hello world "foo bar"`.
Line breaks: CR, LF, CR+LF, LF+CR, VT, FF, NEL, RS, GS, FS, LS, PS
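For readers unfamiliar with the abbreviations, here is a small reference sketch. It assumes the names denote the usual ASCII/Unicode control characters; the `lineBreaks` map is a made-up name for illustration and is not part of the library.

```autohotkey
#Requires AutoHotkey v2.0

; Standard code points for the line-break names listed above.
; This map is only a reference aid; it is not part of the library's API.
lineBreaks := Map(
    "CR",    Chr(0x0D),
    "LF",    Chr(0x0A),
    "CR+LF", Chr(0x0D) Chr(0x0A),
    "LF+CR", Chr(0x0A) Chr(0x0D),
    "VT",    Chr(0x0B),
    "FF",    Chr(0x0C),
    "NEL",   Chr(0x85),
    "RS",    Chr(0x1E),
    "GS",    Chr(0x1D),
    "FS",    Chr(0x1C),
    "LS",    Chr(0x2028),
    "PS",    Chr(0x2029)
)

; Show the code point of one entry as a decimal number
MsgBox Ord(lineBreaks["NEL"])
```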
Download dsvparser-ahk2.ahk[^1], then include it in your script (via `#Include`) as a library.
[^1]: Tip: Right-click the dsvparser-ahk2.ahk link, then choose "Save link as…" or whatever equivalent your browser provides.
Once you've done that, here's how you might use the library:
; Load a TSV data string
tsvStr := FileRead("data.tsv")
; Parse the TSV data string
MyTable := TSVParser.ToArray(tsvStr)
; Do something with `MyTable`
MsgBox MyTable[2][1] ; Access 1st cell of 2nd row
; ... do something else with `MyTable` ...
; Convert into a CSV, with custom line break settings
csvStr := CSVParser.FromArray(MyTable, "`n", false)
if (FileExist("new-data.csv"))
    FileDelete("new-data.csv")
FileAppend(csvStr, "new-data.csv")
Both `TSVParser` and `CSVParser` are premade instances of the class `DSVParser`.
To read and write in other formats, create a new instance of `DSVParser` and specify your desired configuration.
Here's a `DSVParser` for pipe-separated values (a.k.a. bar-separated):
global BSVParser := DSVParser("|")
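A custom instance can presumably be used the same way as the premade ones. The following is a minimal sketch, assuming that `dsvparser-ahk2.ahk` sits next to the script and that `ToArray` and `FromArray` accept the same arguments as in the earlier example; the sample string and the `psvStr`, `rows`, and `tsvStr` names are made up for illustration.

```autohotkey
#Requires AutoHotkey v2.0
#Include dsvparser-ahk2.ahk ; assumption: the library file is in the script's folder

global BSVParser := DSVParser("|")

; Hypothetical sample data: two rows of two pipe-separated cells
psvStr := "name|qty`nwidget|3"

; Parse it into an array of rows, as with TSVParser.ToArray above
rows := BSVParser.ToArray(psvStr)
MsgBox rows[2][1] ; Access 1st cell of 2nd row

; Convert the same rows into TSV, mirroring the FromArray call shown earlier
tsvStr := TSVParser.FromArray(rows, "`n", false)
MsgBox tsvStr
```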
Many more utility functions are provided for parsing and formatting DSV strings, including parsing just a single DSV cell.
Check out the source code! It's really just a tiny file.
Why not `Loop parse`?

AutoHotkey v2 comes with `Loop parse _, "CSV"`, which lets you quickly parse a "single line" of CSV. However, if your string contains several lines of text, it is still treated as a single line of CSV. To work around this, you could first break the string up into lines (using either `Loop read` or ``Loop parse _, "`n", "`r"``), then parse each line separately. However, that ignores the fact that a CSV cell is allowed to contain multiple lines. Yes! All in a single CSV cell! If your CSV data is that complex, `Loop parse` won't be able to handle it.
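The line-splitting problem can be reproduced with AutoHotkey v2 built-ins alone. This is a minimal sketch; the sample string and the `csvStr`, `lineCount`, and `report` names are made up for illustration.

```autohotkey
#Requires AutoHotkey v2.0

; Two CSV rows; the second cell of the first row contains a line break
csvStr := '1,"first line`nsecond line"`n2,plain'

; Naive approach: split into physical lines first, then parse each line as CSV
lineCount := 0
report := ""
Loop parse csvStr, "`n", "`r"
{
    lineCount += 1
    ; Each physical line is parsed on its own, so the quoted cell is cut in two
    Loop parse A_LoopField, "CSV"
        report .= "line " lineCount ", field " A_Index ": " A_LoopField "`n"
}

; lineCount is 3 here, although the data holds only 2 rows
MsgBox "Physical lines seen: " lineCount "`n`n" report
```

Because the split happens before any quote handling, the quoted cell ends up spread over two separately parsed lines.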
For the original motivation behind this library, see the forum post “[Library] DSV Parser - AutoHotkey Community”.
P.S. This library can even be used to parse a CSV inside a CSV, inside a CSV, inside a CSV, inside a…—whatever “RFC 4180” allows.