Open MichaelCurrin opened 7 years ago
New line characters can be replaced in SQL
replace(message, X'0A', '--')
Consider as well \n vs \r and replacing individually (two replaces). If I know the hex code for the other character.
Alternate syntax:
you can now write CHAR(13) for \r or CHAR(10) for \n, which will work whether your database is encoded in UTF-8 or UTF-16.
sqlite> select CHAR(10) || 'a';
a
===
On column types.
There is CAST(column AS TEXT)
syntax in SQLite3.
Or use column || ''
, but it still prints a number straight to CSV. Adding letters doesn't add quoting either - only a space or maybe some special characters. If I try add in single or double quotes, those are kept and characters and quoting is added around the whole string to keep them.
Therefore writing to CSV with python and setting quote level to all is a solution.
Otherwise, write out ID numeric and ID string column (with "ID" prefix in values) so one can use the text for working in Excel. The IDs are used for creating a URL (this can be done in SQLObject or query) and GUID with consistent prefix is fine for a network graph, so actually the "ID" prefix is fine.
Currently when doing a SQL query and redirecting to CSV file with -csv option, numeric IDs are not quoted.
Also, although multi-line strings are quoted and handled by CSV importer program, Google Data Studio does not like the line breaks. These were replaced in Sublime using '\n(?!ID)' -> '' to ignore actual start of new lines but this is a hack and only works if the first column value starts with ID.
Queries might be handled better in SQLObject for replacing values, quoting, @ symbol and anything else already built in (tweet URL or large image) which are not so easy in SQL or exported CSV.
It's is only a join of two tables so is simple enough to do in SQLObject. But then printing to sdtout with CSV formatting (csv module writing to stdout instead of filename?) or writing a CSV to given file location might be needed.