mechatroner / vscode_rainbow_csv

🌈Rainbow CSV - VS Code extension: Highlight CSV and TSV files in different rainbow colors to make them more readable
MIT License
426 stars 51 forks

Ignoring whitespaces inside double quotes #130

Closed arnisg closed 4 months ago

arnisg commented 1 year ago

I have CSV files where whitespace is used as the separator, but some cells are double-quoted and contain whitespace. Commas and semicolons, when used as separators, are ignored inside double quotes. What about whitespace?

mechatroner commented 1 year ago

This is currently not supported, but technically it should be possible to implement. It is also related to #131. I am also really curious: what is the name of the software or process that generates such files?

arnisg commented 1 year ago

It's Blue Coat ProxySG:

Sample log formats

Blue Coat ProxySG supports logs in either of the following two formats:
● Access logs (Default)
● Extended Log File Format (Custom)

The Audit application supports the “Extended Log File Format” (ELFF) for the Blue Coat ProxySG. The delimiter for the log fields is a blank space (\s) and the fields are sometimes wrapped in double quotes as shown in the log sample below.

Software: SGOS 5.2.6.1
Version: 1.0
Start-Date: 2014-04-16 00:41:36
Date: 2013-05-24 17:24:46
Fields: date time time-taken c-ip sc-status s-action sc-bytes cs-bytes cs-method cs-uri-scheme cs-host cs-uri-port cs-uri-path cs-uri-query cs-username cs-auth-group s-hierarchy s-supplier-name rs(Content-Type) cs(User-Agent) sc-filter-result cs-category x-virus-id s-ip s-sitename r-ip
Remark: 0606020157 "DFWDLPBCSG01 - 172.16.111.196 - Blue Coat SG400" "155.17.111.196" "main"
2014-04-21 06:42:28 164 155.17.4.168 200 TCP_TUNNELED 498 650 CONNECT tcp os-bo-app05-03.boldchat.com 443 / - - - DIRECT os-bo-app05-03.boldchat.com - - OBSERVED "Technology/Internet" - 155.17.111.196 SG-HTTP-Service 63.251.34.61
2014-04-21 06:42:28 637 155.17.122.61 200 TCP_TUNNELED 7140 1552 CONNECT tcp us.adserver.yahoo.com 443 / - - - DIRECT us.adserver.yahoo.com - "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36" OBSERVED "Web Ads/Analytics" - 155.17.111.196 SG-HTTP-Service 98.137.170.33
2014-04-21 06:42:28 565 155.17.122.61 200 TCP_TUNNELED 5303 2201 CONNECT tcp csc.beap.bc.yahoo.com 443 / - - - DIRECT csc.beap.bc.yahoo.com - "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36" OBSERVED "Web Ads/Analytics" - 155.17.111.196 SG-HTTP-Service 98.138.47.199
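For anyone who needs to read such files outside the editor in the meantime, the quoting rule described in the sample (space-delimited fields, some wrapped in double quotes) can be sketched with Python's standard `csv` module. This is only an illustration and says nothing about how Rainbow CSV handles it internally; the helper name `split_elff_line` is hypothetical, and the record is modelled on the second entry of the sample above.

```python
import csv

def split_elff_line(line):
    """Split one space-delimited record, keeping double-quoted fields whole."""
    # csv.reader accepts any iterable of strings; a one-element list yields one row.
    reader = csv.reader([line], delimiter=" ", quotechar='"')
    return next(reader)

# Record modelled on the sample log above (assumed to be a single physical line).
record = (
    '2014-04-21 06:42:28 637 155.17.122.61 200 TCP_TUNNELED 7140 1552 CONNECT tcp '
    'us.adserver.yahoo.com 443 / - - - DIRECT us.adserver.yahoo.com - '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/34.0.1847.116 Safari/537.36" OBSERVED "Web Ads/Analytics" - '
    '155.17.111.196 SG-HTTP-Service 98.137.170.33'
)

fields = split_elff_line(record)
print(len(fields))   # number of columns found in the record
print(fields[19])    # the cs(User-Agent) column, spaces preserved
```

With this rule the record splits into the same number of columns as the Fields header lists, and the quoted user agent and category values stay in one cell each. Note that runs of consecutive spaces would produce empty fields under this sketch.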

samschurter commented 11 months ago

I regularly run into issues with CSV files that have line breaks inside fields. One example is Jira issues. To import them, I might have a file like this:

Summary,Reporter,Issue Type,Description,Priority
"Issue title",username,bug,"- description line 1
- description line 2
- description line 3",Medium

Jira accepts this as valid, and it seems most password managers, such as LastPass and Keeper, use and accept a similar format for import/export.

It would be great if the linter and highlighter could recognize that as valid, even if options like Align wouldn't be able to handle it.
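As a quick check outside the editor, the sample file above does parse as one header row plus one record of five columns under the usual double-quote rules. A minimal sketch with Python's standard `csv` module (chosen here only for illustration, and unrelated to how Jira or Rainbow CSV parse the file):

```python
import csv
import io

# The sample from this comment, with the multi-line Description field intact.
sample = (
    'Summary,Reporter,Issue Type,Description,Priority\n'
    '"Issue title",username,bug,"- description line 1\n'
    '- description line 2\n'
    '- description line 3",Medium\n'
)

# newline="" leaves line endings untouched so the csv module can handle
# line breaks inside quoted fields itself.
rows = list(csv.reader(io.StringIO(sample, newline="")))

print(len(rows))       # header row plus one record
print(len(rows[1]))    # columns in the record
print(rows[1][3])      # the Description cell, including its line breaks
```

The three physical lines of the description end up in a single cell, which is the behavior a linter would need in order to treat the file as valid.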

mechatroner commented 11 months ago

@samschurter, this particular issue is about whitespace as a separator, not commas, but "Rainbow CSV" can already handle your file. You just need to use the "Dynamic CSV" filetype instead of "CSV" (which doesn't handle newlines, see the README file). For bigger files the extension would autodetect it, but since your example has only 2 lines you need to switch to "Dynamic CSV" manually in the bottom right corner. Align, highlight, and even lint features will work correctly.

mechatroner commented 4 months ago

Done. Starting from version 3.12, you can use the "Set rainbow separator" command to select the whitespace separator with Excel policy in the new UI.