ribbondz / rsv

A command-line tool written in Rust for analyzing CSV, TXT, and Excel files.
GNU General Public License v3.0
151 stars 9 forks

Excel2csv should escape commas and quote characters #10

Open andrejohansson opened 1 month ago

andrejohansson commented 1 month ago

First: Thank you for this tool!

Similar to #9: if you convert an Excel document and a cell contains commas or the quote character, those characters should be escaped according to RFC 4180.

See https://stackoverflow.com/questions/17808511/how-to-properly-escape-a-double-quote-in-csv

Right now, if I convert an Excel file, the result is an invalid CSV whenever a cell contains quotes or commas.

ribbondz commented 1 month ago

Thank you for the report.

Right now, I think within-field commas are already handled properly. For example, given the following Excel table:

image

A correct CSV file can be generated with either command: rsv excel2csv tele.xlsx or rsv slice tele.xlsx | rsv to x.csv

image

As for within-field quotes, I will try to fix them in the coming days.

andrejohansson commented 1 month ago

For my case, I made a small change here: https://github.com/ribbondz/rsv/blob/master/src/utils/writer.rs#L164

Before

Data::String(v) => {
    if v.contains(',') {
        write!(&mut self.0, "\"{}\"", v)?
    } else {
        write!(&mut self.0, "{}", v)?
    }
}

After

fn double_quotes(input: &str) -> String {
    input.replace("\"", "\"\"")
}
Data::String(v) => {
    if v.contains(',') {
        write!(&mut self.0, "\"{}\"", double_quotes(v))?
    } else {
        write!(&mut self.0, "{}", double_quotes(v))?
    }
}

This change simply doubles every double-quote character. That is correct when the double quote is the quote character, as in my case, but a general fix should use the configured quote character and delimiter to cover more cases.
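A minimal sketch of what a more general version might look like, assuming the writer knows its delimiter and quote character. The helper name escape_field and its parameters are hypothetical and not part of rsv. Following RFC 4180, it wraps a field in quotes when it contains the delimiter, the quote character, or a line break, and doubles any quote characters inside it. Note that, unlike the change above, a field that contains quotes but no comma is also wrapped in quotes.

```rust
/// Hypothetical helper (not part of rsv): escape one CSV field in the
/// style of RFC 4180, parameterized on the delimiter and quote character.
fn escape_field(field: &str, delimiter: char, quote: char) -> String {
    // A field needs quoting if it contains the delimiter, the quote
    // character, or a line break.
    let needs_quoting = field
        .chars()
        .any(|c| c == delimiter || c == quote || c == '\n' || c == '\r');

    if !needs_quoting {
        return field.to_string();
    }

    let mut out = String::with_capacity(field.len() + 2);
    out.push(quote);
    for c in field.chars() {
        if c == quote {
            // Escape a quote character by doubling it.
            out.push(quote);
        }
        out.push(c);
    }
    out.push(quote);
    out
}
```

With this sketch, escape_field("a,b", ',', '"') yields "a,b" wrapped in double quotes, and a plain field such as tele is returned unchanged.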