osiegmar / FastCSV

CSV library for Java that is fast, RFC-compliant and dependency-free.
https://fastcsv.org/
MIT License

Add QuoteStrategy parameter in CsvReader to handle empty strings vs null values #58

Closed OlivierLevrey closed 2 years ago

OlivierLevrey commented 2 years ago

QuoteStrategy.EMPTY is convenient when I want to differentiate empty strings from null values in the output file.

However, CsvReader has no corresponding parameter, which means I cannot read back the original data.

Below is a unit test showing this:

    import java.io.File;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    import org.junit.Assert;
    import org.junit.Test;

    import de.siegmar.fastcsv.reader.CsvReader;
    import de.siegmar.fastcsv.reader.CsvRow;
    import de.siegmar.fastcsv.writer.CsvWriter;
    import de.siegmar.fastcsv.writer.QuoteStrategy;

    // Class name is a placeholder; the original post showed only the method.
    public class CsvRoundTripTest {

        /**
         * Writes a single row of special values, reads back the file, and tests
         * that read values exactly match the original values.
         */
        @Test
        public void test() throws IOException {
            String[] values = new String[]{
                "Simple text",
                "Multiline\ntext",
                // a string containing a comma
                "1,2",
                // a string with double quotes
                "\"Hello\"",
                // a string containing a single character: a double quote
                "\"",
                // an empty string
                "",
                // a null value
                null
            };

            File tmp = new File("C:/tmp/csv.txt");

            // write the csv file; QuoteStrategy.EMPTY quotes empty strings but not nulls
            try (CsvWriter csv = CsvWriter.builder()
                .quoteStrategy(QuoteStrategy.EMPTY)
                .build(tmp.toPath(), StandardCharsets.UTF_8)) {

                csv.writeRow(values);
            }

            // read back the file (it holds a single row, so the last row read is the only one)
            String[] readValues = null;
            try (CsvReader csv = CsvReader.builder()
                .skipEmptyRows(true)
                .build(tmp.toPath(), StandardCharsets.UTF_8)) {

                for (CsvRow row : csv) {
                    readValues = new String[row.getFieldCount()];
                    for (int i = 0; i < readValues.length; i++) {
                        readValues[i] = row.getField(i);
                    }
                }
            }

            Assert.assertNotNull(readValues);
            // this fails because the null value is read back as an empty string
            Assert.assertArrayEquals(values, readValues);
        }
    }

It would be very nice to have the QuoteStrategy parameter in the reader.

osiegmar commented 2 years ago

Thanks for your feedback!

The difference between reading and writing is:

When creating the API of FastCSV, I decided to design a null-free API (see features). That way I can ensure that no one gets a NullPointerException when working with this library.

Because of this design decision, I don't plan to offer a mechanism to read or return null values, not even with an optional strategy. If you really want nulls in your code, I'm afraid you will have to add or convert them in your application code. I hope you can understand.
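
For readers who need the round trip, here is a minimal sketch of one way to do that conversion in application code. It assumes nulls can be represented by a sentinel string that never occurs in the real data. The `NullMarker` class, its methods, and the `\N` marker are hypothetical names chosen for illustration; none of them is part of FastCSV.

```java
import java.util.Arrays;

/** Hypothetical application-side helper; not part of the FastCSV API. */
public final class NullMarker {

    /** Assumed sentinel standing in for null; it must never occur as a real field value. */
    public static final String NULL_MARKER = "\\N";

    private NullMarker() {
    }

    /** Replaces each null with the sentinel before the values are handed to the writer. */
    public static String[] encode(String[] values) {
        return Arrays.stream(values)
            .map(v -> v == null ? NULL_MARKER : v)
            .toArray(String[]::new);
    }

    /** Maps the sentinel back to null after a field has been read; other values pass through. */
    public static String decode(String field) {
        return NULL_MARKER.equals(field) ? null : field;
    }
}
```

In the unit test above, this would mean calling `csv.writeRow(NullMarker.encode(values))` when writing and `readValues[i] = NullMarker.decode(row.getField(i))` when reading. The distinction between null and empty then no longer depends on quoting, at the cost that a genuine field equal to the marker would also be read back as null.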

OlivierLevrey commented 2 years ago

OK thank you for your quick reply.