FasterXML / jackson-dataformats-text

Uber-project for (some) standard Jackson textual format backends: csv, properties, yaml (xml to be added in future)
Apache License 2.0

Incorrect location of CSV errors #483

Closed. RafeArnold closed this issue 3 weeks ago

RafeArnold commented 1 month ago

Currently, when a JsonProcessingException is thrown while parsing CSV that has a missing closing quote, the column number reported in the error location is incorrect. For example,

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;

import java.io.IOException;
import java.util.List;

public class Test {
    public static void main(String[] args) throws IOException {
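        // The second CSV line opens a quoted value but never closes it,
        // so reading it fails with a JsonProcessingException.
        try (MappingIterator<List<String>> reader = new CsvMapper()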
                .readerForListOf(String.class)
                .with(CsvParser.Feature.WRAP_AS_ARRAY)
                .readValues("name,dob\n\"an invalid string,2020-05-01")) {
            reader.readAll();
        } catch (JsonProcessingException e) {
            System.out.println("line number: " + e.getLocation().getLineNr());
            System.out.println("column number: " + e.getLocation().getColumnNr());
        }
    }
}

will output

line number: 2
column number: 68

The line number is correct, but the column number is not (there aren't even 68 characters in the entire CSV). I would expect the column number to be closer to 30.
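The expected value of roughly 30 can be checked by hand. The following is a minimal sketch that does not use Jackson at all: it counts characters in the same input string and computes the 1-based line and column of the position just past the last character. The class name `EndOfContentLocation` and the method `locate` are hypothetical, introduced only for this illustration.

```java
// Hypothetical helper (not part of Jackson): computes the 1-based line and
// column of the position just past the last character of the input.
final class EndOfContentLocation {

    /** Returns {line, column}, both 1-based, for the end of the given text. */
    static int[] locate(String text) {
        int line = 1;
        int lineStart = 0; // offset of the first character of the current line
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                line++;
                lineStart = i + 1;
            }
        }
        // Column of the position immediately after the last character.
        int column = text.length() - lineStart + 1;
        return new int[] {line, column};
    }

    public static void main(String[] args) {
        String csv = "name,dob\n\"an invalid string,2020-05-01";
        int[] loc = locate(csv);
        System.out.println("total characters: " + csv.length()); // 38
        System.out.println("line number: " + loc[0]);            // 2
        System.out.println("column number: " + loc[1]);          // 30
    }
}
```

The whole input is 38 characters long and its second line holds 29 of them, so a location at the end of the content falls at line 2, column 30.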

cowtowncoder commented 1 month ago

First of all, thank you for reporting this issue.

I think it should be possible to improve things: a totally out-of-range column may be easy to fix. There are some challenges with respect to fully accurate location due to the way tokenization is done (it's not quite as incremental as, say, JSON decoding), but it should at least be possible to improve accuracy.

cowtowncoder commented 3 weeks ago

Looks like this was easy enough to resolve; only affects location at the end of content.

The fix goes into the 2.18 branch for inclusion in 2.18.0.
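To illustrate what a location "at the end of content" means for an unclosed quote, here is a minimal sketch of a quote-aware scanner that tracks line and column one character at a time. It is not Jackson's implementation and says nothing about how the actual fix works; the class `UnclosedQuoteScanner` and the method `findUnclosedQuote` are hypothetical names, and the sketch assumes `"` as the quote character, `""` as an escaped quote, and `\n` as the only line break.

```java
// Hypothetical sketch (not Jackson's code): finds an unclosed quoted value
// and reports where it was opened and where the content ended.
final class UnclosedQuoteScanner {

    /**
     * Returns null if every quote is closed; otherwise returns
     * {openLine, openColumn, endLine, endColumn}, all 1-based.
     */
    static int[] findUnclosedQuote(String text) {
        int line = 1;
        int column = 1;
        boolean inQuotes = false;
        int openLine = 0;
        int openColumn = 0;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                if (!inQuotes) {
                    // Opening quote: remember where the quoted value starts.
                    inQuotes = true;
                    openLine = line;
                    openColumn = column;
                } else if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    // Escaped quote (""): skip the second character as well.
                    i++;
                    column++;
                } else {
                    inQuotes = false; // closing quote
                }
            }
            if (c == '\n') {
                line++;
                column = 1;
            } else {
                column++;
            }
        }
        return inQuotes ? new int[] {openLine, openColumn, line, column} : null;
    }

    public static void main(String[] args) {
        int[] r = findUnclosedQuote("name,dob\n\"an invalid string,2020-05-01");
        System.out.println("quote opened at " + r[0] + ":" + r[1]);  // 2:1
        System.out.println("content ended at " + r[2] + ":" + r[3]); // 2:30
    }
}
```

For the input from the report, the quote opens at line 2, column 1, and the content ends at line 2, column 30, which matches the column the reporter expected.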

RafeArnold commented 3 weeks ago

thanks @cowtowncoder !