FasterXML / jackson-dataformats-text

Uber-project for (some) standard Jackson textual format backends: csv, properties, yaml (xml to be added in future)
Apache License 2.0
405 stars 149 forks source link

Support @JsonAnyGetter with CSV #243

Open MrBuddyCasino opened 3 years ago

MrBuddyCasino commented 3 years ago

It seems the @JsonAnyGetter annotation is not supported for the CSV format.

I made a runnable test case: https://github.com/MrBuddyCasino/jackson-csv-bug

It fails with com.fasterxml.jackson.dataformat.csv.CsvMappingException: Unrecognized column 'xxx': known columns: ["num"] (through reference chain: org.example.CsvTest$Entity["[anySetter]"])

cowtowncoder commented 3 years ago

Couple of quick questions:

  1. Jackson version? (I assume including latest, 2.12.1)
  2. Are columns declared with CsvSchema? (must be for columnar support)
cowtowncoder commented 3 years ago

Looking at the test case, schema is generated for the POJO. But unfortunately this cannot really generate anything for @JsonAnyGetter since that is based on type, not instance.

So basically it is not clear who such values could be supported: they cannot work without some mechanism for declaring columns that should exist -- currently the way to do that would be to manually construct CsvSchema (or modify one CSVMapper.schemaFor(...) builds). I suppose that having an annotation-based mechanism might work, but would require change/addition of an annotation for listing extra properties (that would map to columns).

MrBuddyCasino commented 3 years ago

Thanks for the reply.

If I understand the issue correctly, the problem is that the CsvSchema has to know all the possible columns in advance, so that it can write a complete header, right? Otherwise it could just decide to not be "strict" in the presence of a @JsonAnyGetter annotation and extend its schema as new columns are being discovered.

As a workaround I take the POJO generated CsvSchema and add the dynamic column keys from the additionalProperties map using the withColumnsFrom() method. This seems to work.

Since the @JsonAnySetter annotation is officially supported, it is not immediately obvious why @JsonAnyGetter wouldn't be, unless one thinks about it. Maybe this can be added to the documentation?

cowtowncoder commented 3 years ago

@MrBuddyCasino yes your understanding is correct: the schema (mapping between columnar position and property name) must exist for all values before they are written: although serialization could in theory append new columns at the end, it wouldn't be clear how deserialization side could determine these additions. If all output was always buffered it could be done by modifying schema, writing out headers, but Jackson operates incrementally and does not buffer entries (although there is buffering for a single row in case name/value pairs are written out of order).

There are probably ways to improve handling wrt @JsonAnyGetter and I'll keep the issue in mind in case I can think of something. If you (or anyone else) have further ideas feel free to add here as well.

And finally, yes, this is something where documentation improvements would be useful. Some annotations do not work as well (or at all) with all backend formats.