alexpartsch opened this issue 3 years ago
Interesting. Yes, that sounds incorrect: for Records, declaration order should be used even if the default is otherwise set to alphabetic (lexicographic; ordering is case-sensitive).
The root cause is probably that, while there is no default ordering in Jackson in general, CsvMapper changes the default to alphabetic ordering.
I wonder if this could be easily reproduced with a JSON ObjectMapper by just enabling MapperFeature.SORT_PROPERTIES_ALPHABETICALLY? If so, it would be easier to reproduce and probably simpler to fix.
@cowtowncoder Enabling the feature you mentioned on a JSON ObjectMapper, like this:
@Test
void testJsonOrder() throws JsonProcessingException {
    var om = new ObjectMapper();
    om.configure(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY, true);
    var json = """
            {
              "c": "C",
              "b": "B",
              "a": "A"
            }""";
    var tr = om.readValue(json, TestRecord.class);
    Assertions.assertEquals(new TestRecord("C", "B", "A"), tr);
}
with the record:
    public record TestRecord(String c, String b, String a) {
    }
does not reproduce the issue.
Ah. No, that should not matter quite the same way, since JSON does not need name/column-position mapping.
But my point is that alphabetic ordering should not actually change the expected serialization order for Records.
Even then, however, the mapping should not get confused, so that is odd.
One thing to note, however: I am not sure why you are both building a schema from the Record and using withHeader()?
Only one is used, and it really has to be the header row anyway, so there is not much point building it from the record type.
I would expect the test to still pass, so it seems there is a bug somewhere, but I thought I'd ask about usage -- maybe there is some other implied expectation?
Ok, since I did not recall the likely reasons, I checked CsvSchema, which points out that the use case of having a schema AND expecting a header is meant for a case where:
When the header line is present and the settings ask for it
to be processed, two different options are possible:
a) The schema has been populated. In this case, build a new
schema where the order matches the *actual* order in which
the given CSV file offers its columns, if _schema.reordersColumns()
is set to true; there cases the consumer of the CSV file
knows about the columns but not necessarily the order in
which they are defined.
Not sure if that helps explain anything but at least it confirms one possible reason to use both.
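If reordering to match the header is the goal, I would expect something along these lines to work (just a sketch, not verified against the reported version; TestRecord and the CSV data are the ones from this thread):
    CsvMapper csvMapper = new CsvMapper();
    // Schema from the Record, but let the header row dictate the actual column order
    CsvSchema schema = csvMapper.schemaFor(TestRecord.class)
            .withHeader()
            .withColumnReordering(true);
    TestRecord tr = csvMapper.readerFor(TestRecord.class)
            .with(schema)
            .readValue("c,b,a\nC,B,A");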
@cowtowncoder Running the code without withHeader(), like in this example:
@Test
void testRecord() throws JsonProcessingException {
    CsvMapper csvMapper = new CsvMapper();
    var schema = csvMapper.schemaFor(TestRecord.class);
    var csv = """
            c,b,a
            C,B,A""";
    TestRecord tr = csvMapper.readerFor(TestRecord.class).with(schema).readValue(csv);
    Assertions.assertEquals(new TestRecord("C", "B", "A"), tr);
}
yields:
expected: <TestRecord[c=C, b=B, a=A]> but was: <TestRecord[c=a, b=b, a=c]>
Expected :TestRecord[c=C, b=B, a=A]
Actual :TestRecord[c=a, b=b, a=c]
So the header row is treated as the first data row.
Consuming the header row (and, since it's there, using it too) makes sense. But if so, you probably don't really need to construct a schema? The schema is only needed to map column index to/from logical property name. I am not saying there is no bug to fix here, just trying to understand the use case and perhaps suggest a way to work around it for now.
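If I read the suggestion above correctly, the header-only approach would look roughly like this (a sketch, using the same TestRecord as earlier in this thread):
    CsvMapper csvMapper = new CsvMapper();
    // No schemaFor(): the header row alone maps column index to property name
    CsvSchema schema = CsvSchema.emptySchema().withHeader();
    TestRecord tr = csvMapper.readerFor(TestRecord.class)
            .with(schema)
            .readValue("c,b,a\nC,B,A");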
This problem affects the writing of Java records too. I want to define the names and order of my CSV columns in a Java record, but they come out in alphabetical order unless I use @JsonPropertyOrder.
My current workaround looks like this (use at your own risk):
public static CsvSchema schemaWithHeader(CsvMapper csvMapper, Class<?> recordClass) {
    if (!recordClass.isRecord()) {
        throw new RuntimeException("Must use a Java record class for defined ordering");
    }
    var defaultSchema = csvMapper.schemaFor(recordClass);
    var schema = CsvSchema.builder();
    for (RecordComponent component : recordClass.getRecordComponents()) {
        String columnName = getPropertyName(component);
        var columnType = defaultSchema.column(columnName).getType();
        schema.addColumn(columnName, columnType);
    }
    return schema.build().withHeader();
}

private static String getPropertyName(RecordComponent component) {
    String name = component.getName();
    if (component.isAnnotationPresent(JsonProperty.class)) {
        name = component.getAnnotation(JsonProperty.class).value();
    } else if (component.getAccessor().isAnnotationPresent(JsonProperty.class)) {
        name = component.getAccessor().getAnnotation(JsonProperty.class).value();
    }
    return name;
}
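For writing, I use the helper roughly like this (a sketch; TestRecord is the record from earlier in this thread):
    CsvMapper csvMapper = new CsvMapper();
    CsvSchema schema = schemaWithHeader(csvMapper, TestRecord.class);
    // Columns come out in declaration order (c,b,a), preceded by a header row
    String csv = csvMapper.writer(schema)
            .writeValueAsString(new TestRecord("C", "B", "A"));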
Yes, CsvMapper enforces stable (alphabetic) ordering for Java objects; this predates the addition of the Record type.
I am not quite sure how this should be handled: given that Records do have a stable ordering to use, it would seem that should be the default, even for the CSV module. But from an implementation and configuration perspective there is no way to really do that at the moment.
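One thing that might be worth trying, though I have not verified whether it actually restores declaration order for Records on current versions, is turning off the alphabetic default per mapper:
    // Untested idea: disable CsvMapper's alphabetic default and rely on the
    // Record's constructor/declaration order; behaviour may vary by version.
    CsvMapper csvMapper = CsvMapper.builder()
            .disable(MapperFeature.SORT_PROPERTIES_ALPHABETICALLY)
            .build();
    CsvSchema schema = csvMapper.schemaFor(TestRecord.class).withHeader();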
Quite a few changes have been made in jackson-databind 2.15.0-SNAPSHOT relating to how Records are handled. If anyone wants to try the latest snapshot jars to see if they have changed CsvMapper behaviour, that would be great.
A possible workaround is to register a custom annotation introspector following the record component order:
private static class OrderedRecordIntrospector extends JacksonAnnotationIntrospector {
    @Override
    public String[] findSerializationPropertyOrder(final AnnotatedClass ac) {
        return ac.getRawType().isRecord() && !ac.hasAnnotation(JsonPropertyOrder.class)
                ? Arrays.stream(ac.getRawType().getRecordComponents())
                        .map(RecordComponent::getName)
                        .toArray(String[]::new)
                : super.findSerializationPropertyOrder(ac);
    }
}
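Registration would then look something like this (a sketch; I am assuming findSerializationPropertyOrder() is also consulted when schemaFor() builds the column list):
    CsvMapper csvMapper = CsvMapper.builder()
            .annotationIntrospector(new OrderedRecordIntrospector())
            .build();
    // Columns should now follow record component order instead of alphabetic order
    CsvSchema schema = csvMapper.schemaFor(TestRecord.class).withHeader();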
Another possible workaround is to use @JsonProperty with index ;), e.g.:
public record TestRecord(
        @JsonProperty(index = 0) String c,
        @JsonProperty(index = 1) String b,
        @JsonProperty(index = 2) String a) {}
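With those indexes, schemaFor() should emit the columns in c,b,a order, so writing would look like this (a sketch, not verified on every version):
    CsvMapper csvMapper = new CsvMapper();
    CsvSchema schema = csvMapper.schemaFor(TestRecord.class).withHeader();
    // Header and values should come out as c,b,a / C,B,A
    String csv = csvMapper.writer(schema)
            .writeValueAsString(new TestRecord("C", "B", "A"));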
I'm trying to use CsvMapper to parse a Java 16 Record out of a CSV file. While testing it I've encountered some strange behaviour: it seems CsvMapper.schemaFor(AnyRecord.class).withHeader() always returns a schema where the columns are assumed to be sorted alphabetically by name, and the header row is not considered.
Here's an example:
Running the JUnit 5 test:
Will fail with:
So, the parsed instance of TestRecord was initialised in the wrong order.
Doing the same thing with a class:
Running the test:
passes fine!
Having a record where the properties are ordered alphabetically by name:
A similar test case to the first works fine (note the CSV data is still ordered c,b,a):
Also, defining a schema directly will work:
While an annotated record won't, again:
Using @JsonPropertyOrder works well!
I guess the actual error lies somewhere in the CSV deserializer, since alphabetical property order seems to be the default in Jackson. I tried to debug into the whole process but couldn't pinpoint it.
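For reference, the failing case described above boils down to something like this (my own sketch based on the snippets elsewhere in this thread, not the original test code):
    @Test
    void testRecordWithHeader() throws JsonProcessingException {
        CsvMapper csvMapper = new CsvMapper();
        // Schema built from the Record AND a header row expected
        CsvSchema schema = csvMapper.schemaFor(TestRecord.class).withHeader();
        var csv = """
                c,b,a
                C,B,A""";
        TestRecord tr = csvMapper.readerFor(TestRecord.class).with(schema).readValue(csv);
        // Fails: values are bound alphabetically instead of by header position
        Assertions.assertEquals(new TestRecord("C", "B", "A"), tr);
    }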