conveyal / r5

Developed to power Conveyal's web-based interface for scenario planning and land-use/transport accessibility analysis, R5 is our routing engine for multimodal (transit/bike/walk/car) networks with a particular focus on public transit
https://conveyal.com/learn
MIT License
272 stars 71 forks source link

Some fields of CSV writers are not initialized while headers are written #915

Open abyrd opened 7 months ago

abyrd commented 7 months ago

This problem was encountered at https://github.com/conveyal/r5/pull/912#issuecomment-1822136240.

In Java, strangely it's possible to read the value of an uninitialized final variable in a constructor and this is not considered an error. This would often fail fast for null references but not for primitives - their "final" values can actually change during execution. Java also disallows setting such final fields before calling a superclass constructor (as superclass constructor calls must be the first line in a constructor).

We didn't hit this case before, but were trying to include a user-specified value (the dual accessibility threshold) in a column header when the CSV writer was initialized.

The problem described above is the subject of a change introduced in Java 22: https://openjdk.org/jeps/447 "statements before super()". Once that change lands we should initialize all the fields in the subclass before calling the constructor.

We also noticed that CsvResultWriter calls setDataColumns before setting the task field. Setting the task field first would allow its contents to be available for creating headers, and is just generally safer since the compiler doesn’t check for initialized values. But at worst this one would result in an NPE because it's not a primitive typed field. We didn't want to change it right before a release in case this had other implications, but it might be a good tweak as part of another changeset on initialization order.

On the other hand, maybe we just shouldn't be performing file I/O in a constructor. That doesn't seem like good form. If we removed that step, the initialization order wouldn't matter as much. But still, once Java allows it, we should avoid situations where final fields can be read before initialization.