Closed toddfarmer closed 5 years ago
Note: Comment by Emilio Lahr-Vivaz (elahrvivaz): Note: I don't think it's directly supported by the Writer API, so you have to do it slightly manually. Here is our code for writing dictionary values encoded children in a map vector (FYI there is some non-relevant bits that involve our integration with geotools, and it's in scala):
https://github.com/locationtech/geomesa/blob/master/geomesa-arrow/geomesa-arrow-gt/src/main/scala/org/locationtech/geomesa/arrow/vector/ArrowAttributeWriter.scala#L189-L191 https://github.com/locationtech/geomesa/blob/master/geomesa-arrow/geomesa-arrow-gt/src/main/scala/org/locationtech/geomesa/arrow/vector/ArrowAttributeWriter.scala#L224-L225
https://github.com/locationtech/geomesa/blob/master/geomesa-arrow/geomesa-arrow-gt/src/main/scala/org/locationtech/geomesa/arrow/vector/ArrowAttributeReader.scala#L148 https://github.com/locationtech/geomesa/blob/master/geomesa-arrow/geomesa-arrow-gt/src/main/scala/org/locationtech/geomesa/arrow/vector/ArrowAttributeReader.scala#L164-L167
Note: Comment by Emilio Lahr-Vivaz (elahrvivaz): Some more details - the important bit is calling 'vector.addOrGet' with the correct dictionary encoded field. This sets up the metadata correctly. The child vector is of the dictionary encoded type (e.g. Int), and you have to manually encode dictionary values before writing them. On read, you have to examine the schema so that you know to manually decode the Int values appropriately.
Note: Comment by Wes McKinney (wesm): Cool, after we get the integration tests working for flat data I will add integration tests for encoded subfields and we can address any issues that come up
Note: Comment by Wes McKinney (wesm):
Moving this off 0.6.0 as not urgent. It would be good to have unit tests for this, though. cc [~icexelloss]
Note: Comment by Wes McKinney (wesm): [~emkornfield@gmail.com] I might put this on your radar for some point in the future, moving out of 0.14.0 for now
Note: Comment by Ji Liu (tianchen92): If [~emkornfield@gmail.com] don't mind, I would like to take this issue:).
Note: Comment by Micah Kornfield (emkornfield@gmail.com): SGTM, in general, I think if issues are unassigned they are up for anyone to take on.
Note: This issue was originally created as ARROW-1175. Please see the migration documentation for further details.
Original Issue Description:
We do not have any tests about types like:
cc
[~julienledem]
[~elahrvivaz]