Closed ysangkok closed 3 years ago
I'm generally supportive of giving csv-conduit an order preserving map mode. Before we proceed, however, I wonder if we should instead implement instances for both containers
and ordered-containers
style maps. We can keep MapRow as is and define a new OrderedMapRow
type synonym.
I considered keeping the behavior as is and just having this be an additive thing. But the only reason I think would be if somebody was relying on the ascending ordering and I just have a hard time seeing a use case for it. We don't save on any dependencies since we'd have to depend on ordered-containers anyway. Also if someone does want ascending order they just need to sort their keys first before creating their NamedRecord.
Yeah, my concern wasn't on dependencies given ordered-containers
seems very light in its deps overall. I actually do have some use cases that rely on the implicit ordering produced via the current MapRow construct - admittedly, that's a little lazy in relying on incidental behavior from Map. Further, existing workflows that produce a CSV and then load it into a database (for example) would get hurt by this as the ordering would change all of a sudden. It feels like given how ordered-containers
already depends on containers
(so no dependency saved by eliminating it), it'd be nice to give the users instances from both and not limit optionality.
Taking a look at OMap, by the way, it seems to have some efficiency difference too. It basically stores 2 maps inside containing the same information - underlying data would be shared, but I can see it being a constant factor less efficient if/when performance matters: https://hackage.haskell.org/package/ordered-containers-0.2.2/docs/src/Data.Map.Ordered.Internal.html#OMap
Fair enough. @ysangkok could you update your MR so that we keep MapRow but add an OrderedMapRow and any corresponding instances so that the user can opt in?
@MichaelXavier All right, I have updated it and tested.
Thanks! This is released in csv-conduit-0.7.3.0
Fixes #40