Inverse map functions - Githubissues

austindd commented 4 years ago

This PR adds two functions based on inverseMap from Haskell's relude library.

The idea is, given an ordered bounded enum t and a function that maps each t value to a unique a value, we can derive the inverse mapping function from that particular set of a values to their corresponding t values. Obviously, we can't account for all a values, so inverseMap must return option(t).

This is particularly nice if you already have a show function, and want to generate the inverse function to parse strings to that data type.

Note:

Functions generated using inverseMap will never be as efficient as their hand-written counterparts. This is because they need to perform dynamic lookups to get a return value. Even worse, the inverse of t => a has O(n) worst-case time complexity if we don't have a comparator function for a (we are forced to use an associated list/array for lookups). Therefore, I made a second function, inverseMapWithComparator, which takes a compare function for a and uses a Map.t(a, t) for the internal lookup table, which should be roughly O(log(n)).

I also included some tests for these functions.

mlms13 commented 4 years ago

I haven't jumped in to do a full review yet, but a couple thoughts:

Awesome! This seems like a very useful thing, and "exists in the Haskell world" is always a great starting point 😁
I'm not sure how I feel about the === check in inverseMap. In most cases, if eq/ord is needed, we just require an eq/ord instance or function, as you've done in the withComparator variation

I assume requiring Ord is a bit of a burden on the user of the function, though it allows us to be more efficient in the lookup. I think that efficiency makes the requirement worth passing on to the caller, but I'm open to counterpoints. I'd tentatively lean toward:

  let inverseMapBy: 'a. ('a, 'a) => ordering, E.t => 'a, 'a) => option(E.t);
  let inverseMap: (type a, ordA: (module ORD with type t = a), E.t => a, a) => option(E.t)

Maybe got the syntax wrong, but the default version would take a module and a version suffixed with ...By would take a simple compare function. If we're worried about imposing an ORD constraint on the caller, we could maybe provide an EQ variation as well, but I'd prefer to use the most efficient as the default and avoid === if possible. Definitely open to other thoughts though.

austindd commented 4 years ago

@mlms13 Thanks for the feedback!

I'm not sure how I feel about the === check in inverseMap. In most cases, if eq/ord is needed, we just require an eq/ord instance or function, as you've done in the withComparator variation.

I totally agree. That was actually just an oversight on my part. If we are extending EQ and ORD for an instance, the user should expect the algorithm to use the precise eq and compare functions that were provided for that instance.

I assume requiring Ord is a bit of a burden on the user of the function, though it allows us to be more efficient in the lookup . . . If we're worried about imposing an ORD constraint on the caller, we could maybe provide an EQ variation as well.

Totally, I think that's a great idea. I got halfway through these implementations and realized that we could maybe parameterize ALL of the typeclass-specific functions in the function signature, and then just include a partially applied version of it for each typeclass extension functor. The partial application won't even be a performance concern, either, since it's mostly used for staging before getting back an efficient unary function.

If we really want fast, specialized versions for string/int/char, we can do that as well. I think the most common use case will be generating string parsers, so I think there should at least be a special string version.

austindd commented 4 years ago

@mlms13 If we want to keep the version of inverseMap that requires no additional functions on a (like equal or compare), then would we just use Pervasives.(==) for equality checking? It's a bit of a shaky solution IMO, but we could always document that it uses polymorphic compare, right?

andywhite37 commented 4 years ago

I would probably say to not use Pervasives.(==) anywhere in relude, and if the caller wants to use it for their Ord/compare function, they are welcome to. We don't use == anywhere else (I hope), and I think it's good to discourage the use of that - or at least make the caller decide.

austindd commented 4 years ago

Cool, I probably wouldn't want polymorphic comparison in my own projects, especially when it's silently introduced via library code... And it's minimal overhead for the user to supply an equality function.

austindd commented 4 years ago

@mlms13 @andywhite37 @johnhaley81 I just updated the PR. I changed the function names and addressed most of the issues discussed here. Also cleaned up the code. Let me know what you think.

andywhite37 commented 4 years ago

@austindd - I don't want to hold up your progress on this since it's working now. If you would like, we can just merge this as-is, and if it makes sense to do any follow-on work, that's cool. It's not that big a deal either way.

austindd commented 4 years ago

@andywhite37, Sorry, I meant to tackle this earlier, but just got extremely busy recently. I actually do want to finish it up in the next few days or so, but if you would rather merge it now, then that’s cool too.

andywhite37 commented 4 years ago

I don't need to merge it now - I just didn't want you to feel discouraged. I'm happy to wait for whatever additional changes you want to make.

austindd commented 4 years ago

@andywhite37 @mlms13 @johnhaley81 I just pushed some changes to the repo. It's not quite ready to merge. Just wanted to get your input on some alternative implementations of inverseMapEqBy and inverseMapOrdBy, which reuse some existing functions from EnumExtensions. These new versions are named inverseMapEqBy2 and inverseMapOrdBy2. The old versions are left in for comparison.

My thoughts:

Reusing functions from EnumExtensions means that we must apply the EnumExtensions functor inside BoundedEnumExtensions so we can access the resulting functions. I'm not sure if this is desirable, because it wasn't included previously. I guess it depends on how modular you want it to be.
This does reduce the Reason code size by a small amount. The JS output is reduced a bit more, but including other extension modules might significantly increase the JS bundle size, depending on the bundler, so take it with a grain of salt.
As @andywhite37 said, this does reduce the cognitive load, at least for inverseMapEqBy, but I'm not convinced it's that much simpler for inverseMapOrdBy. Also, the other functions are already implemented with code reuse in mind.

johnhaley81 commented 4 years ago

Personally, I find the V2 implementations easier to follow and thought that the data structures more properly modeled what was being done. I also like that it was reusing the stuff form EnumExtensions. I don't think that O(n) performance here is really something to worry about since I would imagine that most enum options would be pretty small relatively speaking.

I'd vote for the V2 implementations.

andywhite37 commented 3 years ago

I checked out @austindd 's branch and did a little follow-on work to get these useful functions merged! Thanks @austindd for getting the ball rolling on these. Feel free to PR further changes to them if I didn't capture the intent, or didn't implement as you intended - the tests still seemed to pass, so I think it's at least correct in that sense.

austindd commented 3 years ago

@andywhite37 I appreciate the help in getting it finished. I apologize for letting this sit here. Just have a lot going on in my life right now (mostly good things), so I haven't had much time for open-source. The functions look good to me. I'm happy with it!

andywhite37 commented 3 years ago

No worries! I had kind of forgotten about it, and we got another PR in and I remembered, so I figured I'd try to just get it in there while I was back in the code. Thanks - I like that function, it's not something I was aware of, but that's a nice utility.

reazen / relude

Inverse map functions #278