Closed: schanzer closed this issue 2 months ago
Started looking at this one, some notes so I don't forget when I come back to this:
The resolution of this will need to be over in pyret-lang, which defines the statistics package.
The interesting bit here is that the definitions are currently typed for numbers and the implementations depend on that for efficiency, so we'll need to change group-and-count, which currently depends on the runtime helper raw_array_sort_nums, to an alternative that will work for other types. (We probably still want to use the fast version for numbers, though?)
Related discussion over in pyret-lang, mostly around actually enforcing the type constraint and showing a nicer error, rather than extending these to other types: https://github.com/brownplt/pyret-lang/issues/1538
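For reference, here is a rough sketch of the shape the group-and-count change described above could take: keep a sort-based fast path when every value is a number, and fall back to plain equality for everything else. It is written in Python purely as an illustration, not Pyret, and the function name and the fallback strategy are assumptions of this sketch rather than anything taken from pyret-lang.

```python
from itertools import groupby
from numbers import Real


def group_and_count(values):
    """Return a list of (value, count) pairs (hypothetical helper)."""
    values = list(values)

    # Fast path (assumption: mirrors the idea of a numeric-only sort helper).
    # Sorting makes equal numbers adjacent, so each run can be counted in one pass.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return [(v, len(list(run))) for v, run in groupby(sorted(values))]

    # Generic path: needs only equality, so it works for strings, tuples, etc.
    # Quadratic in the worst case; pairs are kept in first-seen order.
    counts = []
    for v in values:
        for i, (seen, n) in enumerate(counts):
            if seen == v:
                counts[i] = (seen, n + 1)
                break
        else:
            counts.append((v, 1))
    return counts
```

The trade-off is the one raised above: the numeric path stays O(n log n), while the generic path gives up speed in exchange for needing nothing beyond an equality check.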
@asolove oh wow - really interesting to see that thread. I didn't realize this came up 2 years ago! In that case, maybe the solution is just a written-in-Pyret function that lives in our Data Science library. Would you be willing to write one?
Yeah, we could definitely do that. Can you point me to the Data Science library?
Here's the link - I'm sure I'm not doing the most elegant stuff, so any advice you have on coding quality is most welcome!
Gonna close this out as the resolution won't be in the CPO codebase. But it's still on my backlog list so I'll write some suggested changes to that file and share with you.
Closing this as dupe of https://github.com/brownplt/pyret-lang/issues/1538, since the discussion there is further along.
In the Statistics package, modes throws an internal error when used with non-numeric data. At the very least, this should be a better error! But more importantly, modes are not restricted to numbers. (This has some implications for our treatment of the topic in Bootstrap:DS -- right now we give a falsely narrow definition simply because Pyret doesn't support the full definition)
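To make the "modes are not restricted to numbers" point concrete, here is a minimal sketch of a mode function that accepts any values. It is in Python for illustration only, not Pyret. Two things are assumptions of this sketch: the values must be hashable, and a list with no repeated value is treated as having no mode.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value, in first-seen order (illustrative sketch)."""
    counts = Counter(values)  # assumes values are hashable
    if not counts:
        return []
    top = max(counts.values())
    # Assumption of this sketch: if nothing repeats, report no mode at all.
    if top == 1:
        return []
    return [value for value, n in counts.items() if n == top]


print(modes(["cat", "dog", "cat", "bird", "dog"]))  # ['cat', 'dog']
```

Nothing here depends on the values being numeric, which is the fuller definition the Bootstrap:DS materials would want to teach.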