CategoricalData / hydra

Transformations transformed
Apache License 2.0

Explore compact labeled records and variants #122

Open joshsh opened 4 months ago

joshsh commented 4 months ago

Hydra currently distinguishes between product types, which apply to unlabeled / integer-indexed tuples, and record types, which apply to records with labeled fields. Similarly, Hydra has sum types and also union types. For many purposes (e.g. inference), the unlabeled product types are simpler and more efficient to work with, and of course the data representation of tuples is more compact than that of records. In principle, all records could be represented as tuples, but manipulated as records with the help of a dictionary which mediates between indexes and field names.

Explore this once the code base has settled a bit (i.e. once #118 is merged).
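A minimal sketch of the dictionary idea, in Python rather than Hydra's own code: the record is stored as a plain tuple, and a per-type dictionary maps field names to positions. All names here (`make_dictionary`, `compact`, `expand`, `get_field`, `PERSON`) are hypothetical and are not part of Hydra's API.

```python
from typing import Any, Dict, Sequence, Tuple


def make_dictionary(field_names: Sequence[str]) -> Dict[str, int]:
    """Build the name -> index dictionary for one (hypothetical) record type."""
    return {name: index for index, name in enumerate(field_names)}


def compact(dictionary: Dict[str, int], record: Dict[str, Any]) -> Tuple[Any, ...]:
    """Store a labeled record as an unlabeled tuple, ordered by field index."""
    if set(record) != set(dictionary):
        raise ValueError("record fields do not match the record type")
    values = [None] * len(dictionary)
    for name, index in dictionary.items():
        values[index] = record[name]
    return tuple(values)


def expand(dictionary: Dict[str, int], values: Tuple[Any, ...]) -> Dict[str, Any]:
    """Recover the labeled view of a tuple using the dictionary."""
    return {name: values[index] for name, index in dictionary.items()}


def get_field(dictionary: Dict[str, int], values: Tuple[Any, ...], name: str) -> Any:
    """Look up a field by name; the tuple itself carries no field names."""
    return values[dictionary[name]]


# Example record type with two fields (an assumption for illustration).
PERSON = make_dictionary(["name", "age"])
ada = compact(PERSON, {"age": 36, "name": "Ada"})
```

Here `ada` holds only the two values, in field order, while `PERSON` is shared by every instance of the type, which is where the space saving would come from.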

wisnesky commented 4 months ago

My intuition is that you'll want to get rid of products/sums and only use records/unions.

joshsh commented 4 months ago

"Get rid of" may be too strong a phrase. We still need tuples in the language for the same kinds of use cases which tuples serve in Haskell. I would say that both record types and product types are necessary, but that in many cases, we can replace record instances with tuple instances.

joshsh commented 4 months ago

Note that for projections, injections, and also for case statements (if a default branch is supported), we still do need to provide integer-valued field indexes, but not necessarily field names. It is the dictionary which lets us map between names and numbers.
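As a rough illustration of that point, the following Python sketch implements projection, injection, and a case statement with a default branch using integer indexes only. The field names appear solely in the dictionary (`SHAPE`), which is consulted when the branches are built. The representation of a variant as an `(index, payload)` pair and all function names are assumptions made for this example, not Hydra's actual design.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# A variant is assumed to be an (index, payload) pair with no field name attached.
Variant = Tuple[int, Any]


def project_index(index: int, values: Tuple[Any, ...]) -> Any:
    """Projection: select a tuple component by integer index."""
    return values[index]


def inject(index: int, payload: Any) -> Variant:
    """Injection: tag a payload with the integer index of its alternative."""
    return (index, payload)


def case(
    variant: Variant,
    branches: Dict[int, Callable[[Any], Any]],
    default: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Case statement: dispatch on the index, falling back to the default branch."""
    index, payload = variant
    handler = branches.get(index)
    if handler is not None:
        return handler(payload)
    if default is None:
        raise ValueError(f"no branch for index {index} and no default")
    return default(payload)


# The dictionary maps names to numbers; only the numbers reach the runtime values.
SHAPE = {"circle": 0, "square": 1, "point": 2}


def area(shape: Variant) -> float:
    return case(
        shape,
        {
            SHAPE["circle"]: lambda r: 3.0 * r * r,  # crude approximation of pi
            SHAPE["square"]: lambda s: float(s * s),
        },
        default=lambda _payload: 0.0,  # covers "point" and any other alternative
    )
```

Because a default branch is allowed, the case statement need not list every alternative, so it cannot rely on branch order alone and has to carry explicit indexes.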