Open wvanbergen opened 5 years ago
It feels like what would help would be being able to put extra restrictions on a
in struct Resultset(records: List[a])
. Like if you could say that a
is always of a tuple of Ints and Strings. Then you could have addField(resultset: Resultset[a], f: a -> b): Resultset[Tuple[b, *a]]
I think type classes would help?
Thanks for the issue. The issues of dealing with generic programming on statically typed records are real. This is a known pain point in Haskell and Scala.
To make the first approach a bit more palatable, I would note you could introduce some helpers:
enum Value:
IntValue(i: Int)
StringValue(s: String)
struct Resultset(records: List[List[Option[Value]]])
def str(s): Some(StringValue(s))
def int(i): Some(IntValue(i))
resultset = Resultset([
[str("foo"), int(123), None],
[str("bar"), int(456), int(789)]
])
You still have lost any guarantees about the type, and you are basically back to dynamic types with any function working on this.
@snoble you can add constraints like you want: proof that the type is either Int or String
struct Proof(convert: forall f. f[a] -> f[b])
#here's a nice puzzle:
def flip(prf: Proof[a, b]) -> Proof[b, a]: ???
# Value has one free type parameter: a
enum Value:
IntV(v: Int, prf: Proof[Int, a])
StringV(s: String, prf: Proof[String, a])
# here you know a == String, so we can make the proof, it is just identity
def str(s: String) -> Value[String]:
StringV(s, Proof(\s -> s))
# here we see a == Int, so we can make the proof, it is just identity.
def int(i: Int) -> Value[Int]:
IntV(i, Proof(\s -> s))
so, now you can work with Value[a]
and convert to and from Int or String depending on the branch, and then get back to a
.
I don't know exactly what you want to do, but that might be helpful....
Now.... onto what could be done to make it easier to program generically...
In Purescript has a notion of row-polymorphism. So, you can write functions that work on any struct that has at least some fields with some types:
def foo(rec: { age: Int, name: String | r }) -> String: ...
so, you can call foo for all types that have an age
and name
fields.
You can do this by hacking your own requirement into a struct:
struct Requirements(getAge: r -> Int, getName: r -> String)
def foo(hasReqs: Requirements[r], rec: r) -> String:
Then you specialize as needed:
struct FooRec(age: Int, name: String)
# using the syntax I will add in #188
fooRecIsOk = Requirements(\Foo(a, _) -> a, \Foo(_, n) -> n)
# now we have a function r -> String
fooFn = foo(fooRecIsOkay)
We could do the same with bar:
struct Bar(ageWithAnotherName: Int, serial: Int, nm: String)
barIsOk = Requirements(\Bar(a, _, _) -> a, \Bar(_, _, n) -> n)
fooBar = foo(barIsOkay)
That's a bit manual to encode, but the type system should be strong enough to keep everything going.
So, to put it together, you would have something like:
struct ResultSet(items: List[a], itemIsOkay: Requirements[a])
where you have a list of a, and Requirements is proof that each a has the properties you want.
This is "dictionary passing" encoding of typeclasses @snoble
Does any of this help or is it at all interesting at least? ;) I really appreciate you taking the time to share these concerns and I do hope we can find some nice (or at least nicer) solutions together.
@wvanbergen BTW: you bring up a good point about the default JSON encoding being a bit of a pain if you really wanted to encode: enum Value: StrV(s: String), IntV(i: Int)
to just be a json value without being in a map.
Sometimes you want a struct (or enum) to be like an object, but sometimes you want basically a type-alias that is completely eliminated at runtime. Indeed in rust, 1 element structs are erased at compile time. We could do something here but give a magic value to undo it:
# serialized at a JSON number
struct Foo(i: Int)
# serialized as a record
struct Foo(intValue: Int, phantom: PhantomJson)
def foo(i: Int) -> Foo:
Foo(i, PhantomJson)
where struct PhantomJson
is just a Unit
-like type in Predef that is ignored in JSON object serialization (the key is removed), so in this case it would force Foo to be an object, but that object would have only one key.
It's not amazing...
The only out I see is maybe we could add Json support for:
enum Json:
JNull, JTrue, JFalse, JInt(n: Int), JStr(s: String), JList(items: List[Json]), JDict(d: Dict[String, Json])
(myFoo: Foo) = ...
def fooFn(f: Foo) -> Json:
# now when we see a Tuple2 of (a, a -> Json) it is serialized to Json by applying the fn.
myJson = (myFoo, fooFn)
I'm trying to sort out the example involving struct Proof(convert: forall f. f[a] -> f[b])
. I think I'm struggling a bit because I don't really understand the forall
syntax. I'm thinking that this signature says that convert
must be able to take any type that has one type variable a
and convert it to the same type but of b
. So if convert
were passed List[a]
it must produce List[b]
, and if it were passed Option[a]
it must produce Option[b]
. But that doesn't seem quite right (I can't gather how this would be possible anytime a \= b
)
This is interesting. And I think it's getting me closer to see how to do this. But let me write out what I think would be our dream syntax and then see how close we can get to that.
I think we want to be able to create a rs: Resultset[(String, String, Int), (String, String, String)]
. Somehow you'd want to enforce that the parameters for Resultset had the exact same length and the second parameters was a struct of just String
s (and maybe the first was just Strings and Ints)
remember, I'm dreaming here
so you'd make rs = Resultset([("bill", "wither"s, 39), ("agnes", "smith", 28)], ("First Name", "Last Name", "Age"))
Then you could
rs2 = rs.append(("moxy", "fruvus", 77))
and
rs3: Resultset[(String, String, Int, Int), (String, String, String, String)] = rs2.addField(\(first, last, age) -> (2019.minus(age), "Year of birth"))
(still dreaming here)
and then possibly something like
rs4: Resultset[(String, Int), (String, String)] = rs3(\(firstNameField, lastNameField, ageField, yobField) -> (firstNameField, yobField))
And then, we'd probably want to think about ways to access the fields by name as well so it's not so important to keep track of tuple orderings.
I was thinking something like Resultset[(String, String, Int, Int), (Field, Field, Field, Field)]
Where struct Field[a,b](name: String, accessor: a -> b)
where a
would be (String, String, Int, Int)
for all the Fields above and b
would line up with the types in the first tuple.
Does that make any sense?
this may be less mysterious as I write this out. It's really that tuple of Fields that I need to keep track of.
So we've started to get something sort of cool.
package Example/Resuitset
struct Resultset(data, fields)
struct Field(name: String, accessor: a -> b)
struct Row(firstName: String, lastName: String, age: Int)
def firstName(row: Row) -> String:
Row(result, _, _) = row
result
def lastName(row: Row) -> String:
Row(_, result, _) = row
result
def age(row: Row) -> Int:
Row(_, _, result) = row
result
fields = (Field("first", firstName),
Some((Field("last", lastName),
Some((Field("age", age),
None)))))
rs = Resultset([Row("a", "b", 1), Row("c", "d", 2)], fields)
def applyFields(fields, row):
recur fields:
(field, Some(s)):
Field(_, fn) = field
(fn(row), Some(applyFields(s, row)))
(field, None):
Field(_, fn) = field
(fn(row), None)
def materialize(rs):
Resultset(rows, fields) = rs
rows.map_List(\row -> applyFields(fields, row))
def addField(resultset, name: String, fn: a -> b):
Resultset(old_data, old_fields) = resultset
new_fields = (Field(name, fn), Some(old_fields))
Resultset(old_data, new_fields)
def filterRecords(records, fn: a -> Bool):
records.flat_map_List(\row ->
keep = fn(row)
match keep:
True: [row]
False: [])
def filter(rs, fn: a -> Bool):
Resultset(old_records, same_fields) = rs
new_records = filterRecords(old_records, fn)
Resultset(new_records, same_fields)
def new_age(row):
row.age.add(10)
new_resultset = rs.addField("newAge", new_age)
filtered_resultset = new_resultset.filter(\row -> eq_Int(row.new_age, 11))
out = materialize(filtered_resultset)
The only problem is that a we'd really like these accessor def
s to enter some namespace without all the def
s. We originally thought we wanted some syntax for struct accessors but now we're not so sure.
PS as an aside, this evaluates but there's it refuses to encode to json, which is surprising
if you're curious of the error
> core/run write-json --input ../bosatsu_examples/resultset.bosatsu --main Example/Resultset --output ../bosastsu_examples/resultset.out
[info] Running org.bykn.bosatsu.Main write-json --input ../bosatsu_examples/resultset.bosatsu --main Example/Resultset --output ../bosastsu_examples/resultset.out
cannot convert type to Json: TyApply(TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(List)))),TyApply(TyApply(TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(Tuple2)))),TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(Int))))),TyApply(TyApply(TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(Tuple2)))),TyApply(TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(Option)))),TyMeta(Me
ta(44,AllocRef(java.lang.Object@6da5db5b,None))))),TyConst(Defined(PackageName(NonEmptyList(Bosatsu, Predef)),TypeName(Constructor(Unit)))))))
I'll try to address both these issues as my top Bosatsu priority.
Does that mean the materialize
function will become illegal? That would be too bad
no, it shouldn't.... I think it is just an issue in generalization... meta vars that aren't fixed by inference are usually free: should be forall a. ...
Question: why not use for comprehensions in your code? You should be able to use that instead of map_List. Also there is a flattening version:
[ *stuff(x) for x in y if g(x) ]
to flatten stuff if g(x) if true...
Don't have to use it, just want to make sure you didn't see any issues with it.
I think it's just a matter of us not being used to them
The return type of this code is:
Bosatsu/Predef::List[(Bosatsu/Predef::Int, Bosatsu/Predef::Option[?a])]
So. That much easier to read. The ?a
means it is some variable that was not solved for. That's a bug. We should either solve or fail to type check.
I'm suspecting this function currently:
def applyFields(fields, row):
recur fields:
(field, Some(s)):
Field(_, fn) = field
(fn(row), Some(applyFields(s, row)))
(field, None):
Field(_, fn) = field
(fn(row), None)
What's the type?
fields: (Field[a, b], Option[?a])
but then you call applyFields
with ?a
so, (Field[a, b], Option[?a]) = ?a
This is not a finite type... A fixed point type can encode, it, since, as you see, the value is finite.
I'll dig more.
oh yes... I can get it to runtime error now. I was expecting a little much wanting this to type check
here's how it blows up
mat = materialize(filtered_resultset)
def extract(row):
(_, s) = row
match s:
Some(s):
(x, _) = s
Some(x.add(5))
None:
None
out = mat.map_List(\row -> extract(row))
> core/run eval --input ../bosatsu_examples/resultset.bosatsu --main Example/Resultset
[info] Running org.bykn.bosatsu.Main eval --input ../bosatsu_examples/resultset.bosatsu --main Example/Resultset
[error] (run-main-b) java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.bykn.bosatsu.FfiCall.invoke$1(Externals.scala:44)
at org.bykn.bosatsu.FfiCall.$anonfun$call$5(Externals.scala:41)
at cats.Eval.$anonfun$map$1(Eval.scala:59)
at cats.Eval$.loop$1(Eval.scala:351)
at cats.Eval$.cats$Eval$$evaluate(Eval.scala:372)
at cats.Eval$FlatMap.value(Eval.scala:308)
at org.bykn.bosatsu.MainCommand$Evaluate.$anonfun$run$1(Main.scala:109)
at org.bykn.bosatsu.MainResult$Success.flatMap(Main.scala:60)
at org.bykn.bosatsu.MainCommand$Evaluate.run(Main.scala:104)
at org.bykn.bosatsu.Main$.main(Main.scala:283)
at org.bykn.bosatsu.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: java.lang.RuntimeException: expected integer: ExternalValue(a)
at scala.sys.package$.error(package.scala:26)
at org.bykn.bosatsu.PredefImpl$.i(Predef.scala:131)
at org.bykn.bosatsu.PredefImpl$.add(Predef.scala:135)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.bykn.bosatsu.FfiCall.invoke$1(Externals.scala:44)
at org.bykn.bosatsu.FfiCall.$anonfun$call$5(Externals.scala:41)
at cats.Eval.$anonfun$map$1(Eval.scala:59)
at cats.Eval$.loop$1(Eval.scala:351)
at cats.Eval$.cats$Eval$$evaluate(Eval.scala:372)
at cats.Eval$FlatMap.value(Eval.scala:308)
at org.bykn.bosatsu.MainCommand$Evaluate.$anonfun$run$1(Main.scala:109)
at org.bykn.bosatsu.MainResult$Success.flatMap(Main.scala:60)
at org.bykn.bosatsu.MainCommand$Evaluate.run(Main.scala:104)
at org.bykn.bosatsu.Main$.main(Main.scala:283)
at org.bykn.bosatsu.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
I found another approach, but I hit a bit of a hitch that might kill it. The idea was to create a Structure
struct that would allow you to deconstruct and reconstruct a tuple as useful. To start it would have a cons
, car
, and cdr
so that cons could cons((1, "two", 3), cons(("a", "b", "c"), ((), (), ()))
would come out as
((1, "a"), ("two", "b"), (3, "c"))
. But I'm running into trouble where cons
doesn't end up having a quantified type so doesn't like being called like this. I think this might be solvable still though. Here's my code so far
package Example/Resultset2
struct Field(name: String)
struct Structure(cons, car, cdr)
struct Resultset(data: List[a], field, emptyRow, structure)
unitStructure = Structure(\x -> \y -> (), \x -> (), \x -> ())
valueCons = \x -> \y -> Tuple2(x,y)
def valueCar(xy):
Tuple2(x,y) = xy
x
def valueCdr(xy):
Tuple2(x,y) = xy
y
valueStructure = Structure(valueCons, valueCar, valueCdr)
def consStruct(strucL, strucR):
Structure(consL, carL, cdrL) = strucL
Structure(consR, carR, cdrR) = strucR
def cons(inst1, inst2):
Tuple2(left1, right1) = inst1
Tuple2(left2, right2) = inst2
Tuple2(consL(left1, left2), consR(right1, right2))
def car(inst):
Tuple2(left, right) = inst
Tuple2(carL(left), carR(right))
def cdr(inst):
Tuple2(left, right) = inst
Tuple2(cdrL(left), cdrR(right))
Structure(cons, car, cdr)
fields = (Field("first name"), Field("last name"), Field("age"))
data = [("bill", "withers", 50), ("Aretha", "Franklin", 70)]
emptyRow = (((), (), ()))
structure = valueStructure.consStruct(valueStructure.consStruct(valueStructure.consStruct(unitStructure)))
def dbQuery(query):
Resultset(data, fields, emptyRow, structure)
def addField(rs, field: Field, fn):
Resultset(data, fields, emptyRow, structure) = rs
newStructure = valueStructure.consStruct(structure)
newFields = Tuple2(field, fields)
Structure(cons, _, _) = structure
newData = data.map_List(\row -> (fn(fields.cons(row.cons(emptyRow))), row))
newEmptyRow = Tuple2((), emptyRow)
Resultset(newData, newFields, newEmptyRow, newStructure)
rs = dbQuery("...")
def mapRow(fieldsAndRow):
((_, _), (_, _), (_, age)) = fieldsAndRow
age.add(5)
rs2 = addField(rs, Field("age plus 5"), mapRow)
Resultset(outData, outFields, _, _) = rs2
out = (outData, outFields)
I think if I make my own MyTuple2
that has its own cons
, car
, cdr
functions that might get me somewhere
I think unsurprisingly I'm barking up the same tree as before
(I've just gotten in the habit of adding my notes and progress on this here in case one of y'all see and have insight. If this is a bad use of this space I'll find somewhere else)
I thought I could make it work if I started making univariate structures instead. But I'm getting a weird type error instead
package Example/Resultset2
struct Structure(cons: forall a. forall b. t[a] -> t[b] -> t[Tuple2[a,b]], car: forall a. forall b. t[Tuple2[a,b]] -> t[a], cdr: forall a. forall b. t[Tuple2[a,b]] -> t[b])
struct Id(x)
def valueCons(idX, idY):
Id(x) = idX
Id(y) = idY
Id((x,y))
def valueCar(xy):
Id(Tuple2(x,y)) = xy
Id(x)
def valueCdr(xy):
Id(Tuple2(x,y)) = xy
Id(y)
valueStructure = Structure(valueCons, valueCar, valueCdr)
results in
[info] Running org.bykn.bosatsu.Main eval --input ../bosatsu_examples/resultset2.bosatsu --main Example/Resultset2
in file: ../bosatsu_examples/resultset2.bosatsu, package Example/Resultset2, type (?a,) does not unify with type $b24
18| Id(y)
19|
20|valueStructure = Structure(valueCons, valueCar, valueCdr)
^^^^^^^^^
Confusingly Tuple2(a, b)
isn’t (a, b)
logical tuples are made list cons cells from Tuple2 and Unit. So (a, b) = Tuple2(a, Tuple2(b, Unit))
I think this will compile if you make that change. I think you should avoid Tuple2 if possible and use literal logical tuples which are supported in types and values.
Oh shoot, I can't believe I missed that. I switched away from tuples because I wanted everything to be a string so I got rid of the Units with a plan to put them back later. Thank you!
Also forall a. forall b.
is the same as forall a, b.
if you want to shorten it a bit.
Ah ha, I couldn't get quite the right combinations of commas and periods.
Something I couldn't figure is how to make a Structure[Unit]
. I would want cons = \x -> \y -> ()
. But it complains because Unit didn't have a type parameter. If I had type aliases I'd make type ParamUnit[a] = Unit
yeah, we need the ability to make phantom types struct P[a](x: Int)
for instance... #83 has that.
I also think type alias may be needed... I had kind of hoped single item structs would be ergonomic enough... maybe, maybe not, but if so, we certainly need the above syntax to be legal.
nice, that solved that type error. Working on my next one. I expect this code to be hairy though
in case anyone's interested
package Example/Resultset2
struct Structure(cons: forall a, b. t[a] -> t[b] -> t[Tuple2[a,b]], car: forall a. forall b. t[Tuple2[a,b]] -> t[a], cdr: forall a. forall b. t[Tuple2[a,b]] -> t[b])
struct Id(x)
def valueCons(idX: Id[a], idY: Id[b]):
Id(x) = idX
Id(y) = idY
Id(Tuple2(x,y))
def valueCar(xy):
Id(Tuple2(x,y)) = xy
Id(x)
def valueCdr(xy):
Id(Tuple2(x,y)) = xy
Id(y)
valueStructure = Structure(valueCons, valueCar, valueCdr)
def consStruct(strucL: Structure[forall a. t1[a]], strucR: Structure[forall a. t2[a]]) -> Structure[forall a. Tuple2[t1[a], t2[a]]] :
Structure(consL, carL, cdrL) = strucL
Structure(consR, carR, cdrR) = strucR
def cons(inst1, inst2):
Tuple2(left1, right1) = inst1
Tuple2(left2, right2) = inst2
Tuple2(consL(left1, left2), consR(right1, right2))
def car(inst):
Tuple2(left, right) = inst
Tuple2(carL(left), carR(right))
def cdr(inst):
Tuple2(left, right) = inst
Tuple2(cdrL(left), cdrR(right))
Structure(cons, car, cdr)
in file: ../bosatsu_examples/resultset2.bosatsu, package Example/Resultset2, type $a74 does not unify with type (forall a. $t237[a])[?a]
33| Tuple2(left, right) = inst
34| Tuple2(cdrL(left), cdrR(right))
35| Structure(cons, car, cdr)
^^^^
I’ll see if I can get this to work tomorrow with more type annotations.
Maybe there is an improvement we can make, or maybe we just need to add more annotations around such generic types.
In your example note that Structure
has parameter t
which is a type constructor t[_]
in scala language or t: * -> *
in Haskell language. The Structure[forall a. t[a]]
is "ill-kinded". But that doesn't solve your problem to have Structure[t1]
and Structure[t2]
because what is the resultant type?
struct HKTuple[t1, t2, a](first: t1[a], second: t2[a])
is something like you want, however, the inability to set the order of the type parameters and talk about Structure[HKTuple[t1, t2]]
, (HK for Higher kinded) which is what I think you want to talk about, is the problem.
This is getting into pretty advanced space. I don't think your example would work without a lot of elbow grease in scala or haskell, so we are setting the bar pretty high here...
That said, I think it should be solvable with a few steps:
HKTuple[t1, t2]
above which is a typeconstructor that needs one more type. I think this is supported currently just by how things were implemented, but some tests can't hurt.struct HKTuple(fst: t1[a], snd: t2[a])
the order of the type parameters are the order given: t1, a, t2
which won't work since the currying in unification wants a
on the outside.Structure[forall a. t[a]]
is already ill-typed, and nothing is going to unify there, but we don't report it that way. That should be Structure[t]
, but that doesn't solve the issue because we can't name the output type currently. That said, if we allow currying, maybe we can't see that forall a. t[a]
isn't okay, because maybe t
has two type parameters, and we are saying one of them could be anything?This example is pretty complex. I'd like to put it on hold until I solve #83 and then see if explicitly giving the type works, which I think it may.
That's fascinating. Yeah, I ended up with the forall there because I couldn't see how else to express the resulting type. I definitely would drive myself mad trying to do this in Scala.
I can definitely not think about this for a while and maybe another approach will appear. In the mean time I'll write out the full API I was thinking of so that I can remember it in the future
I am trying to represent a recordset / query result using the type system.
The goal is to provide a library of functions to help dealing with these resultsets: e.g. remove a field, calculate a new field, join or union two resultsets, etc.
First attempt using lists of lists
The first attempt to describe this is using a List of List of optional values
This works, but there are definitely cons to this approach:
Second attempt attempt using lists of tuples
Rather than using lists for the individual rows, we could use tuples. This will enforce that every row will have the same number of fields, and that every column will have the same type.
Unfortunately, this doesn't really work for our use case, because:
Questions
List[List[Option[Value]]]
?