vapourlang / vapour

Typed superset of R
http://vapour.run
Apache License 2.0
194 stars 3 forks source link

Subsetting vectors #67

Closed jonocarroll closed 1 month ago

jonocarroll commented 2 months ago

Perhaps a very recent change, but this code (seen in #62) no longer works (vapour@0.0.5)

let x: int = (1, 2, 3)
x = x[4]

with the error (in the LSP and transpiler)

cannot use `[` on type `int`

Similiarly for a character vector. No harm, no foul - new language, so breakage is entirely understandable.

JohnCoene commented 2 months ago

It seems I broke this long ago and had no test to catch it.

I can see the error, it's right here

https://github.com/vapourlang/vapour/blob/5b69a2805629476032c1e5c71fa445700bd6807a/walker/types.go#L116

The function checks that "accessor" infixes, at least that's what I called them ([, $, [[) are executed on the right type but it seems it is not justified?

Testing some things in R it seems that we can call [, $, [[ on pretty much anything but surely past me must have put this check in for a reason; I just can't quite find which it is...

If you have any ideas/pointer please do let me know. Perhaps that should be removed entirely?

Thanks for reporting.

EDIT: this also links to #62 about which I'm a bit annoyed because I'm not sure how it should be handled.

jonocarroll commented 2 months ago

I wonder if the complication is the lack of scalars in R? In any other language I'd expect to need to differentiate between int and vec<int>, for which only the latter can be subset. In that case, things become a bit cleaner because the accessor will only produce a scalar int, over which one would need to map. You can index into a length-1 vector, but it will either be the value itself or one of the NAs.

In terms of the NA issue, both (1:3)[4] and (1:3)[NA] produce a vector of NA_integer_ so I wonder if instead of a newtype int | na it's reasonable to expect just int, of which NA_integer_ is a value.

JohnCoene commented 2 months ago

It could be because there is not scalar indeed.

I think NA should not be a type, it's just a value as you indicate. I would have liked to handle it nicely because it can cause issues downstream (e.g.: if(NA){}) but it may not be possible.

Though for some reason accessing out of range on a list (and thus data.frame results in an error instead of NA.