tweag / HaskellR

The full power of R in Haskell.
https://tweag.github.io/HaskellR

R List type being read as Vector using the hexp ViewPattern #214

Closed asrvsn closed 9 years ago

asrvsn commented 9 years ago

I am currently casting R values to a custom datatype, MyValue, using this code:

castSEXP :: R.SEXP s a -> [MyValue]
castSEXP x = case x of
  (hexp -> H.Nil)       -> [MyNull]
  (hexp -> H.Real v)    -> map MyDouble $ SV.toList v
  (hexp -> H.Int v)     -> map (MyInt . fromIntegral) $ SV.toList v
  (hexp -> H.Logical v) -> map (MyBool . fromLogical) $ SV.toList v
  (hexp -> H.Char v)    -> [MyString . bytesToString $ SV.toList v]
  (hexp -> H.String v)  -> [castString v]
  (hexp -> H.Symbol s s' s'')  -> castSEXP s
  (hexp -> H.List a b c) -> [MyList (castSEXP a) (castSEXP b) (castSEXP c)]
  (hexp -> H.Special i) -> [MyInteger $ fromIntegral i]
  (hexp -> H.DotDotDot s) -> castSEXP s
  (hexp -> H.Vector len v) -> map castR $ SV.toList v
  (hexp -> H.Builtin i) -> [MyInteger $ fromIntegral i]
  (hexp -> H.Raw v) -> [MyString . bytesToString $ SV.toList v]
  (hexp -> H.S4 s) -> castSEXP s
  _ -> [MyError "Could not cast R value."]

However, when I create a list in R using list(a=1,b=2), I receive only the values 1, 2 because the case that fires here is H.Vector, not H.List as I would expect. Am I unboxing these values incorrectly, or is this expected behavior? If the latter, how can I obtain the keys a, b from an R list? (also, I'd like to note that the H.S4 case produces a null value here when using R S4 objects.)

qnikst commented 9 years ago

Hello @ooblahman, thanks for the report.

I can reproduce this problem, and I see the following:

  1. hexp chooses its view from the SEXPTYPE in the value's header: values with LISTSXP are converted to List, and values with VECSXP to Vector. That does not match R's surface-level lists, though, because LISTSXP is only used for pairlists. As the documentation says:

Pairlists (LISTSXP, the name going back to the origins of R as a Scheme-like language) are rarely seen at R level, but are for example used for argument lists.

so an R-level list and a generic vector have the same SEXPTYPE (VECSXP). From this point of view, it is expected that H.Vector matches.

However I agree that it's bad that we can't distinguish between lists and vectors the same way as R does.

  2. Currently there is no good API for reading names from a SEXP (R structure). Until we roll out a good solution, you can call names(x) as a workaround to get the names of the structure. The result will be either NULL, if there are no names, or a String.

We need to think about the best way forward here.

mboes commented 9 years ago

@qnikst I disagree that there is any "problem" here in inline-r: we're just passing on R's semantics, however unintuitive they may be. In R, the type called "list" in the surface syntax is called VECSXP internally, i.e. the generic vector type. The LISTSXP internal type name corresponds to "pairlist" in the surface syntax. Confusing I know, but naming things differently than R does would be even more confusing.
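To make the naming mismatch concrete, here is a minimal sketch of how R's surface type names (as reported by typeof()) line up with the internal SEXPTYPE names. The function sexpTypeOf and its table are hypothetical helpers written for illustration; they are not part of inline-r.

```haskell
module Main where

-- Hypothetical helper (not an inline-r API): look up the internal SEXPTYPE
-- name for a surface type name as reported by R's typeof().
sexpTypeOf :: String -> Maybe String
sexpTypeOf name = lookup name table
  where
    table =
      [ ("NULL",      "NILSXP")
      , ("symbol",    "SYMSXP")
      , ("pairlist",  "LISTSXP")  -- what H.List corresponds to
      , ("logical",   "LGLSXP")
      , ("integer",   "INTSXP")
      , ("double",    "REALSXP")
      , ("complex",   "CPLXSXP")
      , ("character", "STRSXP")
      , ("list",      "VECSXP")   -- what H.Vector corresponds to
      , ("raw",       "RAWSXP")
      ]

main :: IO ()
main = do
  print (sexpTypeOf "list")      -- Just "VECSXP"
  print (sexpTypeOf "pairlist")  -- Just "LISTSXP"
```

Under this mapping, list(a=1,b=2) is a VECSXP, which is why the H.Vector case fires in the original castSEXP.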

qnikst commented 9 years ago

Yes, I totally agree. I just rechecked the documentation, which says:

Many of these will be familiar from R level: the atomic vector types are LGLSXP, INTSXP, REALSXP, CPLXSP, STRSXP and RAWSXP. Lists are VECSXP and names (also known as symbols) are SYMSXP.

The R sources also confirm this:

https://github.com/wch/r-source/blob/trunk/src/main/util.c#L198-L228

mboes commented 9 years ago

@ooblahman as @qnikst says, R has a function names() for getting the list of keys. We could wrap it in a Haskell function in inline-r, but calling it in a quasiquotation such as [r| names(x_hs) |] should work just as well.
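Once the names have been fetched, they still need to be paired with the list elements. The following is a rough, pure-Haskell sketch of that step, assuming the names have already been converted to a Maybe [String] (Nothing standing in for R's NULL) and the elements to an ordinary Haskell list. The function withNames is a hypothetical helper, not an inline-r API, and the conversion from SEXP values is deliberately left out.

```haskell
module Main where

-- Hypothetical helper (not an inline-r API): pair each element of an R list
-- with its name. 'Nothing' as the first argument models names(x) returning
-- NULL; an empty string models an element without a name, which is how R
-- reports unnamed elements in a partially named list.
withNames :: Maybe [String] -> [a] -> [(Maybe String, a)]
withNames Nothing   xs = [ (Nothing, x) | x <- xs ]
withNames (Just ns) xs = zip (map toKey ns) xs  -- zip stops at the shorter list
  where
    toKey "" = Nothing
    toKey n  = Just n

main :: IO ()
main = do
  -- Models list(a=1, b=2)
  print (withNames (Just ["a", "b"]) [1.0, 2.0 :: Double])
  -- Models list(1, 2), where names(x) is NULL
  print (withNames Nothing [1.0, 2.0 :: Double])
```

For list(a=1,b=2) this gives the keys a and b alongside the values 1 and 2, which is the information the original castSEXP was missing.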

asrvsn commented 9 years ago

thanks! good workaround.