Closed simonh1000 closed 3 years ago
@KtorZ if you have some time to look at this
@simonh1000 from reading a bit of the CBOR spec, it seems to me that your node code is taking advantage of the fact that CBOR in theory does not need a schema and can be decoded as a succession of "stuff", where "stuff" is one of a few major types (numbers, strings, bytes, arrays and maps). And so decodeAllSync generates a list of "stuff" with different types, and you only care about some of that stuff, at a given position.
From reading the decoding API in elm-toulouse/cbor, it seems to me there is no type defined to cover all things that may appear in a CBOR-encoded byte sequence. Something like

    type CborData
        = CborInt Int
        | CborFloat Float
        | ...

And since there does not seem to be such a type, it seems to me that a function like decodeAllSync is currently not possible. I think the current API has been designed for scenarios where users know the schema of the data and want to decode it all. Your use case instead seems to be that you only know a "partial schema", or at least that you are only interested in part of the data.
Am I correct in my understanding of your issue?
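
To make the idea concrete, here is a minimal sketch of what such a catch-all type and an index helper could look like. Every name in it (CborData, its constructors, and index) is hypothetical and is not part of elm-toulouse/cbor; it only illustrates the shape of a schema-less representation.

```elm
module CborDataSketch exposing (CborData(..), index)

import Bytes exposing (Bytes)


{-| Hypothetical catch-all type for any decoded CBOR item.
Not provided by elm-toulouse/cbor; shown for illustration only.
-}
type CborData
    = CborInt Int
    | CborFloat Float
    | CborString String
    | CborBytes Bytes
    | CborList (List CborData)
    | CborMap (List ( CborData, CborData ))
    | CborTagged Int CborData
    | CborBool Bool
    | CborNull


{-| Pick the element at a zero-based position out of an array,
looking through any tags wrapped around it.
-}
index : Int -> CborData -> Maybe CborData
index i data =
    if i < 0 then
        Nothing

    else
        case data of
            CborTagged _ inner ->
                index i inner

            CborList items ->
                items |> List.drop i |> List.head

            _ ->
                Nothing
```

With a representation like this, "give me index 2 of the tagged top-level array" would become `index 2 decoded`, at the cost of losing the typed, schema-driven decoding that the current API offers.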
I don't specifically need generic decoding. The top-level data is - I think - a tagged list, of which I want the 2nd element. After thinking some further I thought I could get the following to work, but something is still not right (on re-reading the docs, perhaps I should not have expected maybe to do what I needed?).

    -- CD.tagged (CDTag.Unknown 18) (CD.list <| CD.succeed ())
    CD.tagged (CDTag.Unknown 18) (CD.list (CD.maybe CD.bytes))
        |> CD.andThen
            (\( _, lst ) ->
                CD.succeed lst
             --case lst of
             --    _ :: _ :: gpData :: _ ->
             --        CD.succeed gpData
             --
             --    _ ->
             --        CD.fail
            )

The commented-out code 'works' (it reports ()), which seems to confirm I'm getting some of the shape correct, but the uncommented code fails, and I'm not yet sure why.
I suspect that CDTag.Unknown 18 is also too specific for data in general, but this is what my (Belgian) data uses.
@simonh1000 do you have by any chance a little excerpt of the CBOR-encoded string you're trying to decode?
In your excerpt above, I find the use of andThen suspicious, since serialized structures like that are rarely nested, though they sometimes do embed CBOR-in-CBOR.
I'd rather not share my data, but you can extract something similar from your EU 'pass sanitaire' - it's the data in the QR code. You have to remove the "HC1:" at the beginning to get a base45 string that you need to decode and then inflate.
Here's my full code:

    module Main exposing (..)

    import Base45
    import Cbor.Decode as CD
    import Cbor.Tag as CDTag
    import Inflate


    raw =
        "NCF...TFB-D"


    init _ =
        let
            _ =
                raw
                    |> Base45.decode
                    |> Result.andThen (Inflate.inflateZLib >> Result.fromMaybe "inflate failed")
                    |> Result.andThen (CD.decode decCbor >> Result.fromMaybe "cbor failed")
                    |> Debug.log ""
        in
        ( (), Cmd.none )


    decCbor =
        --CD.dict CD.string <| CD.succeed "something"
        --CD.list CD.bytes
        CD.tagged (CDTag.Unknown 18) (CD.list <| CD.succeed ())
            --CD.tagged (CDTag.Unknown 18) (CD.list (CD.maybe CD.bytes))
            |> CD.map
                (\( _, lst ) ->
                    lst
                 --case lst of
                 --    _ :: _ :: gpData :: _ ->
                 --        CD.succeed gpData
                 --
                 --    _ ->
                 --        CD.fail
                )



    --CD.tag
    --    |> CD.andThen
    --        (\tag ->
    --            let
    --                _ =
    --                    Debug.log "tag" tag
    --            in
    --            CD.fail
    --        )


    update _ m =
        ( m, Cmd.none )


    subscriptions _ =
        Sub.none


    main : Program () () msg
    main =
        Platform.worker
            { init = init, update = update, subscriptions = subscriptions }

Arf. I got my hands on some data and I see the issue now. The data is a tagged array, with heterogeneous elements: a bytestring, a map, another bytestring and another bytestring.
However, the library only allows decoding lists in which all elements have the same type, not arbitrary arrays. That'd be a nice feature to add.
Correct. As I only need specific indices of the array, an index function would work in this case too, but perhaps is a less general fix. I suspect I will have a related ask for dictionaries once I get through the top level of the data.
Hey, I had a quick stab at it this morning. Looking at the EU Digital Green Certificates, we can see that the outer-most structure is a tagged COSE envelope. Using the new primitives introduced in #2, you should be able to decode it as such:

    type alias CoseEnvelope =
        { protected : Bytes
        , unprotected : ()
        , payload : Bytes
        , signature : Bytes
        }


    decoder =
        D.tagged (Tag.Unknown 18) <|
            D.array <|
                D.map4 CoseEnvelope
                    D.bytes
                    (D.record <| D.succeed ())
                    D.bytes
                    D.bytes

(Note that the unprotected field really is an empty map in the specs.) The payload and protected fields are themselves CBOR-encoded structures, which can also be decoded. Here you could use D.bytes |> D.andThen ... if you wanted to do it in one go; I'll see about also providing a nice primitive for that, like 'nested' or something similar.
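
For illustration, a 'nested'-style helper could be sketched from the calls already used in this thread. The name nested is hypothetical and is not a function of the library; the sketch also assumes that D.decode returns a Maybe and that D.fail takes no argument, as in the code posted earlier.

```elm
module NestedSketch exposing (nested)

import Cbor.Decode as D


{-| Hypothetical helper: read a CBOR bytestring, then run another
decoder on its content (CBOR-in-CBOR). Fails if the inner bytes
do not match the inner decoder.
-}
nested : D.Decoder a -> D.Decoder a
nested inner =
    D.bytes
        |> D.andThen
            (\bytes ->
                -- D.decode is assumed to return Nothing on failure
                case D.decode inner bytes of
                    Just value ->
                        D.succeed value

                    Nothing ->
                        D.fail
            )
```

Replacing the D.bytes used for the payload field with `nested payloadDecoder` (where payloadDecoder stands for whatever decoder matches the payload) would then yield the decoded payload directly instead of raw Bytes.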
If you're interested, I also found some nice test data in the official repository. For example, the second QR code, once base45-decoded and inflated, gives you the following encoded COSE/CBOR bytestring: https://github.com/eu-digital-green-certificates/dgc-testdata/blob/main/FR/2DCode/raw/DCC_Test_0002.json#L27
Let me know if #2 helps. I'll take the time to make it a proper release sometime this week.
Sweeet - here's the final result

    type alias CoseEnvelope =
        { protected : Bytes
        , unprotected : ()
        , payload : Bytes
        , signature : Bytes
        }


    decCoseEnvelope =
        CD.tagged (CTag.Unknown 18) <|
            CD.array <|
                CD.map4 CoseEnvelope
                    CD.bytes
                    (CD.record <| CD.succeed ())
                    CD.bytes
                    CD.bytes


    type alias GreenPass =
        { country : String
        , d1 : Int
        , d2 : Int
        , passData : PassData
        }


    decGreenPass =
        CD.record <|
            CD.map4 GreenPass
                (CD.pair CD.int CD.string |> CD.map Tuple.second)
                (CD.pair CD.int CD.int |> CD.map Tuple.second)
                (CD.pair CD.int CD.int |> CD.map Tuple.second)
                (CD.pair CD.int decodePassOuter |> CD.map Tuple.second)


    decodePassOuter : CD.Decoder PassData
    decodePassOuter =
        CD.record <|
            CD.map Tuple.second <|
                CD.pair CD.int decodePass


    type alias PassData =
        { vaccine : List Vaccine
        , dob : String
        , user : User
        }


    decodePass =
        CD.record <|
            CD.map3 (\( _, v ) dob ( _, u ) -> PassData v dob u)
                (CD.pair CD.string <| CD.list decodeVaccine)
                ds
                (CD.pair CD.string decodeUser)


    type alias Vaccine =
        { dose : Int
        , make : String
        }


    decodeVaccine : CD.Decoder Vaccine
    decodeVaccine =
        let
            dec1 =
                CD.map3 (\ci co dn -> ( co, dn )) ds ds di

            dec2 =
                CD.map3 (\dt is ma -> ()) ds ds ds

            dec3 =
                CD.map4 (\mp sd tg vp -> ()) ds di ds ds
        in
        CD.record <|
            CD.map3 (\( co, dn ) _ _ -> Vaccine dn co) dec1 dec2 dec3


    type alias User =
        { given : String
        , family : String
        }


    decodeUser : CD.Decoder User
    decodeUser =
        CD.record <|
            CD.map4 (\fn gn _ _ -> User gn fn) ds ds ds ds


    ds =
        CD.pair CD.string CD.string |> CD.map Tuple.second


    di =
        CD.pair CD.string CD.int |> CD.map Tuple.second

I believe this is now fixed in https://package.elm-lang.org/packages/elm-toulouse/cbor/1.1.0
See the CHANGELOG on: https://github.com/elm-toulouse/cbor/releases/tag/1.1.0
Thanks for the feedback, it's heartwarming to see that this was useful to someone :pray: Have a great one!
And thank you very much as well.
Hi, I was excited to find your library as I investigate the EU covid passport, as it uses CBOR. However, I am unable to find a way to extract the data I wanted. Here is the code for node-cbor that does work:
The issue (or at least my first one) is that result.value looks like
Note that it is not homogeneous. In practice, I want index 2, but there is neither an index nor a oneOf decoder that I could use. Can you see a way to do this?