Dynamic document mapping

pkese commented 5 years ago

I'm trying to parse some real world manually written documents with some type inconsistencies:

{
  a_func: ["Math.negate"]
  b_func: ["Math.divide", 1000]
  c_func: "Math.abs"
}

In this case there are just one or two fields like this in otherwise very large documents, so Legivel technically gets me to 99% of my task except for this specific case.

Would it be somehow possible to tell Legivel to map such values into some sort of parsed abstract document, like:

type MyDoc = Map<string, Legivel.RepresentationGraph.Node>

Or optionally let users provide custom constructors:

type MySubDoc =
  | SomeDiscriminatedUnions
  | ...
with
  [<LegivelCustomMapper>]
  static member myCustomMapper : Node -> InstanceOfMySubDoc

type MyDoc = Map<string, MySubDoc>

Sorry for bugging you so much. It seems I'm trying to push Legivel into places it wasn't intended for. I've been looking at other Yaml libraries for dotnet, but none have proper support for F# discriminated unions.

fjoppe commented 5 years ago

Legivel Mapper currently does not support mapping to Node, which is the output from the parser. However, I did intend for people to extend or customize the Mapper for their own purposes. I'm also writing something for Raml, to test this customizability. I see points for improvement.

But you can start anyway. You can have a look in this source, and check for BuildInTryFindMappers and how it is used. If you manage to build your own mapper, you won't require "Node" in your contract.

pkese commented 5 years ago

Aha, got it:
I have to add my custom mapper to the list akin to BuildInTryFindMappers and initialize my parser with CustomDeserializeYaml using that list.
I think it is quite clear.

The thing that I'm not quite familiar with is how to invoke my particular mapping for my type. Let's say that for the above case (a_func, b_func, c_func) I'd model that with:

type InvokeFunc = { funcName: string; argument: string option }

So my type then is a Record and I should then somehow push that into RecordMappingInfo.TryFindMapper (e.g. wrap that one with my own function)
or
add another MyCustomRecordMappingInfo.TryFindMapper and just check if the record type name is InvokeFunc and construct a mapper for that?
If I add my mapper to the beginning of the list (before RecordMappingInfo.TryFindMapper) does my custom mapper get to choose first and override the built-in?

fjoppe commented 5 years ago

I was thinking of creating a custom type ParamList which inherits from List<string>.

You can then define your target type like:

type MyTarget = {
  a_func: ParamList
  b_func: ParamList
  c_func: ParamList
}

The Custom Mapper can detect whether it is type ParamList and return a mapper. During mapping, you can accept both scalars and sequence nodes to fill your target object.
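
A rough sketch of the "accept both scalars and sequences" idea follows. It is only an illustration: SimpleNode is a hypothetical stand-in for the parser output, not Legivel's actual Node type, and ParamList is modelled here as a plain record rather than a List<string> subclass.

```fsharp
// Hypothetical stand-in for the parser output; NOT Legivel's real Node type.
type SimpleNode =
    | Scalar of string
    | Sequence of SimpleNode list

// Hypothetical target: a function name plus its (string) arguments.
type ParamList = { funcName: string; args: string list }

// Accept either a bare scalar ("Math.abs") or a sequence (["Math.divide", 1000]).
let tryMapParamList (node: SimpleNode) : ParamList option =
    match node with
    | Scalar name -> Some { funcName = name; args = [] }
    | Sequence (Scalar name :: rest) ->
        // Keep scalar arguments only; a nested sequence makes the mapping fail.
        let args = rest |> List.choose (function Scalar s -> Some s | Sequence _ -> None)
        if List.length args = List.length rest
        then Some { funcName = name; args = args }
        else None
    | Sequence _ -> None
```

With this shape, c_func: "Math.abs" and a_func: ["Math.negate"] both end up as a ParamList with an empty argument list.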

I think in this case it would be wise to put your custom mapper first, to not confuse it with the generic types.
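
Why the order matters can be sketched like this, assuming the list of mappers is searched front to back and the first match is used. The Mapper and TryFindMapper types below are simplified, hypothetical models and not the real Legivel signatures.

```fsharp
// Simplified, hypothetical model of a "try find mapper" chain (not Legivel's API).
type Mapper = { name: string }
type TryFindMapper = System.Type -> Mapper option

// Custom finder: only claims one specific type.
let customFinder : TryFindMapper =
    fun t -> if t = typeof<string list> then Some { name = "custom" } else None

// Generic finder: claims every generic type, including string list.
let genericFinder : TryFindMapper =
    fun t -> if t.IsGenericType then Some { name = "generic" } else None

// The first finder in the list that returns Some wins.
let findMapper (finders: TryFindMapper list) (t: System.Type) : Mapper option =
    finders |> List.tryPick (fun finder -> finder t)

let customWins = findMapper [ customFinder; genericFinder ] typeof<string list>
let genericWins = findMapper [ genericFinder; customFinder ] typeof<string list>
```

Here customWins holds the custom mapper and genericWins the generic one, although both lists contain the same two finders.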

Maybe you only need to re-create this function.

There is one "but" in this process. I've been hiding quite a lot of data and funcs to improve dev-experience (including my own), and I've found that I've been a bit too enthusiastic doing this...

cmeeren commented 3 years ago

I too would love the ability to simply use Node or similar in my model, like JsonElement from System.Text.Json, and be able to fully deserialize it later during runtime when I know the target type (which depends on data from other sources not available at compile-time or even when first deserializing).
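
For reference, the JsonElement pattern meant here looks roughly like the sketch below. It uses System.Text.Json only, not Legivel; the JSON shape and the Circle type are made up for illustration, and deserializing into an F# record assumes .NET 5 or later.

```fsharp
open System.Text.Json

// Hypothetical target type, only known once 'kind' has been inspected.
type Circle = { radius: float }

let json = """{ "kind": "circle", "payload": { "radius": 2.5 } }"""

// First pass: read what is known up front and keep the payload untyped.
let doc = JsonDocument.Parse(json)
let kind = doc.RootElement.GetProperty("kind").GetString()
let payload : JsonElement = doc.RootElement.GetProperty("payload").Clone()

// Second pass, later at runtime: pick the target type and deserialize the payload.
let circle =
    match kind with
    | "circle" -> JsonSerializer.Deserialize<Circle>(payload.GetRawText())
    | other -> failwithf "unknown kind: %s" other
```

The equivalent for Legivel would be a model field that keeps the parsed node as-is until the target type is known.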

fjoppe commented 3 years ago

I hope I understand your question correctly. The Yaml Parser provides functionality to convert yaml text to a generic internal representation. The Yaml Mapper maps the generic format to a specific Type.

So if you want to intervene between parser and mapper, you must look at the mapper entry function here, with special attention to ParseYamlToNative. This is the intervention point.

Function ParseYamlToNative converts yaml text to generic native here.

You should be able to create your own custom implementation with this info - assuming I understand your question correctly.

cmeeren commented 3 years ago

Thanks. Unfortunately I am unable to divine those implementation details (or at least undocumented parts of the API) from the source code, and I don't see how I can adapt my code to support this.

In any case, deserializing directly to a dynamic Node or something à la JsonElement in System.Text.Json would be a much better experience than having to split up the inner "pipeline" of Legivel and sew it back together and hope I'm doing it right. :)

Another alternative would be to just convert from YAML to JSON and use System.Text.Json.

deviousasti commented 3 years ago

In my case, the need is even simpler. The values are generally float, but some of the values don't have a decimal point. This causes the parsing to fail saying int where float was expected, which is rather strict. I can't set it to a string either, because then it would fail for int and float.

Some kind of intermediate type would be great.

deviousasti commented 3 years ago

I tried adding ScalarToNativeMappings to add conversions from int and float to decimal, but it looks like the mapper is given whatever type was inferred, and if that doesn't match either one it throws. It looks like it's not possible to have a many-to-one mapping by design.
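
A many-to-one conversion of this kind could look roughly like the sketch below. It works on the raw scalar text and uses only the .NET base library; tryScalarToFloat is a hypothetical helper, and nothing here touches Legivel's ScalarToNativeMappings.

```fsharp
open System
open System.Globalization

// Parse a raw scalar such as "3" or "3.5" into a float,
// whether or not it was written with a decimal point.
let tryScalarToFloat (raw: string) : float option =
    match Double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, value -> Some value
    | false, _ -> None
```

Using the invariant culture keeps "3.5" from being misread on machines whose locale uses a comma as the decimal separator.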

This sort of strikes me as odd for a yaml library, because yaml by design is supposed to be written by humans, and humans aren't strict about conforming exactly to types.

fjoppe / Legivel

Dynamic document mapping #18