SimplyKnownAsG / yamlize

Python YAML serializing library
Apache License 2.0

How to include the local data type specifier? #18

Open sphh opened 1 year ago

sphh commented 1 year ago

Is it possible to add a local data type specifier to the dumped YAML string?

Background:

If I use ruamel.yaml for a simple class, I get a local data type specifier in the YAML string:

import sys
from ruamel.yaml import YAML

class A:
    def __init__(self):
        self.a = 0

yaml = YAML()
yaml.register_class(A)

a = A()
yaml.dump(a, sys.stdout)

which prints

!A
a: 0

Note the local data type specifier !A.

But if I derive class A from yamlize.Object, this local data type specifier is missing:

import sys
from ruamel.yaml import YAML
import yamlize

class A(yamlize.Object):
    a = yamlize.Attribute()
    def __init__(self):
        self.a = 0

yaml = YAML()
yaml.register_class(A)

a = A()
yaml.dump(a, sys.stdout)

I get

a: 0

Note that there is no local data type specifier !A included.

The same happens when I use A.dump(a). I know the specifier is not needed when loading with A.load('a: 0'), but sometimes you still want it included because you do not know the class in advance. Hence my question: how do I dump a class derived from yamlize.Object with the local data type specifier included in the YAML string?

SimplyKnownAsG commented 1 year ago

Very nice MWE, thanks for that! Offhand, I'm not entirely sure. I'm also not sure it is in line with the intended use cases. Can you explain the use case for which this would be necessary?

The MWE portrays this as possibly a root node. The subclassing example in the README shows a way to determine the class of an object based on the value of an attribute. Could this potentially work for your use case?
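As a rough sketch of that idea only: each mapping carries a discriminating attribute, and the class is picked from its value on load. The code below does not use yamlize's own subclassing support (the README covers that); it works on plain dicts, such as a generic YAML loader would return, and the names `REGISTRY`, `register` and `from_mapping` are hypothetical.

```python
# Hypothetical sketch: choose a class from the value of a "type" key.
# Plain dicts stand in for mappings returned by a YAML loader.

REGISTRY = {}


def register(cls):
    """Class decorator that records cls under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls


@register
class A:
    def __init__(self, aa=0):
        self.aa = aa


@register
class B:
    def __init__(self, a=None):
        self.a = a


def from_mapping(mapping):
    """Build an instance of the class named by mapping['type']."""
    fields = dict(mapping)              # copy, so the input is left untouched
    cls = REGISTRY[fields.pop("type")]  # KeyError if the type is unknown
    return cls(**fields)


obj = from_mapping({"type": "A", "aa": 3})
print(type(obj).__name__, obj.aa)
```

The cost of this design is that every serialized mapping gains an extra key, which matters if the same format is also used for hand-edited files.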

sphh commented 1 year ago

Let me try to explain it with an MWE:

import sys
from ruamel.yaml import YAML
import yamlize

class A(yamlize.Object):
    aa = yamlize.Attribute()
    def __init__(self):
        self.aa = 0

class B(yamlize.Object):
    a = yamlize.Attribute(type=A)
    def __init__(self, a):
        self.a = a

yaml = YAML()
yaml.register_class(A)
yaml.register_class(B)

a = A()
b = B(a)
yaml.dump(b, sys.stdout)

which prints

a:
  aa: 0

I can load it with B.load('...'), which is, as I understand it, the intended use case for the yamlize module.

In my case, I want to be able to copy both a and b to the clipboard and then paste them. For serialization I want to use YAML. To copy a and b, I would use

yaml.dump([a, b], sys.stdout)

and copy

- &id001
  aa: 0
- a: *id001

to the clipboard.

When pasting, I do not know which class each entry originally was, so I would need the local data type specifier when loading that YAML string from the clipboard.

I could use a dictionary like

yaml.dump({'A': [a], 'B': [b]}, sys.stdout)

which prints

A:
- &id001
  aa: 0
B:
- a: *id001

but I failed to find a way to load this with the references kept intact …
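One possible way to keep the references intact with this dictionary layout is sketched below. It assumes that the YAML loader hands back the same Python dict object for an anchor (`&id001`) and for each of its aliases (`*id001`), which should be checked for the loader in use. The nested dict built by hand stands in for the loaded document, and `build`, `load_groups`, `CLASSES` and `FIELD_TYPES` are hypothetical helpers, not part of yamlize or ruamel.yaml.

```python
# Hypothetical sketch: rebuild objects from plain mappings while keeping
# shared references shared. Memoising on id() of each source dict means a
# mapping that appears twice (anchor and alias) yields a single object.


class A:
    def __init__(self, aa=0):
        self.aa = aa


class B:
    def __init__(self, a=None):
        self.a = a


CLASSES = {"A": A, "B": B}
# Which fields hold nested objects, and of which class (assumed layout).
FIELD_TYPES = {"B": {"a": "A"}}


def build(name, mapping, memo):
    """Return the object for mapping, reusing one that was built earlier."""
    key = id(mapping)
    if key not in memo:
        nested = FIELD_TYPES.get(name, {})
        kwargs = {
            field: build(nested[field], value, memo) if field in nested else value
            for field, value in mapping.items()
        }
        memo[key] = CLASSES[name](**kwargs)
    return memo[key]


def load_groups(document):
    """Turn {'A': [...], 'B': [...]} into lists of instances."""
    memo = {}
    return {
        name: [build(name, item, memo) for item in items]
        for name, items in document.items()
    }


shared = {"aa": 0}                        # plays the role of &id001
document = {"A": [shared], "B": [{"a": shared}]}
objects = load_groups(document)
print(objects["B"][0].a is objects["A"][0])
```

The identity check at the end is true because both occurrences of `shared` map to one memo entry. The sketch does not handle cyclic references, and the memo is only valid while the source document is kept alive.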

sphh commented 1 year ago

Hm, if I followed your suggestion of using subclassing, I would have to add a type = yamlize.Attribute(type=str) to all classes (without a default, but setting this attribute during initialization) and write some sort of yamlize.Object master class that is responsible for loading. This would probably work.

But my second use of the YAML serialization is to save the project in a file. There I know exactly the structure (yamlize.Objects as yamlize.Attributes in other yamlize.Objects) and the additional type attribute would ‘pollute’ the project's yaml file. The type attribute could also contradict the type based on the element's position in the file (if a mistake is made when that file is edited or generated externally). Hence I also do not want the local data type specifier to be included in that file. Hence my dual approach:

Maybe I am overcomplicating this and there is an easier way to accomplish it?

sphh commented 1 year ago

@SimplyKnownAsG: I wonder whether you have been able to get your head around my use case and whether you have any insights into how to solve it. I still believe including a local tag specifier might be the easiest and cleanest solution. Do you have any other ideas?