Open nstarman opened 1 year ago
Thanks for opening the issue.
It's possible to add support for tuples via the extension interface. Here's a small example that could be fleshed out for your application.
import asdf


class CustomConverter(asdf.extension.Converter):
    # Python types handled by this converter and the tag written to the file.
    types = [tuple]
    tags = ["http://example.com/tags/tuple-1.0.0"]

    def to_yaml_tree(self, obj, tag, ctx):
        # Serialize the tuple as a (tagged) YAML sequence.
        return list(obj)

    def from_yaml_tree(self, node, tag, ctx):
        # Rebuild the tuple from the tagged sequence on read.
        return tuple(node)


class CustomExtension:
    extension_uri = "http://example.com/extensions/test-1.0.0"
    converters = [CustomConverter()]
    tags = CustomConverter.tags


# The extension is only registered inside this config context.
with asdf.config_context() as config:
    config.add_extension(CustomExtension())

    value = (1, 2, 3)
    fn = 'test.asdf'
    asdf.AsdfFile({'value': value}).write_to(fn)

    with asdf.open(fn) as af:
        rt_value = af['value']

    print("====== Comparison ======")
    print(f"Equality: {value == rt_value}")
    print(f"Type: {type(value) == type(rt_value)}")
    print("========================")

    # Dump the raw file contents to show the tagged sequence.
    with open(fn) as f:
        print(f.read())
Running this on my system (with the current asdf main, but this should work with older versions) produces:
====== Comparison ======
Equality: True
Type: True
========================
#ASDF 1.0.0
#ASDF_STANDARD 1.5.0
%YAML 1.1
%TAG ! tag:stsci.edu:asdf/
--- !core/asdf-1.1.0
asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
  name: asdf, version: 3.0.0.dev339+ga0778f30.d20230712}
history:
  extensions:
  - !core/extension_metadata-1.0.0
    extension_class: asdf.extension._manifest.ManifestExtension
    extension_uri: asdf://asdf-format.org/core/extensions/core-1.5.0
    software: !core/software-1.0.0 {name: asdf, version: 3.0.0.dev339+ga0778f30.d20230712}
  - !core/extension_metadata-1.0.0 {extension_class: __main__.CustomExtension, extension_uri: 'http://example.com/extensions/test-1.0.0'}
value: !<http://example.com/tags/tuple-1.0.0> [1, 2, 3]
...
Note that the tuple is tagged with a custom tag. The need for this stems partly from the YAML standard and partly from the ASDF standard.
The closest structure to a tuple in YAML is a sequence. By default, sequences are loaded as lists (by pyyaml, the library asdf uses), and lists are written as sequences. To produce YAML that is closer to the standard, asdf builds on SafeLoader and SafeDumper from pyyaml (see asdf.yamlutil for more details). The SafeDumper automatically converts tuples to sequences:
>>> yaml.dump((1, 2, 3), Dumper=yaml.SafeDumper)
'- 1\n- 2\n- 3\n'
This seems like a convenience for the user (rather than raising a RepresenterError when a tuple is encountered). However, as you've pointed out, this means that tuples do not round-trip (even when using pyyaml directly).
>>> yaml.load(yaml.dump((1, 2, 3), Dumper=yaml.SafeDumper), yaml.SafeLoader)
[1, 2, 3]
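For contrast, here is a small sketch of how pyyaml's non-safe Dumper treats the same tuple. It writes a Python-specific `!!python/tuple` tag, which is not part of standard YAML and which the safe loader does not accept. The variable names are only for illustration, and this is not something asdf itself does.

```python
import yaml

value = (1, 2, 3)

# SafeDumper writes the tuple as a plain sequence, so it loads back as a list.
safe_text = yaml.dump(value, Dumper=yaml.SafeDumper)
safe_loaded = yaml.load(safe_text, Loader=yaml.SafeLoader)

# The non-safe Dumper adds a Python-specific tag that yaml.Loader understands.
# yaml.Loader can construct arbitrary Python objects: trusted input only.
full_text = yaml.dump(value, Dumper=yaml.Dumper)
full_loaded = yaml.load(full_text, Loader=yaml.Loader)

print(type(safe_loaded).__name__)  # list
print(full_text.splitlines()[0])   # !!python/tuple
print(type(full_loaded).__name__)  # tuple
```

The tuple survives only because the type information travels in a language-specific tag, which is the kind of non-standard output that the safe classes avoid.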
The asdf library could raise this (or some other) exception when a tuple is encountered, or define a custom tag that maps tuples to a tagged YAML sequence (as is done in the example above). Because asdf focuses on supporting the standard (and leaves non-standard tags to extensions), adding a custom tag would involve updating the ASDF standard to add a new tag for tuples/immutable sequences. This is already done for things like complex.
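To make the "custom tag" option concrete at the pyyaml level, here is a minimal sketch that subclasses the safe dumper and loader. The `!tuple` tag and the `TupleDumper`/`TupleLoader` names are hypothetical, chosen only for this illustration; this is not how asdf implements tags, and no such tag exists in the ASDF standard.

```python
import yaml

TUPLE_TAG = "!tuple"  # hypothetical tag, for illustration only


class TupleDumper(yaml.SafeDumper):
    """SafeDumper subclass, so registrations do not leak into SafeDumper."""


class TupleLoader(yaml.SafeLoader):
    """SafeLoader subclass that also understands TUPLE_TAG."""


def represent_tuple(dumper, data):
    # Write the tuple as a sequence carrying the custom tag.
    return dumper.represent_sequence(TUPLE_TAG, data)


def construct_tuple(loader, node):
    # Build the tagged sequence (including nested items) as a tuple.
    return tuple(loader.construct_sequence(node, deep=True))


TupleDumper.add_representer(tuple, represent_tuple)
TupleLoader.add_constructor(TUPLE_TAG, construct_tuple)

tree = {"value": (1, 2, 3)}
text = yaml.dump(tree, Dumper=TupleDumper)
loaded = yaml.load(text, Loader=TupleLoader)

print(text)
print(loaded == tree, type(loaded["value"]).__name__)
```

The other option mentioned above, raising instead of silently converting, could be sketched the same way by registering a representer for `tuple` that raises `yaml.representer.RepresenterError`.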
I'm not seeing any documentation for this behavior; perhaps it would fit where the documentation describes the Data Model. I'll take a stab at adding a note about tuple handling.
If I save a dictionary with tuple values, the values are loaded as lists. Tuples are immutable, among other useful properties. It would be great if data structures could losslessly round-trip through the ASDF format.