tanbro / pyyaml-include

yaml include other yaml
https://pypi.org/project/pyyaml-include/
GNU General Public License v3.0

Recursive includes do not work with nested directories #9

Closed sam-s closed 4 years ago

sam-s commented 4 years ago

Here is the test case:

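import os
import unittest

import yaml
from yamlinclude import YamlIncludeConstructor


class RecursiveIncludeTest(unittest.TestCase):
    def test_nested_include(self):
        # Build a three-level chain: foo.yml -> foo/bar.yml -> foo/bar/zot.yml
        os.makedirs("foo/bar")
        with open("foo.yml", "w") as fd:
            fd.write("!include foo/bar.yml\n")
        with open("foo/bar.yml", "w") as fd:
            # This path is relative to foo/, not to the working directory.
            fd.write("!include bar/zot.yml\n")
        with open("foo/bar/zot.yml", "w") as fd:
            fd.write("foo: 42\n")
        YamlIncludeConstructor.add_to_loader_class(yaml.FullLoader)
        with open("foo.yml") as fd:
            self.assertEqual({"foo": 42}, yaml.load(fd, Loader=yaml.FullLoader))
        # Clean up the temporary files and directories.
        os.remove("foo/bar/zot.yml")
        os.remove("foo/bar.yml")
        os.remove("foo.yml")
        os.removedirs("foo/bar")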

It would work if I wrote the full path foo/bar/zot.yml into foo/bar.yml, but that is counterproductive (what if bar.yml is included from another place or used separately?).

Would it be possible for include to take the path relative to the directory of the node that does the include? (cf. https://gist.github.com/joshbode/569627ced3076931b02f). Or maybe it is already possible? Thanks!

sam-s commented 4 years ago

Here is what works for me:

import json
import os

import yaml


class RecursiveLoader(yaml.FullLoader):  # pylint: disable=too-many-ancestors
    """YAML Loader with recursive `!include` constructor.
    https://gist.github.com/joshbode/569627ced3076931b02f
    https://stackoverflow.com/q/528281/850781"""

    def __init__(self, stream):
        # Resolve includes relative to the directory of the file being loaded.
        try:
            self._root = os.path.dirname(stream.name)
        except AttributeError:
            # Streams without a name (e.g. plain strings) fall back to the CWD.
            self._root = os.path.curdir
        super().__init__(stream)

    def include(self, node):
        """Include the file referenced at the node."""
        filename = os.path.join(self._root, self.construct_scalar(node))
        extension = os.path.splitext(filename)[1]
        with open(filename, 'r') as fd:
            if extension in ('.yaml', '.yml'):
                # A nested YAML file gets its own loader instance, so its
                # includes resolve relative to its own directory.
                return yaml.load(fd, Loader=self.__class__)
            if extension in ('.json', ):
                return json.load(fd)
            # Any other file type is returned as a list of lines.
            return fd.readlines()


yaml.add_constructor('!include', RecursiveLoader.include, RecursiveLoader)

Thanks!

tanbro commented 4 years ago

Recursive includes are complex ...

But if the problem is that the tag cannot resolve the pathname correctly, maybe an absolute pathname or a base_dir for the tag could solve it:

The constructor searches for included files with iglob, starting from the CWD.

And its constructor has an argument base_dir, which defaults to None.
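A minimal sketch of that lookup, using only the standard library. It is not the library's actual code: `find_included_files` and `demo` are hypothetical names, and the only assumption is that the include pattern is joined onto base_dir (or the CWD when base_dir is None) before being passed to iglob.

```python
import glob
import os
import tempfile


def find_included_files(pattern, base_dir=None):
    """Hypothetical helper: expand an include pattern with iglob.

    When base_dir is None the pattern is resolved against the current
    working directory, mirroring the default described above.
    """
    root = os.getcwd() if base_dir is None else base_dir
    full_pattern = os.path.join(root, pattern)
    # iglob yields matches lazily; sorted() gives a stable order.
    return sorted(glob.iglob(full_pattern))


def demo():
    """Create a throwaway config tree and search it via base_dir."""
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "conf"))
        for name in ("a.yml", "b.yml"):
            with open(os.path.join(tmp, "conf", name), "w") as fd:
                fd.write("foo: 42\n")
        matches = find_included_files("conf/*.yml", base_dir=tmp)
        # Report paths relative to the temporary root for readability.
        return [os.path.relpath(match, tmp) for match in matches]
```

Note that a single base_dir applies to every include, however deeply nested the including file is.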

Can this help?

Thanks!

sam-s commented 4 years ago

Thank you for your reply. I think every include should be taken relative to the including file, not the top-level file. An absolute path in YamlIncludeConstructor(base_dir=..) will not help.
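To make the difference concrete, here is a small sketch that only does the path arithmetic for the test case above (foo.yml includes foo/bar.yml, which includes bar/zot.yml). Both function names are hypothetical, and `posixpath` is used so that the results do not depend on the platform.

```python
import posixpath


def resolve_relative_to_includer(top_level_file, targets):
    """Follow a chain of nested includes.

    Each target is resolved against the directory of the file that names it.
    """
    current = top_level_file
    resolved = []
    for target in targets:
        current = posixpath.normpath(
            posixpath.join(posixpath.dirname(current), target)
        )
        resolved.append(current)
    return resolved


def resolve_relative_to_base(base_dir, targets):
    """Resolve every target against one fixed base directory."""
    return [posixpath.normpath(posixpath.join(base_dir, t)) for t in targets]


chain = ["foo/bar.yml", "bar/zot.yml"]
print(resolve_relative_to_includer("foo.yml", chain))
# ['foo/bar.yml', 'foo/bar/zot.yml']
print(resolve_relative_to_base(".", chain))
# ['foo/bar.yml', 'bar/zot.yml']
```

With a fixed base directory the second include resolves to bar/zot.yml, which does not exist in the test case.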

tanbro commented 4 years ago

Hmm, so each included file is relative to the file that includes it ... that is more like XML Pointer or $ref in the OpenAPI spec.

I also ran into a similar situation recently: separate YAML config files on S3.

I think https://tools.ietf.org/html/rfc6901 (JSON Pointer) describes something for that.
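For reference, a rough sketch of how an RFC 6901 JSON Pointer is evaluated against already-loaded data. `resolve_pointer` is a hypothetical helper, not part of pyyaml-include; it handles only dicts and lists and skips the stricter array-index checks in the RFC.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against nested dicts and lists."""
    if pointer == "":
        # The empty pointer refers to the whole document.
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape '~1' before '~0', in the order the RFC prescribes.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


config = {"db": {"hosts": ["a", "b"]}, "a/b": 1, "m~n": 2}
print(resolve_pointer(config, "/db/hosts/1"))  # b
print(resolve_pointer(config, "/a~1b"))        # 1
```

A pointer like this addresses a location inside one document; locating the document itself, relative to the including file, is still a separate step.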