tanbro / pyyaml-include

yaml include other yaml
https://pypi.org/project/pyyaml-include/
GNU General Public License v3.0

Relative include: relative to the current file? #23

Closed ba1dr closed 1 year ago

ba1dr commented 3 years ago

I think (IMHO) it would make more sense if a relative include searched for the file in the current file's directory rather than in the current working directory.

Use case: when I pass config files on the command line, they may live in different directories. If they are independent, each could load its own local extensions.

To avoid breaking backward compatibility, the solution could be, for example, to add a new parameter to the constructor, or to initialize base_dir=None instead of the default empty string.

EDIT: perhaps it is not possible to get information about the current file from the node object passed to the constructor? But since we can redefine the reader, I see that the yaml.Reader class can handle stream names for file-like objects.
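
A minimal sketch of the idea in the EDIT above, assuming PyYAML's pure-Python loaders: the Reader stores the stream's name on the loader as `loader.name` (with placeholders such as `<unicode string>` when the input is a string), and tag constructors receive the loader as their first argument. `RelativeLoader`, `include_relative`, and `load_file` are hypothetical names for illustration, not pyyaml-include's API.

```python
import pathlib

import yaml


class RelativeLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the stock SafeLoader is left untouched."""


def include_relative(loader, node):
    # Hypothetical constructor: resolve the included path against the
    # directory of the file currently being parsed (taken from loader.name).
    target = pathlib.Path(loader.construct_scalar(node))
    current = pathlib.Path(str(loader.name))
    if not target.is_absolute():
        if not current.is_file():
            # Input was a string or an unnamed stream: nothing to anchor to.
            raise yaml.constructor.ConstructorError(
                None, None,
                "cannot resolve a relative include without a file name",
                node.start_mark,
            )
        target = current.parent / target
    return load_file(target)


def load_file(path):
    # Each nested load creates a new loader whose name is the nested file.
    with open(path, "r", encoding="utf-8") as stream:
        return yaml.load(stream, Loader=RelativeLoader)


RelativeLoader.add_constructor("!include", include_relative)
```

With this sketch, an include written inside sub/child.yml is resolved against sub/, because every nested load gets its own loader and therefore its own name.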

tanbro commented 3 years ago

@ba1dr Thanks for your advice!

Yes, it's nice and natural to parse the relative file from where the current YAML is.

But I cannot get information about the current file, as you wrote.

For the case of config files in different directories, I think the current API can't handle that elegantly, but using absolute paths is workable.

tanbro commented 3 years ago

jinjyaml is a Jinja2 template engine integration for PyYAML.

We can include files with Jinja2's include statement:

Suppose we have the YAML below:

parent: !j2 |
    {% include "child-1.yml" %}
    {% include "child-2.yml" %}

then execute:

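import jinja2
import jinjyaml
import yaml

# your_base_dir: directory that contains child-1.yml and child-2.yml
j2_env = jinja2.Environment(
    loader=jinja2.FileSystemLoader(searchpath=your_base_dir)
)

# Register the !j2 tag with PyYAML
j2_ctor = jinjyaml.Constructor()
yaml.add_constructor('!j2', j2_ctor)

# yaml_string: the YAML text shown above
doc = yaml.full_load(yaml_string)

# Render the Jinja2 templates embedded in the loaded document
data = jinjyaml.extract(doc, env=j2_env)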

Jinja2's FileSystemLoader would load child-1.yml and child-2.yml relative to its search path.

We can even write a custom Jinja2 file loader for a particular purpose.

ba1dr commented 3 years ago

Hmm, no, I think Jinja2 would be overkill. If I were using a template engine, I'd rather use a Python config file generated with Jinja2 than YAML. Even with this include feature, I am not sure it is a good idea to use it, as it breaks compatibility with other languages or scripts that do not support this tag.

tanbro commented 3 years ago

Perhaps a JSON Pointer for YAML (if there is one) would be a better fit for this case.

1ace commented 2 years ago

In case anyone's interested, my current workaround is this:

import contextlib
import os
import pathlib

import yaml
from yamlinclude import YamlIncludeConstructor

YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.SafeLoader)

@contextlib.contextmanager
def working_directory(path: pathlib.Path):
    prev_cwd = pathlib.Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(prev_cwd)

def load_config_file(file_path: pathlib.Path):
    with working_directory(file_path.parent):
        with file_path.open("r") as config_file:
            return yaml.safe_load(config_file)

The obvious limitation is that any second-level or deeper include is resolved relative to the first file, not to any intermediate file, but luckily that's good enough for us right now :slightly_smiling_face:
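
To make that limitation concrete, here is a stdlib-only sketch. It assumes includes are resolved against the process working directory, as described earlier in this thread; `resolve_include` and the directory layout are hypothetical stand-ins, and no YAML is parsed.

```python
import contextlib
import os
import pathlib
import tempfile


@contextlib.contextmanager
def working_directory(path):
    prev_cwd = pathlib.Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(prev_cwd)


def resolve_include(name):
    # Mimics a cwd-based include: relative names resolve against os.getcwd().
    return pathlib.Path(name).resolve()


root = pathlib.Path(tempfile.mkdtemp()).resolve()
(root / "conf" / "sub").mkdir(parents=True)

with working_directory(root / "conf"):
    # Include written in conf/main.yml as "sub/child.yml".
    first_level = resolve_include("sub/child.yml")
    # Include written in conf/sub/child.yml as "leaf.yml": it still resolves
    # against conf/, not conf/sub/.
    second_level = resolve_include("leaf.yml")
```

Note also that os.chdir changes the working directory for the whole process, so this workaround is not safe if other threads rely on relative paths at the same time.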

idantene commented 2 years ago

Quoting @tanbro: "But I cannot get information about the current file, as you wrote."

Actually @tanbro, you can! :) One just has to update base_dir along the way: extract the file name from the stream, then patch yaml.load specifically.

EDIT: Updated the snippet, this is what we now use internally in an __init__.py.

import yaml
from yamlinclude import YamlIncludeConstructor

YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.FullLoader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.SafeLoader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.Loader)
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.BaseLoader)

include_tag = YamlIncludeConstructor.DEFAULT_TAG_NAME
yaml_load = yaml.load  # Save original load function

def load_yaml(stream, Loader):
    from pathlib import Path
    if include_tag not in Loader.yaml_constructors:
        return yaml_load(stream, Loader=Loader)
    # Point base_dir at the directory of the file being loaded
    path = Path(stream.name)
    ctor = Loader.yaml_constructors[include_tag]
    previous_base = ctor.base_dir
    ctor.base_dir = path.parent.as_posix()
    try:
        return yaml_load(stream, Loader=Loader)
    finally:
        # Restore base_dir even if loading fails
        ctor.base_dir = previous_base

yaml.load = load_yaml  # Use new one

del YamlIncludeConstructor
del yaml

idantene commented 2 years ago

The above would fail on strings (e.g. when used as yaml.load(f.read()), or with some local definitions). One can add an isinstance(stream, io.TextIOWrapper) check for validation as needed.

EDIT: Like so:

import yaml

yaml_load = yaml.load  # Save original load function

def load_yaml(stream, Loader):
    from pathlib import Path
    from yamlinclude import YamlIncludeConstructor
    from io import TextIOWrapper
    tag = YamlIncludeConstructor.DEFAULT_TAG_NAME
    if tag not in Loader.yaml_constructors or not isinstance(stream, TextIOWrapper):
        # Either the tag is not registered or the stream has no known file location,
        # so nothing can be assumed about where relative includes live
        return yaml_load(stream, Loader=Loader)
    ctor = Loader.yaml_constructors[tag]
    previous_base = ctor.base_dir
    ctor.base_dir = Path(stream.name).parent.as_posix()
    try:
        return yaml_load(stream, Loader=Loader)
    finally:
        # Restore base_dir even if loading fails
        ctor.base_dir = previous_base

yaml.load = load_yaml  # Use new one

@ba1dr @1ace You might be interested ^
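
The isinstance check above only recognises text-mode file objects. A slightly broader, stdlib-only sketch inspects the stream's name attribute instead, so files opened in binary mode are covered and pseudo-names such as `<stdin>` are rejected. `stream_base_dir` is a hypothetical helper, not part of PyYAML or pyyaml-include.

```python
import os
import pathlib


def stream_base_dir(stream):
    """Return the directory of the file behind `stream`, or None if unknown."""
    name = getattr(stream, "name", None)
    # Plain strings and in-memory buffers have no name; some streams report
    # an integer file descriptor instead of a path.
    if not isinstance(name, (str, os.PathLike)):
        return None
    path = pathlib.Path(name)
    # Pseudo-names such as "<stdin>" do not point at a real file.
    if not path.is_file():
        return None
    return path.resolve().parent
```

A patched loader could call this helper and fall back to the original yaml.load whenever it returns None.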

tanbro commented 1 year ago

#26 provides a way to include files relatively.