marzer / tomlplusplus

Header-only TOML config file parser and serializer for C++17.
https://marzer.github.io/tomlplusplus/
MIT License
1.57k stars, 150 forks

Modify (or insert) value to a nested table using dotted paths #160

Closed: user-grinch closed this issue 2 years ago

user-grinch commented 2 years ago

Is your feature request related to a problem? Please describe.
I wanted to insert a value (std::string) directly into a nested table using at_path(). I tried the ref() method, but it fails with an assertion: table.at_path("header.test").ref<std::string>() = value;. Is there another way to do this?

Describe the solution you'd like
(If I haven't missed anything) An API to modify (or add, if non-existent) data in a nested table using dotted paths.

Additional context
N/A

marzer commented 2 years ago

There is no way to do this directly because of the polymorphic nature of TOML data and the problems that presents. As an example, given this TOML:

[a]
b = { c = 99.0 }

and this hypothetical C++ code:


auto tbl = toml::parse(/* the example toml above */);
tbl.magic_insert_at_path_func("a.b.c.d[2]", "klaatu barada nikto");

Stepping through it:

  1. a and b are fine, they're already toml::tables
  2. c exists too but it's a double in our original data. For the path to work it would need to be a toml::table... what do we do here? Return some sort of error value? Throw an exception? Clobber the existing double value with a toml::table?
  3. Assuming we chose to clobber c with a new table, the next path component d should be an array for the path to work. d doesn't exist at all so we just need to create a new toml::array inside c. Not too hard, but...
  4. d is now an empty array and the path specified element [2], what now? Do we default-initialize elements [0], [1] before creating [2] with the specified string value? If so, what are their types? What are their values? TOML doesn't have a null type, so we have to make them something.
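
Step 4 is perhaps the easiest of these decisions to see in isolation. The sketch below uses a toy element type built from std::variant (it is not a toml++ type) and a hypothetical helper, set_with_backfill, that makes the caller state the filler value explicitly instead of hiding that choice inside the library:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Toy stand-in for a TOML array element. This is NOT a toml++ type;
// it only exists to illustrate the backfill decision.
using element = std::variant<std::int64_t, double, std::string>;

// Hypothetical helper: writes `value` at `index`, growing the array if needed.
// Every gap element becomes a copy of `filler`, so the caller decides both the
// type and the value of the backfilled slots. An existing element at `index`
// is overwritten (clobbered).
void set_with_backfill(std::vector<element>& arr,
                       std::size_t index,
                       element value,
                       const element& filler)
{
    if (arr.size() <= index)
        arr.resize(index + 1, filler);

    arr[index] = std::move(value);
}
```

Calling set_with_backfill on an empty array with index 2 and a filler of std::int64_t{0} leaves three elements: two integer zeros followed by the new value. Another caller might reasonably want empty strings as filler, or an error instead of any backfill at all.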

Given the number of different ways the above problem(s) could be solved (what defaults to apply, what sort of error handling to use, et cetera), it's a problem best handled in user-code with a helper function to get the exact semantics you need. It does mean you need to do some string splitting on . characters, as well as a bit of recursion, but fortunately very soon I'll be merging in a feature adding a toml::path type which will greatly simplify this for you, and allows you to treat paths as first-class objects with component push/pop and such.
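
As a rough illustration of the string-splitting part, here is a minimal sketch using only the standard library. The names split_path and path_component are hypothetical and are not part of toml++ (nor are they the toml::path type mentioned above). It assumes bare keys separated by . and non-negative indices written as [N]; quoted keys and escape sequences are not handled, and empty components such as the one in "a..b" are silently skipped.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

// Hypothetical types/helpers for illustration only; not part of toml++.
// A component is either a table key or an array index.
using path_component = std::variant<std::string, std::size_t>;

std::vector<path_component> split_path(std::string_view path)
{
    std::vector<path_component> out;
    std::size_t pos = 0;

    while (pos < path.size())
    {
        if (path[pos] == '.')
        {
            ++pos; // separator between components
        }
        else if (path[pos] == '[')
        {
            // array index: every character up to the closing ']' must be a digit
            const auto close = path.find(']', pos);
            if (close == std::string_view::npos || close == pos + 1)
                throw std::invalid_argument("malformed array index in path");

            std::size_t index = 0;
            for (std::size_t i = pos + 1; i < close; ++i)
            {
                if (path[i] < '0' || path[i] > '9')
                    throw std::invalid_argument("array index must be a non-negative integer");
                index = index * 10 + static_cast<std::size_t>(path[i] - '0');
            }
            out.emplace_back(index);
            pos = close + 1;
        }
        else
        {
            // bare key: runs until the next '.' or '[' (or the end of the path)
            const auto end = path.find_first_of(".[", pos);
            const auto len = (end == std::string_view::npos ? path.size() : end) - pos;
            out.emplace_back(std::string(path.substr(pos, len)));
            pos += len;
        }
    }

    return out;
}
```

For the path from the example, split_path("a.b.c.d[2]") produces five components: the keys a, b, c and d, followed by the index 2. Walking that list, and deciding what to do at each conflict, is where the recursion comes in.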

(As an aside, ref<> is for getting C++ references to data that you already know exists and is of a matching type. Essentially it's a shortcut through the TOML value<> wrapper. You can't use it to add data that wasn't already there or to change the type of the data.)

If you need any more clarifications/help, you're welcome to ask on gitter :)

marzer commented 2 years ago

As an alternative solution in the meantime, I just prototyped this:

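#include <cstddef>
#include <type_traits>
#include <utility>

#include <toml++/toml.h> // adjust to match your toml++ install (e.g. <toml++/toml.hpp>)

template <typename T, typename Path>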
auto build_from_path(T&& value, Path&& path_component)
{
    using component_type = std::remove_cv_t<std::remove_reference_t<Path>>;
    static_assert(std::is_integral_v<component_type> || toml::is_key_or_convertible<Path&&>,
                  "path components must be integers or strings");

    // making an array
    if constexpr (std::is_integral_v<component_type>)
    {
        toml::array arr;
        const auto index = static_cast<std::size_t>(path_component);
        arr.reserve(index + 1u);

        // backfill with integers
        while (arr.size() < index)
            arr.push_back(0);

        // add the actual value
        arr.push_back(static_cast<T&&>(value));

        return arr;
    }

    // making a table
    else
    {
        toml::table tbl;

        tbl.insert_or_assign(static_cast<Path&&>(path_component), static_cast<T&&>(value));

        return tbl;
    }
}

template <typename T, typename Path, typename... Paths>
auto build_from_path(T&& value, Path&& path_component, Paths&&... path_components)
{
    static_assert(sizeof...(Paths) > 0, "this overload requires at least two path components");

    return build_from_path(build_from_path(static_cast<T&&>(value), static_cast<Paths&&>(path_components)...),
                           static_cast<Path&&>(path_component));
}

static void merge_left(toml::table& lhs, toml::table&& rhs);

static void merge_left(toml::array& lhs, toml::array&& rhs)
{
    rhs.for_each(
        [&](std::size_t index, auto&& rhs_val)
        {
            // rhs index not found in lhs - direct move
            if (lhs.size() <= index)
            {
                lhs.push_back(std::move(rhs_val));
                return;
            }

            // both elements were the same container type - recurse into them
            if constexpr (toml::is_container<decltype(rhs_val)>)
            {
                using rhs_type = std::remove_cv_t<std::remove_reference_t<decltype(rhs_val)>>;
                if (auto lhs_child = lhs[index].as<rhs_type>())
                {
                    merge_left(*lhs_child, std::move(rhs_val));
                    return;
                }
            }

            // replace lhs element with rhs
            lhs.replace(lhs.cbegin() + index, std::move(rhs_val));
        });
}

static void merge_left(toml::table& lhs, toml::table&& rhs)
{
    rhs.for_each(
        [&](const toml::key& rhs_key, auto&& rhs_val)
        {
            auto lhs_it = lhs.lower_bound(rhs_key);

            // rhs key not found in lhs - direct move
            if (lhs_it == lhs.cend() || lhs_it->first != rhs_key)
            {
                using rhs_type = std::remove_cv_t<std::remove_reference_t<decltype(rhs_val)>>;
                lhs.emplace_hint<rhs_type>(lhs_it, rhs_key, std::move(rhs_val));
                return;
            }

            // both children were the same container type - recurse into them
            if constexpr (toml::is_container<decltype(rhs_val)>)
            {
                using rhs_type = std::remove_cv_t<std::remove_reference_t<decltype(rhs_val)>>;
                if (auto lhs_child = lhs_it->second.as<rhs_type>())
                {
                    merge_left(*lhs_child, std::move(rhs_val));
                    return;
                }
            }

            // replace lhs value with rhs
            lhs.insert_or_assign(rhs_key, std::move(rhs_val));
        });
}

template <typename T, typename Path, typename... Paths>
void insert_at_path(toml::table& root, T&& value, Path&& path_component, Paths&&... path_components)
{
    auto rhs = build_from_path(static_cast<T&&>(value),
                               static_cast<Path&&>(path_component),
                               static_cast<Paths&&>(path_components)...);

    if constexpr (toml::is_array<decltype(rhs)>)
    {
        // the root is always a table, so a leading index stores the array under the empty key ""
        merge_left(root, toml::table{ { "", std::move(rhs) } });
    }
    else
    {
        merge_left(root, std::move(rhs));
    }
}

In the example I gave above, it would be used like this:

auto tbl = toml::parse(/* the example toml above */);
insert_at_path(tbl, "klaatu barada nikto", "a", "b", "c", "d", 2); // inserts at a.b.c.d[2]

It clobbers conflicting keys/indices and backfills arrays with the integer 0, but it's a start. It should be relatively straightforward to modify it to suit your needs.

mnjdhl commented 11 months ago

Hi @marzer, has this been implemented, and is it available to use?

marzer commented 11 months ago

No, and it likely never will be. I explain why in the discussion above. I recommend reading it for additional context - there are just too many different ways to implement the various edge cases, so it's something I'd rather leave to users to handle in the specific way they need.

There's some helper code above that you could use as a start on implementing something that works for you, but it has some caveats.