lichray / formatxx

Printf for C++
http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3716.html

Extensibility? #2

Open jan-hudec opened 11 years ago

jan-hudec commented 11 years ago

It would be nice if there was an extension mechanism for passing custom parameters for custom object formatting.

Sometimes custom objects have formatting options that are either difficult to express in the regular stream settings or would be extremely cryptic if expressed that way. For example, a date object can have several formats, but there are no format flags for them, and if one resorted to abusing, say, precision as a selector, it would be totally cryptic.

The streams themselves have iword and pword methods (in std::ios_base) for this purpose. I don't see any good way for the user to select the index, but at least putf could set one specific index (say std::putf_word_index()) to something from the format string. I have two ideas:

That would allow nice syntax, like formatting a date as %[Y-m-d]s given proper support in its operator<<. It would also be useful to code that wants to extend putf by wrapping objects and providing extra formatting for them in the wrappers. I recently did that with boost::format to get escape sequences when logging strings with unprintable characters, but the format syntax is a bit ugly because boost::format does not have any extension mechanism either, so I needed to abuse some existing formatting flags.
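A minimal sketch of the iword/pword idea, assuming a single reserved slot obtained from std::ios_base::xalloc(). The names putf_word_index, date, format_date and sample_date are hypothetical and only illustrate how an operator<< could read the bracketed text and hand it to strftime; format_date stands in for what putf itself would do.

```cpp
#include <ctime>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical: one stream-storage slot reserved for the extended format text.
int putf_word_index() {
    static const int index = std::ios_base::xalloc();
    return index;
}

// Hypothetical date wrapper; not part of the proposal.
struct date {
    std::tm value;
};

std::ostream& operator<<(std::ostream& os, date const& d) {
    // The formatting function is assumed to have stored the bracketed text here.
    auto spec = static_cast<char const*>(os.pword(putf_word_index()));
    char buf[64];
    std::size_t n = std::strftime(buf, sizeof buf, spec ? spec : "%Y-%m-%d", &d.value);
    return os.write(buf, static_cast<std::streamsize>(n));
}

// Stand-in for what putf might do on meeting %[...]: set the slot, print, reset.
std::string format_date(date const& d, char const* spec) {
    std::ostringstream os;
    os.pword(putf_word_index()) = const_cast<char*>(spec);
    os << d;
    os.pword(putf_word_index()) = nullptr;
    return os.str();
}

// Fixed date (1 March 2013) so the example does not depend on the clock.
date sample_date() {
    std::tm t = {};
    t.tm_year = 113;  // years since 1900
    t.tm_mon = 2;     // months since January
    t.tm_mday = 1;
    return date{t};
}
```

With this sketch, format_date(sample_date(), "%Y-%m-%d") yields "2013-03-01". Every operator<< that wants the extra text has to know the shared index, which is the main cost of this route.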

lichray commented 11 years ago

So let the operator<< of the user-defined type parse the format string internally? Such an interface is kinda hard to use...

jan-hudec commented 11 years ago

Only the non-standard part of the format needs to be parsed by the operator<< internally. All the standard parts would be processed normally. So:

lichray commented 11 years ago

On Fri, Mar 1, 2013 at 11:22 AM, jan-hudec notifications@github.com wrote:

> the few classes that have a need for something more complex would use the most complex mechanism and parse the extended part of the format.

I understand this need. And I want to pass a map to std::putf, and say

putf("%[key_name]", kv_pairs);

If "key_name" can be seen by operator<<, any needs can be met.

> And often they wouldn't really need to parse it too much; for example a date class could simply pass it to strftime (so the format would actually look like %[%Y-%m-%d]s).

Exactly. Just without the 's' postfix. No longer meaningful.

However, i/pword() does not seem to be the right approach. To open these interfaces to a customized operator<<, the ostream needs to distinguish it from the others, and the ostream way is to use one more xalloc(), like putf_word_index(), as you said. But first, the interface is complex; second, this mechanism is not designed for our needs -- we don't need to remember the old values.

I think we can simply distinguish those types with special needs just using their types, by creating wrappers, like

putf("%[%Y-%m-%d]", self_format(tm));

A concept check is attached.

Zhihao Yuan, ID lichray The best way to predict the future is to invent it.


4BSD -- http://4bsd.biz/

lichray commented 11 years ago

Grrrrr. Paste the code here:

#include <cstddef>
#include <iostream>
#include <map>
#include <string>

namespace stdex {

struct _self_formatter_base {
    friend void _set_current_fmt_string(_self_formatter_base&,
        char *, size_t);

    char *get() const {
        return ptr_;
    }

    size_t size() const {
        return len_;
    }

private:
    // TODO: replace these with std::string_ref
    char *ptr_ = nullptr;
    size_t len_ = 0;
};

void _set_current_fmt_string(_self_formatter_base& v, char *p, size_t l) {
    v.ptr_ = p;
    v.len_ = l;
}

template <typename T>
struct self_formatter : _self_formatter_base {
    self_formatter(T const& v) : ref_(v) {}

    T const& ref() const {
        return ref_;
    }

private:
    T const& ref_;
};

template <typename T>
auto self_format(T const& v) -> self_formatter<T> {
    return v;
}

}

typedef std::map<std::string, int> my_type;

// write your parser here
std::ostream& operator<<(std::ostream& os, stdex::self_formatter<my_type> nv) {
    // std::string_ref would avoid this copy
    std::string s(nv.get(), nv.size());
    os << nv.ref().at(s);
    return os;
}

int main() {
    my_type v = { {"height", 25}, {"width", 80} };
    // pass this to std::putf,
    auto h = stdex::self_format(v);
    auto w = stdex::self_format(v);
    char format_like[] = "terminal is %[height]x%[width]\n";
    // so that std::putf can
    std::cout.write(format_like, 12);
    stdex::_set_current_fmt_string(h, format_like + 14, 6);
    std::cout << h;
    std::cout.write(format_like + 21, 1);
    stdex::_set_current_fmt_string(w, format_like + 24, 5);
    std::cout << w;
    std::cout.write(format_like + 30, 1);
}

jan-hudec commented 11 years ago

That makes sense. I would find it even nicer if the cast was made implicit: say, if a function advanced_format(std::ostream &, T, std::string_ref), found by ADL, is defined for T, then use it instead of operator<<(std::ostream &, T). Or possibly by specializing some trait, though the function seems easier to me.
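A rough sketch of that ADL dispatch, assuming C++17 and using std::string_view in place of the proposed string_ref. The names sketch::print_arg, has_advanced_format, app::point and render are hypothetical; print_arg stands in for whatever putf would call for each argument.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

namespace sketch {

// Detects whether advanced_format(os, value, spec) is callable via ADL.
template <typename T, typename = void>
struct has_advanced_format : std::false_type {};

template <typename T>
struct has_advanced_format<T, std::void_t<decltype(advanced_format(
    std::declval<std::ostream&>(), std::declval<T const&>(),
    std::declval<std::string_view>()))>> : std::true_type {};

// What a putf-like function might call for each argument.
template <typename T>
void print_arg(std::ostream& os, T const& value, std::string_view spec) {
    if constexpr (has_advanced_format<T>::value)
        advanced_format(os, value, spec);  // found by ADL in T's namespace
    else
        os << value;                       // no hook: spec is ignored
}

} // namespace sketch

namespace app {

struct point { int x, y; };

// User-provided hook; it parses only the extended part of the format.
void advanced_format(std::ostream& os, point const& p, std::string_view spec) {
    if (spec == "x") os << p.x;
    else if (spec == "y") os << p.y;
    else os << '(' << p.x << ", " << p.y << ')';
}

} // namespace app

// Helper for trying the dispatch without a real putf.
template <typename T>
std::string render(T const& value, std::string_view spec) {
    std::ostringstream os;
    sketch::print_arg(os, value, spec);
    return os.str();
}
```

Here render(app::point{3, 4}, "y") goes through advanced_format, while render(42, "y") falls back to operator<< because no advanced_format is found for int.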

Meanwhile I was looking at some other Boost libraries and noticed another use case in Boost.Locale, and I think it makes sense to try to cover that too. They define custom stream manipulators, which are then interpreted by their custom implementation of the num_put facet. They also define a format similar to this proposal, but using C#-style formatting sequences (like {1}, {2,date} etc.) that can call these custom manipulators. The advantage of this approach is that it is usable with both operator<< and putf. The disadvantage is that defining such custom manipulators is somewhat complicated.

I can imagine it being possible to register such custom manipulators for use with putf. There would be a function, say register_format, with a static variant for global registration and a per-stream variant, for registering such a custom formatter. To fit the printf-style format, I think the registration would be done for non-standard format characters. If a different format style were used, other options would be possible. The function would need two overloads:

void register_format(wchar_t fmt_char, ostream &(*manip)(ostream &));
template<typename M>
void register_format(wchar_t fmt_char, M (*manip)(string_ref));

The first defines a simple manipulator; the second defines something called as stream << manip(fmtspec). The type needs some polishing. Internally it would probably generate and register a function<ostream&(ostream &, string_ref)>.

So for example Boost.Locale boost::locale::as::time and boost::locale::as::ftime could be respectively registered as:

register_format('t', &boost::locale::as::time);

and

register_format('T', &boost::locale::as::ftime);

and used respectively as

time_t now = time(nullptr);
cout << putf("default=%t, iso=%[%Y-%m-%dT%H:%M:%S]T", now, now);

which would call

cout << boost::locale::as::time << now << boost::locale::as::ftime("%Y-%m-%dT%H:%M:%S") << now;

internally.
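The registration scheme described above could be sketched roughly as follows, assuming C++17, a global registry only, char instead of wchar_t as the key, and std::string_view in place of string_ref. Everything in namespace sketch and namespace demo is hypothetical. The demo manipulators are stand-ins, not the Boost.Locale ones, whose actual signatures may need the "polishing" mentioned above.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <string_view>

namespace sketch {

using formatter = std::function<std::ostream&(std::ostream&, std::string_view)>;

// Global registry keyed by the conversion character.
std::map<char, formatter>& registry() {
    static std::map<char, formatter> r;
    return r;
}

// Overload 1: a plain manipulator; any bracketed text is ignored.
void register_format(char c, std::ostream& (*manip)(std::ostream&)) {
    registry()[c] = [manip](std::ostream& os, std::string_view) -> std::ostream& {
        return os << manip;
    };
}

// Overload 2: a manipulator factory called with the bracketed text.
template <typename M>
void register_format(char c, M (*make)(std::string_view)) {
    registry()[c] = [make](std::ostream& os, std::string_view spec) -> std::ostream& {
        return os << make(spec);
    };
}

// Applies the formatter registered for c, then streams the value.
// Throws std::out_of_range if nothing is registered for c.
template <typename T>
std::ostream& apply(std::ostream& os, char c, std::string_view spec, T const& value) {
    return registry().at(c)(os, spec) << value;
}

} // namespace sketch

namespace demo {

// Simple manipulator: switch integer output to hexadecimal.
std::ostream& as_hex(std::ostream& os) { return os << std::hex; }

// Parameterised manipulator: set the field width from the bracketed text.
struct set_width { int n; };
std::ostream& operator<<(std::ostream& os, set_width w) {
    os.width(w.n);
    return os;
}
set_width width_from(std::string_view spec) {
    return set_width{std::stoi(std::string(spec))};
}

} // namespace demo
```

After sketch::register_format('x', &demo::as_hex) and sketch::register_format('w', &demo::width_from), a putf-like function could map %x to apply(os, 'x', "", value) and %[5]w to apply(os, 'w', "5", value). The std::function in the registry is where the type of M is erased.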

Alternatively, a custom formatting function like the one in the initial paragraph could somehow be registered. The trouble is that in this case the arguments are just integers rather than a specific type, so the function can't be overloaded for them, and registering it would require some kind of type erasure in the style of boost::any, which would take significant effort and, I suspect, carry a significant performance penalty too.