c42f / tinyformat

Minimal, type safe printf replacement library for C++

Support for wstring in the future? #41

Open fstrugar opened 7 years ago

fstrugar commented 7 years ago

(would be nice) :)

c42f commented 7 years ago

Can you expand on your use case? Currently I assume that users are dealing with "arrays of char" and that people should just use a variable width encoding (ideally UTF-8) if they want support for additional characters. Assuming UTF-8 seems good enough for the kind of portable command line applications I tend to write, where the conversion to UTF-16 can be handled at the windows API boundary if you happen to be using windows.

So I personally don't have much use for this, and I probably won't have time to implement it myself. However, I'm open to pull requests, with one major requirement: they shouldn't increase the size of the codebase by a large amount - it's already too large ;-)

So what do you want to be able to write, and what API should it follow? Currently tinyformat is basically "[sf]printf for C++". Are you after an equivalent of the wprintf/fwprintf/swprintf family of C functions?

fstrugar commented 7 years ago

Hi Chris,

> Are you after an equivalent of the wprintf/fwprintf/swprintf family of C functions?

In short, yes, exactly! Basically I've got string/wstring format support in my codebase, based on _vsnprintf / _vsnwprintf. wstring is used a lot for UI and is a lot easier to handle/manipulate than UTF-8. Thanks for the feedback - I guess at some point I'll have to implement it myself, in which case I'll do a pull request (if it doesn't explode the code size :) ).

All the best, Filip

c42f commented 7 years ago

Thanks for the feedback, I think equivalents of wprintf and friends are probably in scope for this library, though a lot of things which are currently plain old functions would have to become function templates.

pip010 commented 5 years ago

Ha... I was going to open the same issue and even a PR! Good thing I decided to look around.

@c42f The problem with just adopting UTF-8 encoding for narrow chars (char*) is that it won't work on Windows. There is no UTF-8 support, for good historical reasons, so long story short, Unicode only works as UTF-16 and any 8-bit char is interpreted according to the Windows code page: https://en.wikipedia.org/wiki/Windows_code_page

There are quite nice articles I had bookmarked if interested?

Anyway, we just need a wide-character overload for each function accepting char* and/or std::ostream. An alternative (though it will involve more code) is templates, which is a clean solution but comes at the expense of compilation time.

Shall I give it a try? Which design do you prefer: overloads or template specialization?

c42f commented 4 years ago

> There is no UTF8 support

Well, there's MultiByteToWideChar/WideCharToMultiByte, which allow the UTF-16 <--> UTF-8 conversion fairly easily, especially if wrapped in a convenience function. But yes, using UTF-8 directly with the Windows APIs isn't possible.
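For illustration, such a convenience function might look roughly like the sketch below. The name utf8_to_wide is hypothetical and not part of tinyformat; the block is Windows-only, so it is guarded by _WIN32, and error handling is reduced to returning an empty string.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper (not part of tinyformat): convert a UTF-8 encoded
// std::string to a UTF-16 std::wstring at the Windows API boundary.
inline std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();

    // First call: ask how many UTF-16 code units the result needs.
    const int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                        static_cast<int>(utf8.size()),
                                        nullptr, 0);
    if (len <= 0) return std::wstring();  // conversion failed

    // Second call: convert into a buffer of exactly that size.
    std::wstring wide(static_cast<std::size_t>(len), L'\0');
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                        static_cast<int>(utf8.size()), &wide[0], len);
    return wide;
}
#endif  // _WIN32
```

With something like this, the formatting itself stays in UTF-8 and only the final string is converted, e.g. passing utf8_to_wide(msg).c_str() to a wide API such as MessageBoxW.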

If you want to give it a go, I suggest we'll just have to template all the things, otherwise there will be a lot of code duplication. Basically every API function dealing with const char* and ostream will need to be templated on the char type CharT and take const CharT* and basic_ostream<CharT> instead. As is done at the moment, I suggest we just disallow mixing of multibyte and wide encodings, i.e. disallow tfm::printf("%ls", L"blah") and tfm::printf(L"%hs", "blah"), as the standard streams don't support conversion of encodings.
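To make the shape of that change concrete, here is a minimal sketch of a formatter templated on CharT. It is not tinyformat's implementation: the sketch namespace and the names format_to_stream and sformat are made up for illustration, every "%x" spec is simply replaced by streaming the next argument, and "%%", flags, and widths are not handled. It also does not enforce the no-mixing rule described above.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {  // hypothetical namespace, not part of tinyformat

// Base case: no arguments left, so copy the rest of the format string.
template <typename CharT>
void format_to_stream(std::basic_ostream<CharT>& out, const CharT* fmt) {
    out << fmt;
}

// Recursive case: copy literal text up to the next '%', stream one argument
// in place of the two-character spec, then recurse on the remainder.
// Simplification: the character after '%' is ignored.
template <typename CharT, typename T, typename... Rest>
void format_to_stream(std::basic_ostream<CharT>& out, const CharT* fmt,
                      const T& value, const Rest&... rest) {
    for (; *fmt != CharT(0); ++fmt) {
        if (*fmt == CharT('%') && fmt[1] != CharT(0)) {
            out << value;
            format_to_stream(out, fmt + 2, rest...);
            return;
        }
        out.put(*fmt);
    }
    // Format string exhausted: any surplus arguments are silently dropped.
}

// Convenience wrapper returning a string of the matching character type.
template <typename CharT, typename... Args>
std::basic_string<CharT> sformat(const CharT* fmt, const Args&... args) {
    std::basic_ostringstream<CharT> oss;
    format_to_stream(oss, fmt, args...);
    return oss.str();
}

}  // namespace sketch
```

The same template then serves both character types: sketch::sformat("%s has %d items", "list", 3) yields a std::string, and sketch::sformat(L"%s has %d items", L"list", 3) yields a std::wstring, with CharT deduced from the format string.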

After that's done we'll have to assess the damage ;-)

pip010 commented 4 years ago

100% on the same page. "After that's done we'll have to assess the damage ;-)" is exactly what's stopping me: it is a lot of change, though trivial :( It depends on whether we continue using it at my company; if so, I might invest some quality time to update the lib and push upstream! Let's keep this ticket open a bit longer, OK?