RhoSigma-QB64 closed this 9 months ago
Perhaps instead of a separate setting we could always use CF_UNICODETEXT and convert it into UTF-8. I suspect that would match what the other platforms do already, and if the text is regular ASCII then the result is the same as right now (since ASCII is valid UTF-8).
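To illustrate the idea, here is a minimal sketch of that conversion in Python rather than in the QB64 runtime. It assumes the clipboard hands back NUL-terminated UTF-16LE bytes (which is what CF_UNICODETEXT holds on Windows); the helper name `utf16le_clipboard_to_utf8` is made up for this example and is not an existing QB64 function.

```python
def utf16le_clipboard_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: re-encode NUL-terminated UTF-16LE clipboard bytes as UTF-8."""
    text = raw.decode("utf-16-le")
    # Clipboard text is NUL-terminated; keep only what precedes the terminator.
    text = text.split("\x00", 1)[0]
    return text.encode("utf-8")


# Simulated clipboard contents (assumption: raw bytes as stored for CF_UNICODETEXT).
ascii_raw = "PRINT 42".encode("utf-16-le") + b"\x00\x00"
czech_raw = "Příliš".encode("utf-16-le") + b"\x00\x00"

ascii_utf8 = utf16le_clipboard_to_utf8(ascii_raw)
czech_utf8 = utf16le_clipboard_to_utf8(czech_raw)

# Pure ASCII text comes out byte-for-byte identical to its ASCII encoding.
print(ascii_utf8 == "PRINT 42".encode("ascii"))
# Non-ASCII characters take more than one byte each in UTF-8.
print(len("Příliš"), len(czech_utf8))
```

For pure ASCII input the UTF-8 result is the same byte string a CF_TEXT read would give, which is the "same as right now" case. Only text with non-ASCII characters changes.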
In general that could work, but would pass the buck to us. What happens if there's a Unicode character on the clipboard which is not available in the currently set IDE codepage? We would need to handle that somehow to avoid further complaints.
If we make it an explicit parameter, then we can always say: "You requested the clipboard to operate with Unicode, although QB64 isn't capable of handling that, so you're responsible for handling your data as needed in your application."
We wouldn't be doing any conversion, we would just leave the UTF-8 characters in the string as-is. That's already what happens on Linux (and very likely Mac OS) since they use UTF-8 for everything, so it's really not new behavior. In the Wiki we can just clarify that it's UTF-8 text that is given back (which is valid ASCII as long as no non-ASCII characters are present).
Sounds like a way to go, but wouldn't we break the IDE's internal copy'n'paste behavior then? I mean the language code itself is pure English and many coders will write their programs in English, even if it's not their native language.
I made some tests and see we cannot even use extended ASCII chars in variable/function/type names, but we can use them in literal strings and comments. So what would happen if e.g. Petr wants to copy'n'paste some of his Czech-worded code?
So for me it looks like we have to do conversion in any case. It's not a problem in the above scenario, as the copied chars are obviously available in the current codepage, but simply leaving them as UTF-8 would screw up Petr's text as soon as he pastes it, even if it's in the same program.
That's why I tend to leave the clipboard operations as-is and rather use an optional flag/parameter for unicode, if somebody really needs it and is prepared to deal with the unicode stuff himself.
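The garbling concern above can be sketched in Python. This is only an illustration, not QB64 code: it assumes the IDE is set to codepage 852 (a DOS codepage that covers Czech), and the sample word is made up. It shows what a single-byte codepage consumer would see if UTF-8 bytes were pasted unconverted.

```python
# Assumption: the IDE displays text using a single-byte codepage such as CP852.
CODEPAGE = "cp852"

text = "Příliš"  # sample Czech text, as it might appear in a comment or string literal

# What the IDE stores today: one byte per character in the active codepage.
codepage_bytes = text.encode(CODEPAGE)

# What an unconverted UTF-8 clipboard would deliver instead.
utf8_bytes = text.encode("utf-8")

# A codepage-based consumer reads every byte as one character, so each
# multi-byte UTF-8 sequence turns into two unrelated glyphs.
garbled = utf8_bytes.decode(CODEPAGE, errors="replace")

print(len(text), len(codepage_bytes))  # same length: one byte per character
print(len(utf8_bytes), len(garbled))   # longer: accented letters became two glyphs each
print(garbled == text)                 # False: the pasted text no longer matches
```

So even a copy and paste inside the same program would need a conversion step back into the active codepage for the text to survive.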
Yeah it's a good point, that's quite annoying :-/ I'll have to think about it for a bit.
If we do add separate ANSI/Unicode options for _CLIPBOARD$() then we'll have to consider how it all works cross-platform. Likely, the default will be different depending on the platform (Windows defaults to ANSI, Linux and Mac OS default to Unicode). For ANSI support for Linux and Mac OS we'll also need to consider if we're going to implement actual conversion of the Unicode characters (if a corresponding character exists), or maybe just strip them out.
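The two ANSI options mentioned here (convert where a matching character exists, otherwise strip) could look roughly like this. It is a sketch in Python, not the QB64 runtime; the function name `utf8_to_ansi`, its parameters, and the choice of CP437 as default codepage are assumptions made for the example.

```python
def utf8_to_ansi(utf8_bytes: bytes, codepage: str = "cp437", strip: bool = True) -> bytes:
    """Hypothetical helper: map UTF-8 clipboard text into a single-byte codepage.

    Characters that exist in the codepage are converted. Characters that do
    not exist are dropped (strip=True) or replaced with '?' (strip=False).
    """
    text = utf8_bytes.decode("utf-8")
    return text.encode(codepage, errors="ignore" if strip else "replace")


# 'ü' and 'ß' exist in CP437, the euro sign does not.
clip = "Grüße €5".encode("utf-8")

stripped = utf8_to_ansi(clip, strip=True)
replaced = utf8_to_ansi(clip, strip=False)

print(stripped.decode("cp437"))
print(replaced.decode("cp437"))
```

Stripping silently loses data, while replacing with a placeholder at least leaves a visible marker. Which of the two is the better default is one of the open questions in this thread.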
Beyond that I think it's worth having the _CLIPBOARD$(UNICODE) option return UTF-8 on all platforms to make it easier to work with. Linux and Mac OS already return that, on Windows it's a simple conversion we could do when copying the data into the string (Windows already has functions to do it).
No objections, my only concern was about generally changing the behavior to Unicode/UTF-8. We should definitely keep the current (ANSI) function for the IDE and other existing programs and make Unicode optional. How the new Unicode stuff is finally handled can still be determined when actually working on the implementation.
Obviously too many uncertainties here, hence closed as "Not planned".
Add an optional parameter to _CLIPBOARD$ (sub and function) to determine the text format to use for clipboard operations (CF_TEXT/CF_UNICODETEXT), e.g.
Related discussion: https://qb64phoenix.com/forum/showthread.php?tid=1572