Open HansOlsson opened 2 years ago
Can this be seen as follow-up for #3789?
I would say this issue represents the minimum, whereas #3789 extends this to other cases.
As I see it, this means that the MSL string handling functions are actually fine as they are, as long as the targeted Modelica language version doesn't exceed 3.5, but that one breaks these functions by switching to a newer MLS version without making sure that they operate on something more sensible than bytes.
Fortunately, the string handling functions could be updated for UTF-8 already today, as they would remain valid also under the constraint that they only operate on ASCII strings. That is, they would remain compatible with the current target MLS 3.4, as well as both 3.5 and future versions. A minor concern would be that making them UTF-8 ready would encourage invalid use as long as the targeted MLS version doesn't exceed 3.5.
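To illustrate why a UTF-8-aware function stays valid for ASCII-only input, here is a minimal C sketch. The helper name `utf8_codepoint_count` is hypothetical and not part of MSL, and it assumes the input is valid UTF-8. It counts code points by skipping continuation bytes, so for pure ASCII strings the result is the same as the byte length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of MSL): count the code points in a
   NUL-terminated string that is assumed to be valid UTF-8. */
size_t utf8_codepoint_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Continuation bytes have the bit pattern 10xxxxxx; every other
           byte starts a new code point. ASCII bytes are never
           continuation bytes, so ASCII strings count one per byte. */
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For an ASCII string such as "abc" this returns 3, the same as `strlen`. For the UTF-8 encoding of "ÅÄÖ" (six bytes) it also returns 3.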
I believe this part hasn't been done yet, but I'm happy to be proven wrong:
File handling routines should accept file names encoded using UTF-8.
Reopening.
As far as I understand, it will likely work without changes on *nix variants.
For Windows there are two options:
With the manifest approach, it would only work on updated OSes, and only if the tool compiling the executable sets that flag. MultiByteToWideChar is pretty simple to use.
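A rough sketch of the MultiByteToWideChar route might look like the following. The wrapper name `fopen_utf8` is hypothetical, and the use of `_wfopen` to open the file is my assumption, since the thread only names the conversion function. On non-Windows platforms the sketch simply passes the UTF-8 name through to `fopen`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef _WIN32
#include <wchar.h>
#include <windows.h>
#endif

/* Hypothetical helper (not part of MSL): open a file whose name is UTF-8. */
FILE *fopen_utf8(const char *name, const char *mode)
{
#ifdef _WIN32
    FILE *fp = NULL;
    wchar_t *wname = NULL;
    wchar_t *wmode = NULL;
    int nName, nMode;
    assert(name != NULL && mode != NULL);
    /* With a source length of -1 the returned size includes the NUL. */
    nName = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, name, -1, NULL, 0);
    nMode = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, mode, -1, NULL, 0);
    if (nName <= 0 || nMode <= 0) {
        return NULL; /* invalid UTF-8 or conversion failure */
    }
    wname = malloc((size_t)nName * sizeof(wchar_t));
    wmode = malloc((size_t)nMode * sizeof(wchar_t));
    if (wname != NULL && wmode != NULL
        && MultiByteToWideChar(CP_UTF8, 0, name, -1, wname, nName) > 0
        && MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, nMode) > 0) {
        fp = _wfopen(wname, wmode); /* assumption: wide-char CRT open */
    }
    free(wname);
    free(wmode);
    return fp;
#else
    assert(name != NULL && mode != NULL);
    /* Assumption: the file system accepts UTF-8 byte strings as names. */
    return fopen(name, mode);
#endif
}
```

Unlike the manifest approach, a wrapper like this does not depend on how the executable is built.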
@HansOlsson what do you propose as the next work plan?
@HansOlsson @MartinOtter the second part of the issue is still unaddressed; would you please look into it?
I do not have enough knowledge to have an opinion or contribute here. @HansOlsson, @sjoelund please advise how to continue or make a pull request.
Based on allowing Unicode strings in Modelica Language MO#3079
We should ideally also support that in MSL, as far as I can see there are two major issues:
Notes:
(unsigned char) is used for isalpha etc., so it should work.
Modelica.Utilities.Strings.scanToken("\"€2ÅÄÖ\""); does actually work already.
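The cast to `(unsigned char)` in the notes matters because passing a negative `char` value to `isalpha` is undefined behaviour in C, and the bytes of multi-byte UTF-8 sequences are negative on platforms where `char` is signed. A minimal sketch follows. The function name `count_alpha_bytes` is hypothetical and this is not the actual MSL code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not the MSL implementation): count the bytes in a
   NUL-terminated string that isalpha() classifies as letters. */
size_t count_alpha_bytes(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* The cast keeps the argument in the range 0..UCHAR_MAX, which
           is what <ctype.h> requires; without it, UTF-8 bytes >= 0x80
           may be passed as negative values. */
        if (isalpha((unsigned char)*s)) {
            ++n;
        }
    }
    return n;
}
```

In the default "C" locale only the ASCII letters are classified as alphabetic, so the bytes of "€" or "Å" are skipped safely and are not misread.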