joncampbell123 / dosbox-x

DOSBox-X fork of the DOSBox project
GNU General Public License v2.0
Code page / TT font / ? for MacOS environment #2714

Open cszstl opened 3 years ago

cszstl commented 3 years ago

The Wiki guide pages, FAQs &c related to code pages, fonts, etc., are a curious melange of much too much information and seriously inadequate information. The number and variety of configuration options is overwhelming, while at the same time the lists of possible values for many of those options are incomplete and/or poorly explained. My requirement is really very simple -- I need to be able to work with plain text files in both the DOS environment and the MacOS environment, and to have all of the various characters in the upper half of the 8-bit character codes appear the same in both on-screen environments, i.e., accented Latin as they appear in ISO 8859-1. However, the default code page/font/whatever for DOSBox-X displays instead all of the single-line and double-line box-drawing characters of the 25-year-old PC-AT hardware that I really want to be able to abandon. I don't need to tinker with printer configurations, either real or virtual; I'm not worried about keyboard configurations; and I don't want to spend weeks or months of experimentation trying to solve a problem that someone else must already have solved.

What code page / country code / TTF font settings should I use to make DOSBox-X display the same character glyphs that the host MacOS system does for all defined character codes?

emendelson commented 3 years ago

DOS never had a code page that exactly corresponded to the symbol set used on a Mac, and because DOS only supports single-byte characters, it's physically impossible for DOSBox-X to display all the characters that can appear in a Mac text file. Presumably you could modify the program code to support codepage 1252, and that would come close to what I think you want, but the results wouldn't be perfect.

The vDos emulator supports codepage 1252 (except for five missing characters), and you could run vDos under Wine on a Mac, but it's much slower and more awkward than a native Mac application like DOSBox-X.

Or you could request an enhancement to support codepage 1252, but you might get better results if you ask in a friendly tone of voice.

EDIT: Or the fastest way to get this would be to create a 1252.TXT file to match the existing files for other code pages. Download the code, and you'll see how it's done. If you do this yourself, you'll have what you want a lot faster than you would if you waited for other people to do the work for you.
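
For anyone who wants to try this, here is a rough Python sketch that produces a 1252-style mapping table. It assumes the three-column layout used by the Unicode Consortium mapping tables (byte value, Unicode code point, character name as a comment); whether the files inside the DOSBox-X source tree use exactly that layout is an assumption you should check against the existing files. The function name `mapping_lines` is made up for this example.

```python
import unicodedata


def mapping_lines(codec="cp1252"):
    """Yield one 'byte<TAB>code point<TAB>#name' line for each byte 0x00-0xFF.

    The layout is an assumption modelled on the Unicode Consortium
    mapping tables; it is not taken from the DOSBox-X source.
    """
    for byte in range(256):
        try:
            char = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            # This byte has no character assigned in the code page.
            yield f"0x{byte:02X}\t      \t#UNDEFINED"
            continue
        # Control characters have no Unicode name, so fall back to a label.
        name = unicodedata.name(char, "<control>")
        yield f"0x{byte:02X}\t0x{ord(char):04X}\t#{name}"


if __name__ == "__main__":
    # Print the table; redirect the output to a file such as 1252.TXT.
    print("\n".join(mapping_lines("cp1252")))
```

Python's `cp1252` codec leaves five byte values unassigned, so those rows come out as `#UNDEFINED`.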

Wengier commented 3 years ago

@emendelson Thanks for the information about the said code page. I think such code pages (1250-1258) can be added for the TTF output (done), as adding them only involves a relatively trivial amount of work anyway (and won't negatively affect the output). These code pages will never be recommended, but can be switched to for users who really want to use them.

emendelson commented 3 years ago

@Wengier - That is excellent; I had no idea it would be so easy. Is it equally easy to add MacRoman and ISO-8859-1 using the attached files (from the same Unicode ftp site as the Windows code pages), or does the current code only work with DOS/Windows code pages?

ROMAN.TXT

8859-1.TXT

Also, a suggestion: Can the program look for similar files in a codepage folder (by default inside the folder with the application) and load any that it finds? That way, users could work with custom codepages without adding anything to the codebase.

Wengier commented 3 years ago

@emendelson The code pages 125x are standardized and have fixed code page numbers. Do these code pages also have fixed code page numbers? If so, then theoretically it should be easy to add them too.

Note that code pages are directly embedded into the source code and then compiled. While it is possible to have custom codepages, they need to have a standard format so that it is easy for the program to parse the code page file. I hope these code pages all have the same code page format for parsing, but they should also have code page numbers.
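
To illustrate what "a standard format that is easy to parse" could mean, here is a minimal Python sketch of a parser for the Unicode Consortium style of mapping file (hex byte, hex code point, then a `#` comment). This is only a guess at a workable format, not the parser DOSBox-X actually uses, and `parse_mapping` is a hypothetical name.

```python
def parse_mapping(text):
    """Return {byte: code_point} parsed from mapping-table text.

    Assumed format (not confirmed against DOSBox-X): each data line holds
    a hex byte value and a hex Unicode code point separated by whitespace,
    and everything after '#' is a comment.
    """
    table = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # blank or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue  # byte listed without a code point: undefined
        byte = int(fields[0], 16)
        code_point = int(fields[1], 16)
        if not 0 <= byte <= 0xFF:
            raise ValueError(f"not a single-byte value: {fields[0]}")
        table[byte] = code_point
    return table


if __name__ == "__main__":
    sample = "# demo\n0x41\t0x0041\t#LATIN CAPITAL LETTER A\n0x81\t      \t#UNDEFINED\n"
    print(parse_mapping(sample))
```

Rejecting values above 0xFF keeps the table strictly single-byte (SBCS), which is the case discussed in this thread.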

emendelson commented 3 years ago

@Wengier - No, those codepages don't have standard numbers, so please don't spend time working with them. Sometime this coming week, I'll experiment with compiling them and will see whether they work in their current form. If not, then perhaps the original poster might consider doing the work of modifying them so that they work correctly.

cszstl commented 3 years ago

First, I apologize for any whininess that might appear in my OP.  I should not have allowed my frustration to overflow like that. Second, I don't need all possible Mac characters -- only some of those between decimal 160 and decimal 255.  All of those are single-byte, so DOS should theoretically be able to handle them.  If a few don't exactly match, I could live with that, as long as I could see the majority of the accented Latin glyphs in that range.  Codepage 1252 appears to be Windows-specific, and I do not use Windows; certainly it is vastly different from Mac Roman, and therefore not applicable in my context.

Third, while many aspects of the Wiki pages, the Guides and the configuration file comments are quite comprehensive, I cannot find a list of supported code pages, nor a list of supported country codes, nor a suggested source for potentially useful TTF files.  If those things existed, they could provide helpful boundaries for a zone of exploration.

Finally, I have verified that the default code page for DOSBox-X exactly matches the display on my ancient PC for all printable characters, and that I can use a PC keyboard with the Alt-# data entry method to generate special characters exactly as I used to do before.  So if I can't find an almost-Mac-like supported code page (maybe 1275?), then I can limp along with that ancient kludge.  After all, it's worked for 25 years, even if it is painfully tedious.
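
A quick way to see how far apart these code pages are in the decimal 160 to 255 range is to compare them using the codecs that ship with Python. This is only a sketch: the helper names `upper_half` and `matching_bytes` are invented here, and Python's codec tables stand in for whatever DOSBox-X and MacOS actually render.

```python
def upper_half(codec):
    """Map each byte 160-255 to its character in the codec (None if undefined)."""
    chars = {}
    for byte in range(160, 256):
        try:
            chars[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            chars[byte] = None
    return chars


def matching_bytes(codec_a, codec_b):
    """Return the bytes in 160-255 that decode to the same character in both codecs."""
    a, b = upper_half(codec_a), upper_half(codec_b)
    return [byte for byte in a if a[byte] is not None and a[byte] == b[byte]]


if __name__ == "__main__":
    # Compare ISO 8859-1 against the other code pages mentioned in the thread.
    for other in ("cp1252", "mac_roman", "cp437"):
        same = matching_bytes("latin_1", other)
        print(f"latin_1 vs {other}: {len(same)} of 96 bytes match")
```

Byte 0xE9 is a handy spot check: it is é in ISO 8859-1, È in Mac Roman, and Θ in code page 437.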

Wengier commented 3 years ago

@emendelson and @cszstl I decided to solve this issue by implementing a config option "customcodepage" in [dos] section to specify a custom SBCS code page (number & file), such as ROMAN.TXT as mentioned above, e.g.

customcodepage = 1000,ROMAN.TXT

This sets ROMAN.TXT as custom code page 1000. Use the command CHCP 1000 to switch to it (or do so via the country option). Please check out the new code. Hope this helps.

emendelson commented 3 years ago

@Wengier - I was writing a post listing the steps for integrating the two files that I posted, when I saw your message appear. That's obviously the best way to go. It will also be useful for some very old European applications that used font overlay files for displaying text.

One question: what is the default location of the mapping .TXT file? The same folder as DOSBox-X?

Thank you for implementing this!

Wengier commented 3 years ago

@cszstl For a list of supported code pages, please type CHCP /?. For a list of supported country codes, look at the "Regional settings in DOSBox-X" Wiki page below:

https://dosbox-x.com/wiki/Guide%3ARegional-settings-in-DOSBox%E2%80%90X

For potentially useful TTF files, the TTF Wiki page already lists some of them. I personally suggest Sarasa Gothic TTF fonts, which is a set of free TTF fonts that look very nice, covering both SBCS and DBCS characters.

emendelson commented 3 years ago

I forgot to mention that the new 1250-1258 include files weren't in the VS2015 project when I downloaded a few hours ago. Maybe they're there now?

Wengier commented 3 years ago

@emendelson Yes, the default location is the DOSBox-X working directory or program directory. You can also specify an absolute path to the code page file.

Moreover, the VS2015 project has been updated too (even though not absolutely necessary).

emendelson commented 3 years ago

@Wengier - Yes, this seems to work correctly, as in the screen shot below, using Consolas as the font (I haven't experimented with other fonts as you suggested).

Question: could this be extended to support more than one custom code page, as in:

customcodepage = 8859,8859-1.TXT
customcodepage1 = 9000,ROMAN.TXT
etc.

I suppose that the first customcodepage entry would be customcodepage0 internally, and I hope that might make this relatively easy to implement.

EDIT: Replaced screen shot with a clearer example, using MacRoman:

[Screenshot: DOSBox-X displaying the MacRoman custom code page]

Wengier commented 3 years ago

@emendelson For additional custom code page(s), I think it is better to use CHCP command to load them via an optional additional parameter instead of adding more very similar config options. Try new code and type a command like CHCP 9000 ROMAN.TXT. Hope this helps.

emendelson commented 3 years ago

@Wengier I will have to wait for a few hours or days before I can build the code, but this definitely sounds like a better idea. I am looking forward to trying it out. Thank you again!

cszstl commented 3 years ago

Thanks for the reminder about the list of supported country codes.  While I had seen it before, its usefulness hadn't registered at the time.  Code 44 (UK) supplies my preferred formats for date and time, though I am located in the USA. It would be helpful if the first sentence of your message could be incorporated into the CHCP description on the Guide page "DOSBox‐X’s-Supported-Commands".  It would also be helpful to include there a Note containing the message that can be produced when changing the code page is attempted:  "Changing code page is only supported for the TrueType font output."  Incidentally, that Guide page is currently titled "Untitled".

Never having worked with fonts before, I would appreciate a suggestion as to where to obtain them in a ready-to-use form.

emendelson commented 3 years ago

For the Sarasa Gothic fonts, search "Sarasa Gothic" on Google; then go to the "Releases" page on the first link.

emendelson commented 3 years ago

@Wengier - I was able to build current code on a Mac and this works exactly as described. Thank you. I needed to enter an absolute path for the codepage .TXT file (my AppleScript app launches the binary directly; it doesn't run the enclosing DOSBox.app bundle). It's possible that a relative path would work, but I couldn't make it work, and it's not clear to me what the relative path should be relative to.

And this suggests other possible enhancements - and I know that I am asking you to do even more work, but I hope that the second point below may be useful in many contexts:

1. Could the CHCP command find a path relative to the DOSBox binary (or, in Windows, the executable), something like:

chcp 1000 ../../../ROMAN.TXT

2. Or (and this is something that may be useful in other contexts) could DOSBox-X support an environment variable named something like EXEFLDR, which would return the path of the enclosing folder of the executable? That way, you could use a command like this:

chcp 1000 %EXEFLDR%/../../../ROMAN.TXT

You will probably find a better solution than either of these, but these are the first ones that come to mind.

Wengier commented 3 years ago

@emendelson I think CHCP command will indeed try to read the path relative to the DOSBox-X binary, and the path relative to the DOSBox-X working directory. Have you tried it yourself yet? If it does not work then I guess something is wrong with it.
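
The lookup described here can be pictured with the Python sketch below. The order in which the two directories are tried is an assumption, and the function names are hypothetical; the real logic lives in the DOSBox-X source and may differ.

```python
from pathlib import Path


def candidate_paths(name, working_dir, program_dir):
    """List the places a code page file might be looked up.

    Assumed order (not confirmed against the DOSBox-X source): an absolute
    path is used as given; otherwise the working directory is tried first,
    then the directory containing the binary.
    """
    path = Path(name)
    if path.is_absolute():
        return [path]
    return [Path(working_dir) / path, Path(program_dir) / path]


def find_codepage_file(name, working_dir, program_dir):
    """Return the first candidate that exists as a file, or None."""
    for candidate in candidate_paths(name, working_dir, program_dir):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical directories, only to show which paths would be tried.
    for p in candidate_paths("../../../ROMAN.TXT", "/Users/me", "/Apps/DOSBox-X.app/Contents/MacOS"):
        print(p)
```

Printing the candidates is a simple way to see what a relative path such as ../../../ROMAN.TXT is relative to in each case.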

emendelson commented 3 years ago

@Wengier - Yes, the path relative to the binary works perfectly. Thank you again!

Wengier commented 3 years ago

@cszstl I added a "note" to CHCP command in Supported Command in the draft Wiki, which will appear in the main Wiki later.

cszstl commented 3 years ago

That process yields a 381MB file named "sarasa-gothic-ttf-0.32.14.7z"; but Mac OS X offers no clue about what to do with it, and the Build process outlined on the parent page is not something that I want to get into.

I did eventually find that OS X stores system fonts in two locations (Library/Fonts and System/Library/Fonts); both contain a number of .ttf files, but they are not discoverable with Finder, which is why I didn't see them earlier.  (User-specific fonts can be stored in yet another place, ~/Library/Fonts, but I don't have anything there.)  There is a Mac-supplied application named Font Book which enables exploring and managing fonts, but it lumps all the fonts from both directories together in a single list, and does not provide any convenient way to determine which fonts are in which directory.  This is highly significant, because DOSBox-X only sees what is in Library/Fonts; it does NOT see what is in System/Library/Fonts.  While the Configuration tool will accept any font name, if that font is in System/Library/Fonts then when DOSBox-X is restarted it will display a Warning message:  "Could not load font file: .ttf" and then will operate with the last-specified valid font file.

Actually, I am abandoning the search for a Mac-equivalent code page, because I have realized that I need the box-drawing characters of CP 437 for the DOS-based file management utility that I use (Stereo Shell).  I will still continue to explore TTF options, simply to make the ASCII characters more readable.  And with a PC keyboard plugged into the Mac, I can continue to use the ancient Alt-# method of entering special characters when necessary. Thanks for everyone's help!

Wengier commented 3 years ago

In the updated code DOSBox-X will try to find fonts from System/Library/Fonts and ~/Library/Fonts too. Hope this helps.
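
Since Font Book does not show which directory a font lives in, a small script can answer that question. The Python sketch below checks the three MacOS font folders named in this thread; the list order is an arbitrary choice for illustration and is not the search order DOSBox-X uses, and `locate_font` is a made-up helper name.

```python
from pathlib import Path

# The three MacOS font folders mentioned in this thread.
FONT_DIRS = [
    Path("/Library/Fonts"),
    Path("/System/Library/Fonts"),
    Path("~/Library/Fonts").expanduser(),
]


def locate_font(file_name, font_dirs=FONT_DIRS):
    """Return every directory in font_dirs that contains file_name."""
    return [d for d in font_dirs if (d / file_name).is_file()]


if __name__ == "__main__":
    # Replace with the .ttf file name you want to check.
    name = "Example.ttf"
    found = locate_font(name)
    if found:
        for directory in found:
            print(f"{name} found in {directory}")
    else:
        print(f"{name} not found in any of the listed folders")
```

An empty result for a font that Font Book lists would suggest it is stored somewhere other than these three folders.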