lewdlime / abcm2ps

abcm2ps is a command line program which converts ABC to music sheet in PostScript or SVG format. It is an extension of abc2ps which may handle many voices per staff. abcm2ps is Copyright © 2014-2016 Jean-Francois Moine.
http://moinejf.free.fr/
GNU General Public License v3.0

Cyrillic characters are not displayed #117

Closed grzegorzgrzegorz closed 6 months ago

grzegorzgrzegorz commented 6 months ago

I cannot get Cyrillic characters to appear in the output PS file. My sample file:

X:1
%%titlefont Helvetica 20
T: TИXOE OЗEPO
M: 6/8
H: rewritten tune
%%score (RH1 RH2) | LH1
V: RH1 clef=treble stem=down
V: RH2 clef=treble stem=up
V: LH1 clef=treble dyn=up
K:Bb
%%measurenb 0
%%measurebox
%%linebreak

The second letter of each word in the title is a Unicode character (\u0418 and \u0417 respectively) taken from the Windows character map. The rest is just typed on the keyboard. The result is:

image

I think I do not understand how fonts are used in abcm2ps. Are they sourced from the Windows font directory or from some other place? Anyway, by no means am I able to see the full title in the PostScript file. Could you explain whether this is something on my side or on yours?

moinejf commented 6 months ago

There are 2 problems.

First, PostScript is an old language that was defined before Unicode and UTF-8. Handling Unicode characters is not easy because PostScript uses either a 256-character table or an associative table based on the names of the glyphs. The names of the glyphs are in the font files, but they may vary from font to font. As an example, the character \u0417 may have 3 names: afii10025, Zecyrillic or Ze. On my system, a common font includes only the first two names.

Second, abcm2ps does not include all the Unicode characters. It knows only the first 224 characters. For other characters, abcm2ps uses a dynamic table. This table is loaded by the command %%glyph. So, to render the characters \u0417 and \u0418, you must set:

%%glyph 417 Zecyrillic
%%glyph 418 Iicyrillic

Finally, there was a bug, a very old one: the characters between \u0400 and \u07ff were not entered into the table. This is fixed by the commit b09f9c2.
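The %%glyph lines for a given title can be derived mechanically. Here is a minimal Python sketch, not part of abcm2ps. It assumes that "the first 224 characters" means code points below U+0100, and it uses only the two glyph names quoted in this thread. The function name glyph_directives and the table GLYPH_NAMES are made up for illustration.

```python
# Glyph names quoted in this thread; extend the table for other characters.
GLYPH_NAMES = {
    0x0417: "Zecyrillic",
    0x0418: "Iicyrillic",
}


def glyph_directives(text, names=GLYPH_NAMES):
    """Return one %%glyph line per distinct character outside the built-in set."""
    lines = []
    seen = set()
    for ch in text:
        cp = ord(ch)
        # Assumption: abcm2ps already knows every code point below U+0100.
        if cp < 0x100 or cp in seen:
            continue
        seen.add(cp)
        name = names.get(cp)
        if name is None:
            raise KeyError(f"no glyph name known for U+{cp:04X}")
        # The code point is written in hexadecimal without leading zeros,
        # as in the directives shown in this thread.
        lines.append(f"%%glyph {cp:x} {name}")
    return lines


if __name__ == "__main__":
    # The title from the sample file, with the two Cyrillic letters escaped.
    print("\n".join(glyph_directives("T\u0418XOE O\u0417EPO")))
```

The chosen names still have to exist in the font that the PostScript interpreter ends up using, so a name that works with one font may need to be swapped for another.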

grzegorzgrzegorz commented 6 months ago

Hello, thank you for the fast response and for fixing the code. I struggled to build abcm2ps on Windows and failed. Cosmopolitan is too steep for me; with Cygwin the build ran without errors, but the resulting exe does not work. I come from the Java world and have never done any C/C++, so it is a bit painful for me. Anyway, I was able to build on Linux and I can run it from there. However, when I provide this ABC file:

X:1
%%titlefont Times-Roman 24
T: TИXOE OЗEPO T\u0417XOE O\u0418EPO
M: 6/8
H: rewritten tune
%%score {(RH1 RH2) | (LH1 LH2)}
V: RH1 clef=treble stem=auto
V: RH2 clef=treble stem=down voicecolor blue
V: LH1 clef=treble dyn=up 
V: LH2 clef=treble stem=up voicecolor red
K:Bb
%%measurenb 0
%%measurebox
%%linebreak
%%voicecombine 0
%%slurheight 1
%%annotationfont Bookman-DemiBold 14
%%setfont-1 serifBoldItalic 14
%%setfont-2 Times-BoldItalic 16
%%glyph 417 Zecyrillic
%%glyph 418 Iicyrillic

and use this command:

./abcm2ps file.abc -O file.ps

the resulting file shows empty spaces instead of the Cyrillic letters, as before, for both occurrences of each problematic letter. Could you please tell me if I am missing something?

moinejf commented 6 months ago

Hi,

abcm2ps generates the music as it finds it in the ABC flow. In particular, the tune header is generated when the first K: is found, and the music itself is generated at the end of the tune (an empty line).

In your example, you put the Cyrillic glyph definitions after the first K:, so these glyphs are not known when the tune header is generated. You can check that the Cyrillic character rendering works by putting a line such as "^TИXOE OЗEPO"c4| at the end of your example.

Anyway, such definitions must be global, either at the top of the ABC files, or, better, in the file default.fmt.
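The ordering rule above can be checked before running abcm2ps. The following Python sketch is only an illustration, not an abcm2ps feature. It assumes a tune starts at X:, that its header ends at the first K:, and that an empty line ends the tune. The function name misplaced_glyph_lines is hypothetical.

```python
def misplaced_glyph_lines(abc_text):
    """Return (line_number, line) for each %%glyph found after the first K: of a tune."""
    misplaced = []
    in_tune = False      # True once an X: line has been seen
    header_done = False  # True after the first K: of the current tune
    for number, line in enumerate(abc_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            # An empty line ends the tune; the next lines are global again.
            in_tune = header_done = False
        elif stripped.startswith("X:"):
            in_tune, header_done = True, False
        elif in_tune and stripped.startswith("K:"):
            header_done = True
        elif header_done and stripped.startswith("%%glyph"):
            # Too late for the tune header, which was generated at K:.
            misplaced.append((number, stripped))
    return misplaced


if __name__ == "__main__":
    sample = "X:1\nT: test\nK:Bb\n%%glyph 417 Zecyrillic\n"
    for number, line in misplaced_glyph_lines(sample):
        print(f"line {number}: move to the top of the file: {line}")
```

Definitions placed in default.fmt never trigger this check, since they are not part of the ABC file at all.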

grzegorzgrzegorz commented 6 months ago

Thanks for the tip, it works as expected now that I have moved the definitions up. It's a great tool. I have one last question. The only place where I could find glyph names for other letters is https://github.com/adobe-type-tools/agl-aglfn/blob/master/glyphlist.txt, so I suppose I can typeset any, or at least most, Unicode characters. The question is: where are the font files that abcm2ps uses? I assume it is not using system fonts, there are no letters inside the abc2svg.ttf file either, and I cannot find any other font files. Also, is it possible to extract the glyph names from those mysterious font files?
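For reference, glyphlist.txt stores one "glyph name;hex code point" entry per line, with comment lines starting with #. Below is a rough Python sketch of turning that layout into a code point to names table. The sample lines are typed by hand in the same layout, using only names mentioned in this thread, and the function name parse_glyph_list is invented for the example.

```python
def parse_glyph_list(text):
    """Map each code point to the glyph names listed for it."""
    names = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, codes = line.partition(";")
        parts = codes.split()
        if len(parts) != 1:
            continue  # skip names that stand for a sequence of code points
        names.setdefault(int(parts[0], 16), []).append(name)
    return names


# Hand-written sample in the glyphlist.txt layout (not the real file).
SAMPLE = """\
# glyph name;code point in hexadecimal
Zecyrillic;0417
afii10025;0417
Iicyrillic;0418
"""

if __name__ == "__main__":
    for code_point, glyph_names in sorted(parse_glyph_list(SAMPLE).items()):
        print(f"U+{code_point:04X}: {', '.join(glyph_names)}")
```

A character may have several names in the list, and a given font may carry only some of them, so a name found this way is a candidate to try rather than a guarantee.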

moinejf commented 6 months ago

abcm2ps generates PostScript files with characters and music glyphs.

The music glyphs can be either internal (PostScript functions) or external. abc2svg.ttf is a small subset of the SMuFL music glyphs. There are many SMuFL fonts, such as Bravura.

PostScript is a programming language that has graphical functions. PostScript files, such as the ones generated by abcm2ps, must be run by a PostScript interpreter such as Ghostscript. The result of the interpreter is a graphical file (.tiff, .png, .pdf...) that may be directly displayed on a screen.

The PostScript files contain information about which fonts are to be used. For instance, by default, abcm2ps wants the tune titles to be displayed with the font Times-Roman. It is the job of the PostScript interpreter to search the host system for a font compatible with the requested one.

Also, note that the characters you see on your screen have been generated from a system or user font by the program you are running. On a computer, the fonts are stored in well-known directories (/usr/share/fonts on Unix-like systems). The size of a font depends on which subset of the Unicode characters it contains. Usually, this subset is chosen according to the languages of the countries the font is intended for.
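The interpreter step described above can be scripted. This is a small Python sketch of a Ghostscript call that renders the PostScript file to PDF, assuming Ghostscript is installed and its executable is named gs (the Windows console build is usually called gswin64c). The helper name ghostscript_command is made up for the example. The sketch only builds and prints the command.

```python
def ghostscript_command(ps_path, pdf_path, gs="gs"):
    """Build the argument list that asks Ghostscript to render PostScript to PDF."""
    # -sDEVICE=pdfwrite selects PDF output; -o names the output file and
    # makes Ghostscript run without prompting.
    return [gs, "-sDEVICE=pdfwrite", "-o", pdf_path, ps_path]


if __name__ == "__main__":
    cmd = ghostscript_command("file.ps", "file.pdf")
    # Pass cmd to subprocess.run(cmd, check=True) to actually execute it.
    print(" ".join(cmd))
```

Ghostscript resolves font names such as Times-Roman at this point, so a missing letter can come from the substituted font rather than from abcm2ps.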

grzegorzgrzegorz commented 6 months ago

Thanks for such a detailed explanation.