arp242 / uni

Query the Unicode database from the commandline, with good support for emojis
MIT License

Possible to add support for detecting fonts containing the specified character? #1

Closed khughitt closed 4 years ago

khughitt commented 4 years ago

This might be outside the scope of what you have in mind, and there may not be sufficient libraries in Go to implement it in a reasonable amount of time (it would also probably only work on Linux, or require different solutions for different platforms), but it would be really nice if it were possible to also detect fonts that contain the queried character, e.g. something like the Perl example from this Polybar wiki:

use strict;
use warnings;
use Font::FreeType;
my ($char) = @ARGV;
foreach my $font_def (`fc-list`) {
    my ($file, $name) = split(/: /, $font_def);
    my $face = Font::FreeType->new->face($file);
    my $glyph = $face->glyph_from_char($char);
    if ($glyph) {
        print $font_def;
    }
}

Just a thought..

arp242 commented 4 years ago

My first thought would be to say that this is out-of-scope for uni. I think a "query font information from the commandline" would be a neat tool, but I'm not sure if it should be integrated in here. While Unicode and fonts are related, they're quite different domains.

With a hypothetical glyph tool you could do something like:

$ uni -q p U+2042 | glyph

I don't know if something like that exists, but it shouldn't be too hard to modify your Perl example to read from stdin.
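As a rough illustration of what such a stdin-reading filter could look like, here is a minimal sketch in POSIX sh rather than Perl. Everything about it is an assumption: the `glyph` name is the hypothetical tool mentioned above, `extract_codepoint` is a helper invented for this example, each input line is assumed to contain a `U+XXXX` token somewhere, and fontconfig's `fc-list` is assumed to be installed and to accept a `:charset=` pattern.

```shell
#!/bin/sh
# Hypothetical "glyph" filter: for each input line, pick out the first
# U+XXXX token and ask fontconfig which font families cover it.

# extract_codepoint LINE
# Prints the hex digits of the first U+XXXX token in LINE, lowercased.
# Returns 1 if the line contains no such token.
extract_codepoint() {
    cp=$(printf '%s\n' "$1" | grep -Eo 'U\+[0-9A-Fa-f]{4,6}' | head -n 1)
    [ -n "$cp" ] || return 1
    printf '%s\n' "${cp#U+}" | tr 'A-F' 'a-f'
}

# glyph
# Reads lines from stdin; lines without a U+XXXX token are skipped.
glyph() {
    while IFS= read -r line; do
        hex=$(extract_codepoint "$line") || continue
        printf 'U+%s:\n' "$hex"
        # Assumption: fc-list matches fonts whose charset includes $hex.
        fc-list ":charset=$hex" family | sort -u
    done
}
```

With the functions loaded, something like `printf 'U+2042\n' | glyph` would list the matching families. Like the Perl version, this only looks at single code points.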

You can get the same information with fc-cat by the way, which includes the charset ranges.

Maybe I'll write the glyph tool, or maybe I'll change my mind and add it to uni anyway ... I'll have a think about it. We'll need configurable columns first anyway.

khughitt commented 4 years ago

@arp242 That is totally reasonable. Having that sort of functionality would be very useful for configuring polybar, powerline, and similar applications, which tend to use Unicode glyphs that are often highly font-specific. That doesn't mean uni is necessarily the best place to support such a feature, though.

Thanks for the suggestion regarding fc-cat! That is one of the fontconfig tools I hadn't looked at before.

Sounds good though -- I'll leave the issue open for now.

arp242 commented 4 years ago

I looked a bit at this today, and as far as I can tell it's kinda hard to get a good overview of this. One of the bigger problems is that I can't figure out how to determine if a font supports a particular emoji: I can only check individual codepoints, but many emojis consist of more than one codepoint.

Adding a font parsing library would be too much IMO, not least because many of them only deal with one particular font format, so supporting TrueType, OpenType, Type1, raster fonts, etc. would require adding a whole list of libraries. I also can't really figure out from the docs how to get this information, and many of the Go libraries seem unfinished and unmaintained.

In short, the entire thing would be a project in itself. It's extremely unlikely I'll ever do this, so I figure I might as well close this.

I did manage to cook up a little script for fc-cat (but it doesn't deal with emojis); not very polished but I figured I might as well post it here. If someone can figure out how to get the emoji data from this too then I don't mind adding it to the repo, but I'd rather not add something half-working.

#!/bin/sh

set -euC

font="$(fc-cat 2>&1 | grep -Ei '^"dejavusansmono\.ttf[^"]*"')"
# Print with '%s' so a '%' in the cache line is not taken as a format directive.
charset="$(printf '%s\n' "$font" | grep -Eo 'charset=[a-f0-9 -]+')"
charset="${charset#charset=}"

char=2e22
#char=2e1f
#char=6e1f

char_dec=$(printf '%d' "0x$char")
IFS=" "
for c in $charset; do
    case "$c" in
        *-*)
            start="$(printf '%d' 0x${c%-*})"
            end="$(printf '%d' 0x${c#*-})"
            if [ $char_dec -ge $start ] && [ $char_dec -le $end ]; then
                echo "U+$char is supported by this font"
                exit 0
            fi
            ;;
        *)
            if [ "$char" = "$c" ]; then
                echo "U+$char is supported by this font"
                exit 0
            fi
            ;;
    esac
done

echo "U+$char is NOT supported by this font"
exit 1
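For reuse, the range check in the script above could be pulled into a function. This is a sketch only: the names `in_charset` and `all_in_charset` are made up for this example, and the charset is assumed to be the space-separated list of hex code points and `start-end` ranges that the script extracts. It compares numerically in both branches, so the case of the hex digits no longer matters.

```shell
#!/bin/sh
# Sketch: the charset range check, wrapped in (hypothetical) functions.

# in_charset HEX CHARSET
# Succeeds if code point HEX (no "U+" prefix) falls inside CHARSET, a
# space-separated list of hex code points and "start-end" ranges.
in_charset() {
    cp_dec=$(printf '%d' "0x$1")
    for item in $2; do
        case "$item" in
            *-*)
                lo=$(printf '%d' "0x${item%-*}")
                hi=$(printf '%d' "0x${item#*-}")
                ;;
            *)
                lo=$(printf '%d' "0x$item")
                hi=$lo
                ;;
        esac
        if [ "$cp_dec" -ge "$lo" ] && [ "$cp_dec" -le "$hi" ]; then
            return 0
        fi
    done
    return 1
}

# all_in_charset CHARSET HEX...
# Succeeds only if every listed code point is inside CHARSET.
all_in_charset() {
    charset=$1
    shift
    for cp in "$@"; do
        in_charset "$cp" "$charset" || return 1
    done
}
```

`all_in_charset` is at best a necessary condition for a multi-codepoint emoji: a font can contain every code point of a sequence and still lack a combined glyph for it, so this does not solve the emoji problem described above.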