themashcodee / slack-blocks-to-jsx

A library for converting Slack blocks to JSX, which can be used with React.
https://www.npmjs.com/package/slack-blocks-to-jsx

Support for emoji skin tone variations #16

Closed StephenTangCook closed 2 months ago

StephenTangCook commented 3 months ago

Emojis can have modifier sequences, most commonly used for skin-tone variations. An emoji can have up to six variants: the unmodified default plus five skin tones. Here's an example for thumbs-up:

  "thumbsup": "👍",
  "thumbsup::skin-tone-2": "👍🏻",
  "thumbsup::skin-tone-3": "👍🏼",
  "thumbsup::skin-tone-4": "👍🏽",
  "thumbsup::skin-tone-5": "👍🏾",
  "thumbsup::skin-tone-6": "👍🏿",

Slack supports changing your default skin tone for emojis, so a full emoji with modifier sequence could appear in a message.
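
For illustration, here is a minimal sketch of how the variants in the list above are formed: the base emoji followed by one of the five Unicode skin-tone modifiers (U+1F3FB through U+1F3FF). The helper name `applySkinTone` and the lookup table are made up for this example and are not part of the library. Plain appending only holds for simple single-code-point emoji like thumbs-up; ZWJ sequences and emoji with variation selectors need the full data set.

```typescript
// Slack-style tone names mapped to the Unicode skin-tone modifiers
// (U+1F3FB..U+1F3FF). "skin-tone-1" does not exist: the default has no modifier.
const SKIN_TONE_MODIFIERS = new Map<string, string>([
  ['skin-tone-2', '\u{1F3FB}'],
  ['skin-tone-3', '\u{1F3FC}'],
  ['skin-tone-4', '\u{1F3FD}'],
  ['skin-tone-5', '\u{1F3FE}'],
  ['skin-tone-6', '\u{1F3FF}'],
]);

// Hypothetical helper: appends the modifier to a simple base emoji.
// Assumption: the base is a single code point (e.g. U+1F44D thumbs-up).
function applySkinTone(baseEmoji: string, tone: string): string {
  const modifier = SKIN_TONE_MODIFIERS.get(tone);
  // Unknown tone names leave the emoji unchanged (the default variant).
  return modifier === undefined ? baseEmoji : baseEmoji + modifier;
}

console.log(applySkinTone('\u{1F44D}', 'skin-tone-2'));
```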

StephenTangCook commented 3 months ago

In one of my projects we have a script that just creates a static mapping of all possible emoji variation shortnames from emoji-datasource (which gets its data from emoji-data, the same emoji library Slack uses). You're free to reuse the logic if it helps!

Essentially we have the scripts:

  "scripts": {
    "build": "tsup src/index.ts",
    "copy-emoji-data": "cp node_modules/emoji-datasource/emoji.json dist/emoji.json",
    "create-emoji-mapping": "pnpm copy-emoji-data && tsup src/create-emoji-mapping.ts && ts-node dist/create-emoji-mapping.cjs"
  },

And the create-emoji-mapping.ts:

import * as emojiConvertor from 'emoji-js';
import * as fs from 'fs-extra';
import * as path from 'path';

type EmojiEntry = {
  name: string;
  unified: string;
  non_qualified: string | null;
  docomo: string | null;
  au: string | null;
  softbank: string | null;
  google: string | null;
  image: string;
  sheet_x: number;
  sheet_y: number;
  short_name: string;
  short_names: string[];
  text: string | null;
  texts: string[] | null;
  category: string;
  subcategory: string;
  sort_order: number;
  added_in: string;
  has_img_apple: boolean;
  has_img_google: boolean;
  has_img_twitter: boolean;
  has_img_facebook: boolean;
  skin_variations?: Record<string, SkinVariation>;
};

type SkinVariation = {
  unified: string;
  non_qualified: string | null;
  image: string;
  sheet_x: number;
  sheet_y: number;
  added_in: string;
  has_img_apple: boolean;
  has_img_google: boolean;
  has_img_twitter: boolean;
  has_img_facebook: boolean;
};

type EmojiData = EmojiEntry[];

type EmojiShortnameMapping = Record<string, string>;

console.log('Creating emoji short_name mapping...');

// Get the emoji source data
const emojiJsonPath = path.resolve('dist', 'emoji.json');
console.log(`Loading emoji data file at '${emojiJsonPath}'...`);
const emojiData: EmojiData = fs.readJsonSync(emojiJsonPath);

// Get the emoji converter
const emojiConverter = new emojiConvertor.EmojiConvertor();
emojiConverter.replace_mode = 'unified';

/**
 * Converts a string representation of unified code points to the characters themselves (e.g. "1F3FB-1F3FC" => \u{1F3FB}\u{1F3FC})
 * @param unified - the unified string representation (e.g. "1F3FB-1F3FC")
 * @returns the unified character (e.g. \u{1F3FB}\u{1F3FC}) as a string
 */
function unifiedStringToCodePoint(unified: string): string {
  return unified
    .split('-')
    .map((unifiedSegment) => String.fromCodePoint(parseInt(unifiedSegment, 16)))
    .join('');
}

/**
 * Converts a code point to a pretty stringified version (e.g. "\u{1F3FB}\u{1F3FC}")
 * @param codePoint - the code point to convert
 * @returns the pretty stringified version
 */
function codePointToPrettyString(codePoint: string): string {
  // Array.from iterates by code point, so surrogate pairs stay intact
  // (split('') would break astral characters into two UTF-16 code units)
  return Array.from(codePoint)
    .map((segment) => {
      const value = segment.codePointAt(0);
      if (value !== undefined) {
        const str = value.toString(16).toUpperCase();
        return `\\u{${str}}`;
      }
      return segment;
    })
    .join('');
}

// Get the skintone variations (e.g. "1F3FE" => "skin-tone-5" )
const skintoneVariationsMapping: Record<string, string> = {};
emojiData
  .filter(
    (emojiEntry: EmojiEntry) =>
      emojiEntry.category && emojiEntry.subcategory === 'skin-tone'
  )
  .forEach((skintoneEntry: EmojiEntry) => {
    skintoneVariationsMapping[skintoneEntry.unified] = skintoneEntry.short_name;
  });

// Iterate through the emoji data and create a mapping
const emojiShortnameMapping: EmojiShortnameMapping = {};
emojiData.forEach((emojiEntry: EmojiEntry) => {
  emojiEntry.short_names.forEach((shortname: string) => {
    // start with default shortname (no modifier) and add variations as needed
    const emojiShortnameToUnifiedMap: Record<string, string> = {
      [shortname]: unifiedStringToCodePoint(emojiEntry.unified)
    };

    // add any skintone variations
    if (emojiEntry.skin_variations) {
      const skinVariations: string[] = Object.keys(emojiEntry.skin_variations);
      skinVariations.forEach((skinVariation: string) => {
        // NOTE: The skin-tone variation can include multiple hyphen-separated values
        // which we need to convert to a single shortname variation
        // e.g. "1F3FB-1F3FC" => "::skin-tone-2-3"
        const skinVariationShortname = skinVariation
          .split('-')
          .map((skinVariationUnifiedSegment, index) => {
            const skinVariationShortname =
              skintoneVariationsMapping[skinVariationUnifiedSegment]; // "1F3FB" => "skin-tone-2"
            if (index > 0) {
              // for multiple segments we'll remove the prefix, e.g. "skin-tone-2" => "-2"
              return skinVariationShortname.replace('skin-tone', '');
            } else {
              return skinVariationShortname;
            }
          })
          .join('');

        const shortnameVariant = `${shortname}::${skinVariationShortname}`;

        const variationInfo = emojiEntry.skin_variations?.[skinVariation];
        if (variationInfo?.unified) {
          // console.log('variationInfo.unified', variationInfo.unified);
          emojiShortnameToUnifiedMap[shortnameVariant] =
            unifiedStringToCodePoint(variationInfo.unified);
        }
      });
    }

    // look up the emoji for each shortname variation
    Object.entries(emojiShortnameToUnifiedMap).forEach(
      ([shortname, emojiUnified]) => {
        try {
          const emoji = emojiConverter.replace_unified(emojiUnified);
          emojiShortnameMapping[shortname] = emoji;
        } catch (e) {
          // TODO (known bug): this fails for certain skintone variations
          // see https://github.com/iamcal/js-emoji/issues/191
          console.error(
            `Error converting emoji with shortname '${shortname}' and unified '${codePointToPrettyString(emojiUnified)}': ${e}`
          );
        }
      }
    );
  });
});

// Save to output file
const outputFile = 'emoji-shortcode-mapping.json';
const outputFilePretty = 'emoji-shortcode-mapping_pretty.json';
fs.writeJsonSync('src/' + outputFile, emojiShortnameMapping);
fs.writeJsonSync('src/' + outputFilePretty, emojiShortnameMapping, {
  spaces: 2
});

console.log(
  `Emoji shortcode mapping (${
    Object.keys(emojiShortnameMapping).length
  } entries) saved to src/${outputFile}`
);

StephenTangCook commented 2 months ago

Note: I just updated the script above with some skin-tone variation bug fixes, in particular for compound emojis that carry more than one skin-tone modifier (e.g. two_women_holding_hands::skin-tone-2-5: "👩🏻‍🤝‍👩🏾").
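
To make the compound case concrete, here is a rough sketch that walks an emoji string by code point and rebuilds a suffix like "skin-tone-2-5" from the modifiers it finds. The helper `skinToneSuffix` is hypothetical and only illustrates the naming scheme; it is not the logic the script uses, which reads the variation keys from emoji-datasource instead.

```typescript
// Skin-tone modifier code points and the digit used in the shortname suffix.
const TONE_DIGIT_BY_CODE_POINT = new Map<number, string>([
  [0x1f3fb, '2'],
  [0x1f3fc, '3'],
  [0x1f3fd, '4'],
  [0x1f3fe, '5'],
  [0x1f3ff, '6'],
]);

// Hypothetical helper: returns e.g. "skin-tone-2-5", or null when the emoji
// contains no skin-tone modifier at all.
function skinToneSuffix(emoji: string): string | null {
  const digits: string[] = [];
  // Array.from walks the string by code point, not by UTF-16 code unit.
  for (const ch of Array.from(emoji)) {
    const digit = TONE_DIGIT_BY_CODE_POINT.get(ch.codePointAt(0) as number);
    if (digit !== undefined) {
      digits.push(digit);
    }
  }
  return digits.length > 0 ? `skin-tone-${digits.join('-')}` : null;
}

// Woman (tone 2) + ZWJ + handshake + ZWJ + woman (tone 5)
const twoWomen = '\u{1F469}\u{1F3FB}\u{200D}\u{1F91D}\u{200D}\u{1F469}\u{1F3FE}';
console.log(skinToneSuffix(twoWomen));
```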

Here are the output files if you just want to use them :) emoji-shortcode-mapping_pretty.json, emoji-shortcode-mapping.json
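
In case it helps, here is one possible way to consume such a mapping, as a sketch only. It assumes shortcodes show up in message text as `:name:` optionally followed by `:skin-tone-N:` (or `:skin-tone-N-M:`), matching the `name::skin-tone-N` keys in the mapping. `replaceShortcodes`, `lookup`, and the small inline `sampleMapping` are invented for the example; a real consumer would load the generated JSON file instead.

```typescript
type EmojiShortnameMapping = Record<string, string>;

// Tiny inline sample standing in for the generated JSON file.
const sampleMapping: EmojiShortnameMapping = {
  thumbsup: '\u{1F44D}',
  'thumbsup::skin-tone-2': '\u{1F44D}\u{1F3FB}',
};

// Matches ":name:" optionally followed by ":skin-tone-N:" or ":skin-tone-N-M:".
const SHORTCODE_PATTERN = /:([a-z0-9_+'-]+):(?::(skin-tone-[2-6](?:-[2-6])?):)?/g;

// Own-property lookup so names like "constructor" never hit the prototype.
function lookup(mapping: EmojiShortnameMapping, key: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(mapping, key) ? mapping[key] : undefined;
}

// Hypothetical helper: replaces known shortcodes, leaves unknown ones as-is.
function replaceShortcodes(text: string, mapping: EmojiShortnameMapping): string {
  return text.replace(SHORTCODE_PATTERN, (match: string, name: string, tone?: string) => {
    if (tone !== undefined) {
      const variant = lookup(mapping, `${name}::${tone}`);
      if (variant !== undefined) {
        return variant;
      }
    }
    // No tone given, or the variant is missing: fall back to the base emoji.
    const base = lookup(mapping, name);
    return base !== undefined ? base : match;
  });
}

console.log(replaceShortcodes('Nice :thumbsup::skin-tone-2: work :thumbsup:', sampleMapping));
```

Falling back to the base emoji when a variant is missing is a design choice for this sketch; leaving the raw shortcode in place would be an equally reasonable behavior.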

themashcodee commented 2 months ago

@StephenTangCook Thank you so much for the file, this is super helpful.

themashcodee commented 2 months ago

@StephenTangCook resolved in v0.3.8

StephenTangCook commented 2 months ago

Awesome! I swear I'll open source the emoji list conversion as an npm package one day when I find the time! 😭