WebReflection / emoji-essential

An emoji dictionary directly from https://unicode.org/
ISC License

Sanitize names #1

Open jsejcksn opened 4 years ago

jsejcksn commented 4 years ago

The names of recently-added emoji include indicator symbols:

Example: https://github.com/WebReflection/emoji-essential/blob/6402a83ce9a9c071dfc93ce1ca2385eeb27f9052/index.js#L9

It looks like you're scraping these. Unicode offers a "machine-readable" text format, if you're interested. You don't seem to be scraping keywords or other data that is only available in the HTML pages, so it might be a better fit for this project.
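For illustration, here is a minimal sketch of reading one data line from such a file. It assumes the format meant here is Unicode's `emoji-test.txt`, whose data lines look like `code points ; status # emoji E<version> name`. The function name `parseEmojiTestLine` is hypothetical and not part of this project.

```javascript
// Hypothetical helper: parse one data line of Unicode's emoji-test.txt.
// Assumed line layout: "<hex code points> ; <status> # <emoji> E<version> <name>"
function parseEmojiTestLine(line) {
  const match =
    /^([0-9A-F ]+?)\s*;\s*([a-z-]+)\s*#\s*(\S+)\s+E(\d+\.\d+)\s+(.+)$/.exec(line);
  // Comment lines (starting with "#") and blank lines do not match.
  if (!match) return null;
  const [, codePoints, status, emoji, version, name] = match;
  return {
    codePoints: codePoints.split(/ +/), // e.g. ["1F1FA", "1F1F8"]
    status,                             // e.g. "fully-qualified"
    emoji,
    version,
    name: name.trim()
  };
}

// Usage: comment lines yield null, data lines yield an object.
const sample = '1F600 ; fully-qualified # 😀 E1.0 grinning face';
console.log(parseEmojiTestLine(sample));
```

Names read this way carry no indicator symbol, so no extra sanitizing step should be needed for them.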

If you are interested in adding keywords to the dictionary, this page might interest you.


Off-topic, but mildly related: if you prefer Electron scraping to Puppeteer, care to share why? I've never seen someone use Electron for that purpose, and I'm interested.

rdela commented 3 months ago

Yes, as @jsejcksn wrote, the ⊛ character is not part of the name: “Recently-added emoji are marked by a ⊛ in the name” (Full Emoji List, v15.1).

@WebReflection, are you interested in implementing this change, or open to a PR? I am using https://github.com/WebReflection/emoji-short-name via the accessible-emoji component in Tugboat, and the ⊛ in the names is a big drag here: it defeats the whole purpose of trying to increase access. Cc @zachleat

UPDATE: I added a workaround in webcbed #5. (EDIT: and webcbed #6) Current code

UPDATE 2: It seems that Mac speech ignores ⊛, but I am sure some devices read “CIRCLED ASTERISK OPERATOR” out loud.
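For anyone needing a stopgap, a minimal sketch of stripping the marker from a name is below. It assumes the marker is always U+229B (CIRCLED ASTERISK OPERATOR), optionally followed by whitespace. The name `sanitizeName` is hypothetical; this is not necessarily how the linked workarounds do it.

```javascript
// Hypothetical helper: remove the "recently added" marker (U+229B, ⊛)
// and any whitespace right after it, then trim what is left.
function sanitizeName(name) {
  return name.replace(/\u229B\s*/g, '').trim();
}

// Usage: names without the marker pass through unchanged.
console.log(sanitizeName('\u229B shaking face'));
console.log(sanitizeName('grinning face'));
```

Applying this once when the dictionary is generated would keep the marker out of every consumer, including screen readers.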