Howdju / howdju

Monorepo for the Howdju crowdsourced fact checking and summarization platform
https://www.howdju.com
GNU Affero General Public License v3.0

Improve text normalization for non-ASCII #442

Open carlgieringer opened 1 year ago

carlgieringer commented 1 year ago

To achieve this we'd probably have to do something like the following:

carlgieringer commented 1 year ago

We could probably have different normalizations for quotes and propositions. Quotes need to be more literal, so their normalization must preserve things like emoji. Propositions tend to be more generic, so two propositions probably should not count as different just because of, say, an emoji.
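
As a rough illustration of that split, here is a minimal sketch of two normalizers, written in Python for brevity. The function names, the choice of NFC for quotes and NFKC plus case folding for propositions, and the emoji code point ranges are all assumptions made for this example; none of it reflects Howdju's existing code, and the emoji ranges are deliberately incomplete.

```python
import unicodedata

# Rough, illustrative emoji ranges (an assumption, not a complete list).
_EMOJI_RANGES = (
    (0x1F300, 0x1FAFF),  # pictographs, emoticons, transport symbols, and neighbouring blocks
    (0x2600, 0x27BF),    # miscellaneous symbols and dingbats
)
_EMOJI_JOINERS = {0x200D, 0xFE0F}  # zero-width joiner, variation selector-16


def _is_emoji_like(ch: str) -> bool:
    """Return True if ch falls in one of the illustrative emoji ranges."""
    cp = ord(ch)
    return cp in _EMOJI_JOINERS or any(lo <= cp <= hi for lo, hi in _EMOJI_RANGES)


def _collapse_whitespace(text: str) -> str:
    """Trim the text and replace each run of whitespace with one space."""
    return " ".join(text.split())


def normalize_quote(text: str) -> str:
    """Literal normalization: keep emoji and case; unify composed forms and whitespace."""
    return _collapse_whitespace(unicodedata.normalize("NFC", text))


def normalize_proposition(text: str) -> str:
    """Generic normalization: fold compatibility forms and case, then drop emoji."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    without_emoji = "".join(ch for ch in folded if not _is_emoji_like(ch))
    return _collapse_whitespace(without_emoji)
```

With this sketch, a proposition with and without a trailing emoji normalizes to the same string, while the same two texts stay distinct as quotes. Whether propositions should also be case-folded is a separate design question that this example simply assumes.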