unicode-normalization Search Results

1000+ results for unicode-normalization

open-i18n/rust-unic #20

[normal] Implement isNFC()

Many APIs need to check input for being NFC, and it's something that can be done pretty fast for a majority of cases. Ref: http://www.unicode.org/reports/tr15/#Detecting_Normalization_Forms

behnam updated 5 years ago
10
mozilla/translations #26

Handle soft hyphens with custom normalization tables

Ulrich: >The SentencePiece tokenizer should probably be trained with a custom normalization table (see the SentencePiece documentation) that removes soft hyphens in addition to the existing normaliza…

eu9ene updated 3 weeks ago
4
WICG/file-system-access #115

Handling of non-Unicode handle names

https://wicg.github.io/native-file-system/#dom-filesystemhandle-name is a USVString and can only represent a sequence of Unicode scalar values. But file systems don't respect those rules: - On Linux …

foolip updated 4 years ago
1
jsvine/pdfplumber #905

Add `normalize_unicode=False/True` parameter to text extract…

Per @petermr's suggestion in https://github.com/jsvine/pdfplumber/discussions/904#discussioncomment-6149469, I think it's a good idea to add such a parameter/option, using `unicodedata.normalize(...)`…

jsvine updated 3 months ago
5
ocrmypdf/OCRmyPDF #1282

[Feature]: Choose between NFKC and NFC normalization for Uni…

### Describe the proposed feature `HocrTransform.normalize_text` normalizes text using the NFKC[^1] compatibility algorithm. https://github.com/ocrmypdf/OCRmyPDF/blob/6895c2d70fa03ec4d57e779110e07…

sfllaw updated 7 months ago
5
glic3rinu/passlib #24

support SASLprep in CryptContext

Passlib currently takes in whatever unicode sequence is offered, and hashes it. However, there are unicode normalization issues, non-printing code points (eg SHY) that should be discarded, and many …

GoogleCodeExporter updated 9 years ago
2
danbooru/danbooru #4506

Bad normalization for unicode characters in "Other names" fi…

`(´・ω・`)` is incorrectly normalized - it results in the following: ![image](https://user-images.githubusercontent.com/12946050/84580844-3bbe6d00-addb-11ea-875b-22ef8767fb52.png) For some reason it …

nonamethanks updated 2 years ago
1
helix-editor/nucleo #58

Grapheme handling issue with `\r\n`

There seems to be a grapheme handling ambiguity for strings containing the "windows-style newline" `\r\n`? Since `\r\n` is treated as a single grapheme by the Unicode segmentation crate, the highli…

alexrutar updated 7 hours ago
4
toml-lang/toml #966

Clarify that key uniqueness depends only on binary represent…

I've just learned about #891 and I'm excited to see that the TOML specification is improving Unicode support. Do I understand right that this changeset makes no recommendations for implementers whe…

SnoopJ updated 5 months ago
56
ahmad1702/upload-thing-dark-mode-free #8

Petition spammed with hateful language again

Today the petition was spammed yet again with hateful language: 2.5K petition items were added, apparently by the same person. They've circumvented validation using unicode characters. Perhaps th…

bvpav updated 7 months ago
1