KSP-CKAN / NetKAN-Infra

NetKAN Infrastructure Repo
MIT License

Exclude underscores from SpaceDock identifiers #259

Closed HebaruSan closed 2 years ago

HebaruSan commented 2 years ago

Problem

KSP-CKAN/NetKAN#8987 was submitted with underscores in its identifier, which caused an inflation error when the validator script ran.

Cause

The SpaceDock Adder sanitizes identifiers by replacing every match of the regexp \W+ with the empty string. Underscores count as word characters, so \W+ never matches them and they are left in the identifier:

https://docs.python.org/3/library/re.html#regular-expression-syntax

\w For Unicode (str) patterns: Matches Unicode word characters; this includes most characters that can be part of a word in any language, as well as numbers and the underscore. If the ASCII flag is used, only [a-zA-Z0-9_] is matched.

\W Matches any character which is not a word character. This is the opposite of \w. If the ASCII flag is used this becomes the equivalent of [^a-zA-Z0-9_]. If the LOCALE flag is used, matches characters which are neither alphanumeric in the current locale nor the underscore.
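
To make the quoted behavior concrete, here is a small sketch using Python's standard re module. The mod name is made up for illustration, and the substitution call only mirrors the description above; it is not copied from the Adder's source.

```python
import re

# Hypothetical mod name, chosen only for illustration.
name = "My_Cool Mod-2!"

# The underscore is a word character, so \w matches it.
underscore_is_word_char = re.fullmatch(r'\w', '_') is not None

# \W+ removes the space, the hyphen and the exclamation mark,
# but leaves the underscore in place.
stripped = re.sub(r'\W+', '', name)

print(underscore_is_word_char)  # True
print(stripped)                 # My_CoolMod2
```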

Changes

Now we eliminate both \W and _ by putting them in the same bracket expression. (Yes, you can put \W inside a [] expression.)
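
A minimal sketch of what such a pattern could look like, assuming the bracket expression is written as [\W_]+. The helper name sanitize_identifier and the sample inputs are hypothetical and are not taken from the Adder's code.

```python
import re

def sanitize_identifier(name: str) -> str:
    """Hypothetical helper: drop every non-word character and every underscore."""
    # Inside [], \W keeps its meaning, so [\W_] matches
    # "any non-word character OR an underscore".
    return re.sub(r'[\W_]+', '', name)

print(sanitize_identifier("My_Cool Mod-2!"))  # MyCoolMod2
```

Letters and digits pass through unchanged, so an identifier that was already clean is not altered by the stricter pattern.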