dreamyguy / gitlogg

💾 🧮 🤯 Parse the 'git log' of multiple repos to 'JSON'
MIT License

Replace `§` with a character that's nearly never used #3

Closed dreamyguy closed 8 years ago

dreamyguy commented 8 years ago

It turns out § was a poor choice as a replacement for \n and \r: whenever it appears in strings that come from user input, it breaks the output.
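To make the failure concrete, here is a minimal sketch of the collision. It is not gitlogg's actual code: the `encode` and `decode` helpers and the sample messages are hypothetical, and the sketch only assumes that line breaks are swapped for a placeholder and restored later.

```python
PLACEHOLDER = "§"  # stand-in for line breaks, as described in this issue


def encode(message: str, placeholder: str = PLACEHOLDER) -> str:
    """Flatten a message onto one line by replacing \\n and \\r (hypothetical helper)."""
    return message.replace("\n", placeholder).replace("\r", placeholder)


def decode(flat: str, placeholder: str = PLACEHOLDER) -> str:
    """Restore line breaks from the placeholder (hypothetical helper)."""
    return flat.replace(placeholder, "\n")


clean = "Fix parser\nHandle empty input"
dirty = "Update §3 of the docs\nAdd examples"  # user input already contains §

# A message without the placeholder survives the round trip.
assert decode(encode(clean)) == clean

# A message that already contains § does not: the original § is
# indistinguishable from an encoded line break and comes back as "\n".
assert decode(encode(dirty)) != dirty
```

Any placeholder has this weakness; a rarer character only makes the collision less likely.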

While attempting to parse 456 repositories I came across many occurrences of the usual delimiters in the subject placeholder (commit message): |, ^, ~ and a few others. They showed up even though these punctuation characters rank pretty low in usage, according to http://www.wired.com/2013/08/the-rarity-of-the-ampersand/.

A possible solution would be to pick a very rarely used character from a reference like https://en.wikipedia.org/wiki/Letter_frequency

dreamyguy commented 8 years ago

I landed on ò, which according to that wiki page is the least used character among the listed languages (it only occurs in 0.002% of Italian).

If it fails someday, use something like ī or try one of the more unusual punctuation marks listed on https://en.wikipedia.org/wiki/Punctuation.
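Rather than waiting for a failure, the candidates could be checked against the actual commit messages before parsing. The following is only a sketch of that idea, not something gitlogg does: `pick_placeholder` and the candidate list are hypothetical, and the NFC normalization is an assumption to make sure accented characters compare equal regardless of how they were typed.

```python
import unicodedata
from typing import Iterable, Optional, Sequence

# Candidates in the order suggested in this thread (assumption: extend as needed).
CANDIDATES = ["ò", "ī"]


def pick_placeholder(
    messages: Iterable[str], candidates: Sequence[str] = CANDIDATES
) -> Optional[str]:
    """Return the first candidate that appears in none of the messages, or None."""
    seen = set()
    for message in messages:
        # Normalize so that "o" + combining grave accent counts as "ò".
        seen.update(unicodedata.normalize("NFC", message))
    for candidate in candidates:
        if candidate not in seen:
            return candidate
    return None


messages = ["Fix §3 of the docs", "però funziona"]  # second message contains ò
placeholder = pick_placeholder(messages)  # falls back to the next candidate
```

If every candidate occurs in the input, the function returns `None`, which signals that the list needs another character.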