gitenberg-dev / gitberg

A command-line tool for interacting with books in git
https://www.gitenberg.org/
GNU General Public License v3.0

Create better repo names for non-ASCII titles #77

Closed tfmorris closed 8 years ago

tfmorris commented 10 years ago

Since GitHub apparently can't handle non-ASCII characters in repo names, there's no point in sending them: the result is just nonsense like https://github.com/GITenberg/------------------------------------------------------------13--------------------------_31434

Some suggestions:

sethwoodworth commented 10 years ago

Yes!

I want to minimize the number of times I change repo names, so I want to make sure we get this right in the next push.

Additional points:

sethwoodworth commented 10 years ago

I've spoken with GitHub about Unicode repo names and it's not likely to happen. GitLab also doesn't support Unicode names (but at least it warns you and tells you what the allowed characters are).

tfmorris commented 10 years ago

I think I covered both those points in the original issue, but it's good to know that non-ASCII repo names aren't in the cards.

tfmorris commented 10 years ago

Oops, my original post makes less sense than intended because the XML encoded entity got swallowed by the renderer. Before the (sic) should appear [ampersand character]#13;

eshellman commented 8 years ago

https://github.com/gitenberg-dev/gitberg/blob/master/gitenberg/book.py#L103

For 31434, which is one of the worst, the new method generates something like: 0x39b0x3bf0x3b30x3bf0x3b9-0x3a60x3b90x3bb0x3b90x3c00x3c00x3b90x3ba0x3bf0x3b9-0x3980x3b50x3bf0__31854 which is not much better, but avoids the long string of underscores. I've not gone back to regenerate the badly named repos; I can do it later if there's need. See also https://github.com/gitenberg-dev/gitberg/pull/94
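
For readers following along, here is a minimal sketch of how a name of that shape could be produced. It is not the code at the link above: the function name `hex_encode_title` and the exact character rules are assumptions made for illustration. ASCII letters and digits are kept, other letters become their hexadecimal code points, and everything else collapses to a single hyphen.

```python
import re
import unicodedata

# Characters assumed safe to keep verbatim in a repo name (an assumption).
ASCII_OK = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
)


def hex_encode_title(title):
    """Return an ASCII-only slug (hypothetical helper, not gitberg's API)."""
    out = []
    # NFD splits accented letters into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", title):
        if ch in ASCII_OK:
            out.append(ch)
        elif unicodedata.category(ch).startswith("L"):
            # Non-ASCII letter: emit its code point, e.g. U+039B -> "0x39b".
            out.append(hex(ord(ch)))
        elif unicodedata.category(ch).startswith("M"):
            # Combining mark left over from NFD: drop it.
            continue
        else:
            # Whitespace, punctuation, control characters: separator.
            out.append("-")
    # Collapse runs of separators and trim them from both ends.
    return re.sub(r"-+", "-", "".join(out)).strip("-")
```

Under these assumptions, a plain ASCII title such as "Moby Dick" comes out as `Moby-Dick`, while a title written entirely in Greek turns into one `0x...` group per letter, which is why the result above is still hard to read.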

tfmorris commented 8 years ago

The new code looks like an improvement, but is there a reason not to use the alternative title in this case, since it's available?

https://github.com/GITenberg/------------------------------------------------------------13--------------------------_31434/blob/master/pg31434.rdf#L39

If not available, and you get too few usable characters (threshold tbd) from a non-ISO Latin-1 script, another alternative might be to try something like https://pypi.python.org/pypi/transliterate/1.7.6

At least the description field stores the real title, so that can help the user out: https://github.com/search?q=user%3AGITenberg+31434&type=Repositories
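
The fallback order suggested above could be sketched roughly as follows. Everything here is an assumption for illustration: `MIN_USABLE`, `usable_chars`, and `pick_title` are hypothetical names, the threshold value is arbitrary (the comment leaves it TBD), and `romanize` stands in for any transliteration callable, for instance a wrapper around the transliterate package.

```python
import unicodedata

MIN_USABLE = 4  # hypothetical threshold; the thread leaves the value open


def usable_chars(text):
    """Count ASCII letters and digits that survive accent stripping."""
    decomposed = unicodedata.normalize("NFD", text)
    return sum(1 for ch in decomposed if ord(ch) < 128 and ch.isalnum())


def pick_title(title, alt_title=None, romanize=None):
    """Choose the string a repo name should be built from.

    Order: the main title if it has enough usable characters, then the
    alternative title, then a caller-supplied transliteration. If none
    applies, the original title is returned so the caller can fall back
    to whatever encoding it already uses.
    """
    if usable_chars(title) >= MIN_USABLE:
        return title
    if alt_title and usable_chars(alt_title) >= MIN_USABLE:
        return alt_title
    if romanize is not None:
        return romanize(title)
    return title
```

Passing the transliteration step in as a callable keeps the sketch free of third-party imports and lets the caller decide which language pack, if any, applies to a given book.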

eshellman commented 8 years ago

Using the alternative title would indeed be better in this case, and would motivate redoing the repo. Opened a targeted issue, #96.