bbolli / tumblr-utils

Utilities for dealing with Tumblr blogs, Tumblr backup
GNU General Public License v3.0
667 stars 124 forks

Normalize tags before saving #159

Open WyohKnott opened 5 years ago

WyohKnott commented 5 years ago

I have many posts tagged with real names, in which I sometimes used upper case letters and sometimes did not. The issue is that on the tag index pages these tags are treated as different: for example, "Todd Hido" is different from "todd hido". Would it be possible to normalize every tag beforehand by making them all lower case?
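To illustrate the request, here is a minimal sketch of case-insensitive tag grouping. It assumes Python 3 and is not taken from tumblr_backup.py; the function name `group_tags_case_insensitively` is hypothetical. Rather than forcing everything to lower case, it keys each tag on its case-folded form and keeps the most frequently used spelling for display.

```python
from collections import Counter, defaultdict


def group_tags_case_insensitively(tags):
    """Map each case-folded tag to the spelling that was used most often.

    Hypothetical helper for illustration; not part of tumblr_backup.py.
    """
    spellings = defaultdict(Counter)
    for tag in tags:
        # Collapse runs of whitespace so "Todd  Hido" and "Todd Hido" agree.
        cleaned = " ".join(tag.split())
        if cleaned:
            spellings[cleaned.casefold()][cleaned] += 1
    # Pick the most common spelling of each group as its display name.
    return {key: counts.most_common(1)[0][0] for key, counts in spellings.items()}


tags = ["Todd Hido", "todd hido", "Todd Hido", "landscape"]
index = group_tags_case_insensitively(tags)
```

With this approach the index page would show a single "Todd Hido" entry, and the dictionary key could serve as the normalized name for the index file.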

aspensmonster commented 5 years ago

Fuzzy matching might help in this case. Tags with sufficiently similar strings could be grouped together, while the underlying tags themselves stay exposed.

Though I can't immediately think of a decent way to test all combinations within the tag set. Tags can be quite diverse on Tumblr.
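One possible way to sketch the fuzzy grouping idea is a greedy pass using `difflib.SequenceMatcher` from the Python standard library. This is only an illustration under assumptions: the function name `group_similar_tags` and the 0.85 threshold are made up here, and each tag is compared against one representative per existing group rather than against every other tag, so it avoids testing all combinations at the cost of being order-dependent.

```python
from difflib import SequenceMatcher


def group_similar_tags(tags, threshold=0.85):
    """Greedily group tags whose case-folded text is similar enough.

    Hypothetical helper; the threshold is an assumed value that would
    need tuning against a real tag set.
    """
    groups = []  # each group is a list; its first element is the representative
    # Sort deterministically so the result does not depend on set ordering.
    for tag in sorted(set(tags), key=lambda t: (t.casefold(), t)):
        for group in groups:
            matcher = SequenceMatcher(None, group[0].casefold(), tag.casefold())
            if matcher.ratio() >= threshold:
                group.append(tag)  # keep the original spelling visible
                break
        else:
            groups.append([tag])  # no similar group found, start a new one
    return groups


groups = group_similar_tags(["Todd Hido", "todd hido", "Tod Hido", "landscape"])
```

Each group still lists the original tags, so an index page could show one entry per group and expose the individual spellings underneath it. A threshold that is too low would merge unrelated short tags, which is the main risk with a diverse tag set.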

thisismycontributionaccount commented 5 years ago

I was having a similar problem with "/" in tags, as well as with upper and lower case tags, so I modified tumblr_backup.py to quote and lowercase tags. Let me run a few more tests and I will try to add the code.
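The actual patch was not posted in this comment, so the following is only a rough sketch of what quoting and lowercasing a tag could look like, assuming Python 3 (`urllib.parse.quote`; Python 2 exposes the same function as `urllib.quote`). The function name `tag_to_filename` is hypothetical.

```python
from urllib.parse import quote


def tag_to_filename(tag):
    """Return a lowercased, percent-encoded form of a tag.

    Hypothetical helper; passing safe="" makes quote() encode "/" as well,
    so the result cannot be mistaken for a nested path.
    """
    return quote(tag.strip().lower(), safe="")


slash_tag = tag_to_filename("Art/Photography")
name_tag = tag_to_filename("Todd Hido")
```

Because the encoding is reversible with `urllib.parse.unquote`, the lowercased tag text can be recovered for display, although the original capitalization is lost.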