Closed neilhwatson closed 8 years ago
I think it'd be more sane if they were all just treated as lower-case, at least for the purposes of finding duplicates. Case-preservation seems difficult though: would older tags take precedence, or newer? Even so, that may be a smaller problem than broken links, which are likely no matter which way this ticket goes.
I'm inclined to agree that they should be lower-case, and that should be the default. Unless someone convinces me right away that it needs to be possible to disable it, we'll likely hold off on that part.
Current behaviour is most obvious in the sorting of very large tag clouds; all-lowercase is fine, or should that be foldcase (always fun with Unicode)?
Yes, it seems like we'd want Unicode fold-case (fc) to test for duplication, which appears to be available as of Perl 5.16. Prior to that, we can just use lc.
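To illustrate why fold-case rather than lc matters for duplicate detection, here is a minimal sketch. The tag strings are made up for the example, and it assumes Perl 5.16 or later so that the fc feature can be enabled.

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);   # fc needs Perl 5.16+

# Two hypothetical spellings of the same tag. "\x{DF}" is the German sharp s,
# written as an escape so the example does not depend on source encoding.
my $tag_a = "Stra\x{DF}e";
my $tag_b = "STRASSE";

# lc only lower-cases: the sharp s stays as it is, so the two strings differ.
my $same_with_lc = lc($tag_a) eq lc($tag_b) ? 1 : 0;

# fc applies full Unicode case folding: the sharp s folds to "ss",
# so both tags fold to the same key.
my $same_with_fc = fc($tag_a) eq fc($tag_b) ? 1 : 0;

print "lc treats them as duplicates: $same_with_lc\n";
print "fc treats them as duplicates: $same_with_fc\n";
```

For plain ASCII tags the two functions give the same result, so falling back to lc on older Perls only changes behavior for tags like these.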
The very large tag cloud management might need to be a different thread, but this issue should at least try not to break current behavior. So it will likely need to be something like:
use feature 'fc'; # fc requires Perl 5.16 or later

# Hash of fc $tag => $tag, so that the first seen tag determines the case
my %seen_tags;
for my $p ( @posts ) {
    $seen_tags{ fc $_ } ||= $_ for @{ $p->tags };
}
my @all_tags = sort values %seen_tags;
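A self-contained sketch of the same idea, runnable without the rest of the application, might look like the following. The unique_tags helper and the sample tag lists are hypothetical and only for illustration. It assumes tags arrive as plain array references rather than post objects, it picks fc when the running Perl has it and lc otherwise, and it uses //= (Perl 5.10+) so that a tag named "0" is not overwritten by a later variant.

```perl
use strict;
use warnings;

# Choose the folding function once. CORE::fc exists from Perl 5.16; the string
# eval keeps older Perls from failing at compile time, and they fall back to lc.
my $fold = $] >= 5.016
    ? eval q{ sub { CORE::fc( $_[0] ) } }
    : sub { lc $_[0] };

# Hypothetical helper: takes array refs of tags (one per post) and returns the
# sorted list of unique tags, keeping the spelling that was seen first.
sub unique_tags {
    my @tag_lists = @_;
    my %seen_tags;    # folded tag => first spelling seen
    for my $tags (@tag_lists) {
        $seen_tags{ $fold->($_) } //= $_ for @$tags;
    }
    my @all_tags = sort values %seen_tags;
    return @all_tags;
}

# Made-up sample data: the second post repeats two tags in a different case.
my @all_tags = unique_tags( [ 'Perl', 'Web' ], [ 'perl', 'web', 'Blog' ] );
print join( ', ', @all_tags ), "\n";
```

Because the first spelling wins, the result depends on the order in which posts are visited, so that order would need to be stable for the tag cloud to stay consistent between builds.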
This is a breaking change: The URLs for all tags are now lower-cased, so any links to tag pages made before this will be broken.
I noticed the tag cloud is case-sensitive. Perhaps tags should all be lower-cased; could this be an option?