TL;DR: This is also a standing call for ideas (both on the logic for linking the data and on an interface to show the tags). If you want to contribute ideas back to the main repository, consider the non-English tags and how to make it easier for people to contribute. The license (public domain) allows you to create your own fork or new project without even citing the original one, and we're OK with that.
Related:
Revision of categorization of topics #2
Very basic usage of Esperanto Language and minimal optimization for machine translation #3
Making the linked data of A/IS Ethics Tags simpler is hard. This pinned issue is a single place to explain why (and to be referenced from other issues) for anyone who does not want to read everything. More information may be added later, but for now:
Challenges for any project using only one language
Even within a single language, the same tag can refer to different concepts
This is especially true for short tags
Challenges for a project to link tags from multiple languages
A tag in one language can mean the same concept in another language
This can be seen as good (or at least not the worst case), since people sometimes want to search for content in different languages
It can sometimes lead people to "block" users or content producers, even when they are producing relevant content
But, to be fair, we can't blame users or force them to avoid these tags.
Also, when a tag has the same meaning in a different language, the user might be interested in that type of content even without automatic translation aid
A tag in one language can also mean a different A/IS Ethics concept in another language
This is one of the main reasons the linked data is hard to simplify
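The three situations described above (a tag used once, a tag shared across languages with the same meaning, and a tag whose meaning changes) can be told apart mechanically once tags are stored with a language and a concept. The following is only a minimal sketch: the `(tag, language, concept)` layout, the function name `classify_tags`, and the sample entries are all hypothetical and do not come from the repository's data.

```python
from collections import defaultdict

# Hypothetical sample entries: (tag, language code, concept identifier).
# These are placeholders, not real A/IS Ethics Tags data.
SAMPLE_ENTRIES = [
    ("tag_a", "lang1", "concept-1"),
    ("tag_a", "lang2", "concept-1"),  # same tag, same concept, other language
    ("tag_b", "lang1", "concept-2"),
    ("tag_b", "lang2", "concept-3"),  # same tag, different concept
    ("tag_c", "lang1", "concept-4"),
]


def classify_tags(entries):
    """Label each tag as 'unique', 'shared', or 'conflict'."""
    concepts_by_tag = defaultdict(set)
    languages_by_tag = defaultdict(set)
    for tag, language, concept in entries:
        key = tag.lower()  # most platforms ignore letter case
        concepts_by_tag[key].add(concept)
        languages_by_tag[key].add(language)

    labels = {}
    for tag, concepts in concepts_by_tag.items():
        if len(concepts) > 1:
            # One tag, several concepts (within or across languages).
            labels[tag] = "conflict"
        elif len(languages_by_tag[tag]) > 1:
            # One tag, one concept, several languages.
            labels[tag] = "shared"
        else:
            labels[tag] = "unique"
    return labels


if __name__ == "__main__":
    for tag, label in sorted(classify_tags(SAMPLE_ENTRIES).items()):
        print(tag, label)
```

Only the "conflict" group needs human attention; the "shared" group is the case described above as not the worst one.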
Challenges related to the alphabets of the languages used in the tags
Note: at the time of this post, only the following platforms were considered, and not every one was fully tested: Facebook | GitHub | Instagram | LinkedIn | Medium | Pinterest | Reddit | Twitter | YouTube
The ideal scenario
Most platforms only behave well in the ideal scenario: tags made only of digits (0-9) and lowercase letters (a-z)
Note: not fully tested, but some platforms are likely not to work perfectly if a tag starts with a number
Most platforms accept uppercase characters and treat them as their lowercase equivalents
Some platforms also allow the dash character (-)
Instagram is an example of one that does not work with dashes.
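The conservative rules above can be expressed as a small check. This is a sketch under the assumptions stated in this section (letters a-z and digits 0-9 only, no leading digit, no dash, uppercase treated as lowercase); the name `is_portable_tag` is hypothetical, and the check says nothing about what any specific platform actually enforces.

```python
import re

# Conservative pattern: starts with a letter, then letters or digits only.
_PORTABLE_TAG = re.compile(r"[a-z][a-z0-9]*")


def is_portable_tag(tag: str) -> bool:
    """Return True if the tag fits the most conservative scenario."""
    candidate = tag.lstrip("#")
    if not candidate.isascii():
        # Accented or non-Latin characters fall outside the ideal scenario.
        return False
    # Uppercase is accepted because most platforms fold it to lowercase.
    return _PORTABLE_TAG.fullmatch(candidate.lower()) is not None


if __name__ == "__main__":
    for example in ["#AIethics", "#ai-ethics", "#2019ethics", "#joão"]:
        print(example, is_portable_tag(example))
```

A tag that fails this check may still work on some platforms; the check only flags tags that are not safe everywhere.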
Usage of accents in the Latin alphabet
Some platforms treat an accented letter the same as the unaccented one (e.g. `#joão` meaning the same as `#Joao`). This behavior is not consistent.
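Since platforms differ, one option is to fold accents on our side before comparing tags, so that two spellings are grouped regardless of how a platform treats them. The sketch below uses Python's standard `unicodedata` module; the function name `fold_accents` is hypothetical, and the folding is our own choice, not a description of what any platform does.

```python
import unicodedata


def fold_accents(tag: str) -> str:
    """Return the tag lowercased with combining accents removed."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", tag)
    # Drop the combining marks and keep the base letters.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return stripped.lower()


if __name__ == "__main__":
    print(fold_accents("#joão") == fold_accents("#Joao"))
```

This only helps for Latin-script accents; it does not address non-Latin alphabets.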
Usage of non-Latin alphabets
Note: help needed here!
Note on 2019-04-09 (creation of the issue): for now this issue does not list all the challenges; it is just a starting point for a central place to discuss them.