Open Phyks opened 6 years ago
Something similar was suggested in #96 (equivalency terms). I'm not sure this falls within the scope of Shaarli - but maybe there's a KISS way to implement it. Edit: sorry #968
In the meantime (and because I often have the same problem with singular/plural tags), you can go to Tag cloud > Alphabetical or Most used, which helps spot near-duplicates (Photo and Photos will be next to each other) and typos (a gmaes tag with 1 item...).
Hope this helps, let me know if it should be better documented.
Oh, indeed, thanks for the tip!
Sorry, I completely missed #96 when searching for similar issues :/
This could be achieved with language-specific dictionaries plus stemming, and/or a Natural Language Processing approach using a lexical database.
I've worked with WordNet and Python libraries like NLTK and spaCy to address similar needs, but I'm not sure there are such tools available for (easy) integration in a PHP application.
The most straightforward approach might be to implement this (as a first step) as a command-line utility to python-shaarli-client.
@nodiscc I think you made a mistake, #96 doesn't seem to be related.
@virtualtam Without going to the usage of language-specific lexical databases, PHP provides built-in functions to calculate the similarity/distance between strings, such as similar_text and levenshtein. We could easily make a very basic function to detect similar strings.
However, I'm not sure how it should work in the UI. Maybe another block in the ?do=changetag page?
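To make the string-distance idea concrete, here is a minimal sketch in Python rather than PHP (PHP's built-in levenshtein would replace the hand-written function). The names close_tag_pairs and max_distance are hypothetical and only serve the illustration; nothing here is existing Shaarli code.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def close_tag_pairs(tags, max_distance=1):
    """Hypothetical helper: pairs of distinct tags within max_distance edits.

    The comparison is case-insensitive, which is an assumption of this sketch.
    """
    unique = sorted(set(tags))
    return [
        (left, right)
        for left, right in combinations(unique, 2)
        if levenshtein(left.lower(), right.lower()) <= max_distance
    ]
```

With a threshold of 1 this would catch Photo/Photos; a transposition typo such as gmaes for games counts as two edits, so it only shows up with max_distance=2, at the cost of more false positives on short tags.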
This 3rd party API seems to be pretty straightforward for synonyms: https://www.datamuse.com/api/
Moved from #1310
When adding a new link and assigning tags to it, I often wonder whether I already have a different tag conveying the same idea. It always bothers me that I could end up with several different tags for the same purpose.
Could we imagine a way to define synonyms for tags, so that when I type a synonym in the tag field of the add link or edit link page, the corresponding existing tag appears and I can choose it instead?
Another way would be a mapping engine that uses an external source to display those suggestions automatically, based on the meaning of words. But I guess we're talking about a much bigger effort in that case.
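For illustration, the user-defined synonym idea could look roughly like the sketch below. It is written in Python and assumes a plain alias-to-tag mapping maintained by the user; suggest_tags and the mapping format are hypothetical, not something Shaarli provides.

```python
def suggest_tags(typed, existing_tags, synonyms):
    """Hypothetical helper: existing tags to propose for what the user typed.

    `synonyms` is assumed to map a lowercase alias to an existing tag,
    e.g. {"picture": "photo"}.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    # Regular prefix completion on the existing tags.
    suggestions = [tag for tag in existing_tags if tag.lower().startswith(needle)]
    # If the typed word is a known alias, also propose the tag it points to.
    canonical = synonyms.get(needle)
    if canonical in existing_tags and canonical not in suggestions:
        suggestions.append(canonical)
    return suggestions
```

Typing picture with an existing photo tag and the alias picture -> photo would then propose photo instead of silently creating a new tag.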
I don't know. I understand the use case, but maintaining a synonyms database within Shaarli might be a bit overkill.
Hi,
Thanks for the work you are putting in maintaining this fork of Shaarli! I have been using it for a couple of years now and realize I have been using slightly different tags for the same category over the course of years.
For instance, sometimes I have "Photo", sometimes I have "Photos", resulting in two different tags with very close spelling.
So, I've just had the idea of adding a new "merge close tags" feature which would try to detect close tags and list them with an option to merge them together. What do you think about it?
For the close tags detection, I think something as simple as stemming might actually be enough.
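As a rough sketch of what I mean (in Python for brevity, and with a deliberately naive suffix-stripping rule standing in for a real stemmer such as Porter's), tags could be grouped by stem and every group with more than one member offered for merging. The function names are hypothetical.

```python
from collections import defaultdict


def naive_stem(tag):
    """Very crude English plural stripping; NOT a real stemming algorithm."""
    word = tag.lower()
    if len(word) > 4 and word.endswith("ies"):
        return word[:-3] + "y"   # galleries -> gallery
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]         # photos -> photo
    return word


def merge_candidates(tags):
    """Group tags by naive stem and keep only groups with several members."""
    groups = defaultdict(set)
    for tag in tags:
        groups[naive_stem(tag)].add(tag)
    return {
        stem: sorted(members)
        for stem, members in groups.items()
        if len(members) > 1
    }
```

This rule misses plurals like boxes and knows nothing about other languages, so a proper per-language stemmer would be needed for anything beyond the Photo/Photos case.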