italiaremote / awesome-italia-remote

A list of remote-friendly or full-remote companies that targets Italian talents.
MIT License
2.33k stars, 305 forks

Tags normalization #401

Open FrancescoManfredi opened 4 months ago

FrancescoManfredi commented 4 months ago

Many tags refer to the same concept but differ in wording, casing, or styling.
It might be a good idea to add a normalization pipeline for the tags of each company.
Here is a mapping from original to normalized tags, written as a Python dict (easily convertible to any other format), that might be useful as a starting point: https://github.com/FrancescoManfredi/AIRV-analysis/blob/main/tags_repl.py
I'm the author of that mapping, and you're welcome to use it in any way you prefer.
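
To make the idea concrete, here is a minimal sketch of what such a normalization step could look like. The `TAG_REPLACEMENTS` entries and the `normalize_tags` helper are hypothetical and only illustrate the approach; they are not copied from `tags_repl.py`. The sketch also assumes that lookups should ignore case and surrounding whitespace, which may not match how the real mapping is keyed.

```python
# Hypothetical excerpt: illustrative entries, NOT the contents of tags_repl.py.
# Keys are assumed to be lowercase so lookups can ignore casing.
TAG_REPLACEMENTS = {
    "golang": "Go",
    "go": "Go",
    "reactjs": "React",
    "react.js": "React",
    "react": "React",
}


def normalize_tags(tags, replacements=TAG_REPLACEMENTS):
    """Map each tag to its canonical form, dropping duplicates but keeping order."""
    normalized = []
    seen = set()
    for tag in tags:
        cleaned = tag.strip()
        # Unknown tags pass through unchanged (apart from trimmed whitespace).
        canonical = replacements.get(cleaned.casefold(), cleaned)
        if canonical not in seen:
            seen.add(canonical)
            normalized.append(canonical)
    return normalized


print(normalize_tags(["Golang", "go", " ReactJS ", "Kubernetes"]))
# prints ['Go', 'React', 'Kubernetes']
```

Tags missing from the mapping are left as they are, so the mapping can grow incrementally without breaking existing entries.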

edoardocostantinidev commented 4 months ago

Hi @FrancescoManfredi, first of all thanks for your input and the blog post! Super fascinating. I agree this is an issue that can be fixed relatively easily. We'll probably convert your mapping and integrate it into our Go validator/generator so we stick to a single language.

I personally don't have much time these days to tackle the issue, but if no one picks it up by mid-May I'll try to tackle it myself.