initstring / linkedin2username

OSINT Tool: Generate username lists for companies on LinkedIn
MIT License
1.25k stars 185 forks

Recommendation for Name Parentheticals and Certificates #78

Open Menn1s opened 8 months ago

Menn1s commented 8 months ago

First off, really appreciate this tool and its regular updates!

I've run into a couple of general issues:

  1. Users who put an alternative name, rather than a certification, in parentheses. I've noticed that, more often than not, certifications are simply tacked on at the end of the name instead.
  2. Long lists of certifications at the ends of names being taken as last names. This results in usernames like bob.cfa, dave.cba, or john.pmp.

Parenthetical names

I have some really bad code dealing with the parenthetical names, but it butchers the existing code and moves some of the cleanup into the write function (which, if this is something you think is worth implementing, I would want to spend time cleaning up). So far it does this:

  1. If there are only 3 name parts and the middle one is parenthetical, assume the middle part is an alternative first name (I see this a lot with people who have non-English names). The result is two alternative email options with different first names, e.g. Robert (Bob) Clark becomes robert.clark and bob.clark.
  2. If there are 2 name parts and the first has a parenthetical in it, it will break the name up and create two emails as well, e.g. Robert(Bob) Clark becomes robert.clark and bob.clark.
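For illustration, the two rules above could be sketched in Python roughly as follows. This is not code from linkedin2username: the function name `first_last_variants`, the two regex constants, and the fallback for names that match neither rule are all hypothetical, and only the first.last format is produced.

```python
import re

# Hypothetical helper patterns, not part of linkedin2username.
# A standalone parenthetical part, e.g. "(Bob)".
STANDALONE = re.compile(r"^\(([^()]+)\)$")
# A first name with an attached parenthetical, e.g. "Robert(Bob)".
ATTACHED = re.compile(r"^([^()]+)\(([^()]+)\)$")


def first_last_variants(full_name):
    """Return first.last usernames, one per alternative first name."""
    parts = full_name.split()
    firsts, last = [], None

    if len(parts) == 3:
        # Rule 1: "Robert (Bob) Clark" -> first names Robert and Bob.
        middle = STANDALONE.match(parts[1])
        if middle:
            firsts, last = [parts[0], middle.group(1)], parts[2]
    elif len(parts) == 2:
        # Rule 2: "Robert(Bob) Clark" -> first names Robert and Bob.
        attached = ATTACHED.match(parts[0])
        if attached:
            firsts, last = [attached.group(1), attached.group(2)], parts[1]

    if last is None:
        # Assumed fallback: neither rule applied, so use first and last part.
        if len(parts) < 2:
            return []
        firsts, last = [parts[0]], parts[-1]

    return [f"{first}.{last}".lower() for first in firsts]


print(first_last_variants("Robert (Bob) Clark"))
print(first_last_variants("Robert(Bob) Clark"))
```

Both calls should yield robert.clark and bob.clark, matching the examples in the list. Names with more parts, or with a parenthetical in another position, fall through to the assumed fallback and would need their own handling.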

Certifications

I have been collecting and using a list of certs I have run into in the past for a while, but I found it is pretty effective to use a regex that finds runs of 2+ consecutive capital letters and deletes them along with any non-word characters around them. I use this in vim: %s/\W*[A-Z]\{2,}\W*//g
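A rough Python equivalent of that vim substitution might look like the sketch below. The function name `strip_certs` is hypothetical and not part of the tool. One difference to be aware of: Python's `\W` is Unicode-aware for `str` patterns, so accented letters count as word characters, whereas vim's `\W` is ASCII-only.

```python
import re

# Two or more consecutive capital letters, plus surrounding non-word characters.
CERT_PATTERN = re.compile(r"\W*[A-Z]{2,}\W*")


def strip_certs(name):
    """Remove runs of 2+ capital letters (likely certifications) from a name."""
    return CERT_PATTERN.sub("", name).strip()


print(strip_certs("Bob Clark, PMP, CFA"))
print(strip_certs("Dave Smith MBA"))
```

As with the vim version, this is a heuristic: a name written entirely in capitals would be removed completely, so it may be worth skipping names where nothing would be left.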

I'm not sure what direction you might want to take with this; I understand there's no perfect solution and people do all sorts of crazy things with their LinkedIn names.

initstring commented 6 months ago

Hi @Menn1s - thanks for opening an issue! Sorry it's taken me so long to reply. This is a nice detailed writeup, and it makes a lot of sense to me.

If you submit PRs for them, I would be happy to review. It would be great if you could implement tests here as well, to make sure everything is working.

Thanks!