SuLab / GeneWikiCentral

GeneWiki Organization
MIT License

filter out "none" as a gene alias #119

Open andrewsu opened 4 years ago

andrewsu commented 4 years ago

Per this bug report, the Gene bot added "None" as a gene alias. Our bot should filter aliases like this out. More generally, I'm sure there are other such nonsensical aliases, so we should plan to have a list of blacklisted aliases to easily incorporate similar examples we encounter in the future.

andrawaag commented 4 years ago

Reopening this for now. I made a fix by introducing not_worth_adding, a set of strings that should not be added as aliases. It currently contains:

not_worth_adding = { "none", "None", "gene", "Gene" }

The list can be extended at https://github.com/SuLab/scheduled-bots/blob/master/scheduled_bots/geneprotein/__init__.py#L31
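For illustration, a filter built on such a set might look roughly like the sketch below. This is not the code from the linked file: the function name filter_aliases is hypothetical, and the whitespace stripping and case-insensitive comparison are assumptions added here (the set above lists both capitalizations explicitly). Only not_worth_adding and its four values come from this thread.

```python
# Values as quoted in this thread; extend the set as new junk aliases turn up.
not_worth_adding = {"none", "None", "gene", "Gene"}


def filter_aliases(aliases, blocked=not_worth_adding):
    """Return aliases with blocked or empty entries removed, order preserved.

    Hypothetical helper: the real bot may apply the set differently.
    """
    # Assumption: compare case-insensitively so "NONE" is caught as well.
    blocked_lower = {entry.lower() for entry in blocked}
    kept = []
    for alias in aliases:
        if alias is None:
            # A missing value is skipped before it can be stringified to "None".
            continue
        cleaned = str(alias).strip()
        if not cleaned or cleaned.lower() in blocked_lower:
            continue
        kept.append(cleaned)
    return kept


if __name__ == "__main__":
    print(filter_aliases(["BRCA1", "None", " gene ", None, "RNF53"]))
```

Skipping a missing value before converting it to a string is one way a literal "None" alias could be kept out at the source, though whether that is how the original bug arose is not stated here.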

The issue was closed automatically when it was referenced in the commit of the fix. I'm reopening it until subsequent Jenkins runs confirm it's working.

andrewsu commented 4 years ago

related issue (and fix) on protein aliases: https://github.com/SuLab/scheduled-bots/issues/8