facebookresearch / BLINK

Entity Linker solution
MIT License

ELQ cannot detect covid 19 #86

Open Zoher15 opened 3 years ago

Zoher15 commented 3 years ago

This seems really odd. "covid 19" is resolved to several different entities, such as 'Severe acute respiratory syndrome-related coronavirus', 'Indiana vesiculovirus', 'Coombs test', 'Caprine arthritis encephalitis virus', and 'CORC', but never to 'COVID-19', which is its title on Wikipedia. Is there any way to fix this, given the importance of the entity?

shellshock1911 commented 3 years ago

Thankfully, this has a simple answer. The knowledge base (Wikipedia) used to generate the pre-trained models reflects an Aug. 2019 view of the world, as the authors explain in their papers, so 'COVID-19' is not among the entities the models can link to. The solution, then, is to retrain the models on a more recent Wikipedia dump. The repo doesn't make data loading or training on fresh Wikipedia dumps very straightforward, and there's little documentation, but it can be done with the building blocks provided throughout the repo.
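
To make the point concrete, here is a toy sketch (not BLINK or ELQ code) of why a linker that picks the nearest entity embedding can only ever return titles that exist in its knowledge base. The vectors, the `link` helper, and the `kb_2019` / `kb_updated` dictionaries are all made up for illustration; real embeddings would come from the trained encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def link(mention_vec, entity_vecs):
    """Return the KB title whose embedding is most similar to the mention."""
    return max(entity_vecs, key=lambda title: cosine(mention_vec, entity_vecs[title]))


# Hypothetical 3-d embeddings for a KB snapshot that predates COVID-19.
kb_2019 = {
    "Severe acute respiratory syndrome-related coronavirus": [0.9, 0.1, 0.0],
    "Coombs test": [0.1, 0.9, 0.0],
    "Indiana vesiculovirus": [0.6, 0.0, 0.4],
}

# Hypothetical embedding of the mention "covid 19".
mention = [0.7, 0.0, 0.7]

# With the old KB, the best match is necessarily some other entity.
before = link(mention, kb_2019)

# After the entity is added (with an assumed embedding), it can be retrieved.
kb_updated = dict(kb_2019)
kb_updated["COVID-19"] = [0.7, 0.1, 0.7]
after = link(mention, kb_updated)

print("old KB:", before)
print("updated KB:", after)
```

Whichever route is taken (retraining on a newer dump, or adding embeddings for new entities), the entity has to be present in the candidate set before the model can return it.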

abhinavkulkarni commented 2 years ago

Hi all,

See my answer at https://github.com/facebookresearch/BLINK/issues/106#issuecomment-1014507351 on adding new entities and generating embeddings for them.

Thanks!