Open Zoher15 opened 3 years ago
This has a simple answer, thankfully. The knowledge base (Wikipedia) they used to generate their pre-trained models reflects an Aug. 2019 view of the world, as they explain in their papers, so entities that only appeared later, such as COVID-19, are not in it. The solution is to retrain the models on a more recent Wikipedia dump. They don't make data loading or training on fresh Wikipedia dumps very straightforward, and there's little documentation, but it can be done with the building blocks they have provided throughout the repo.
Hi all,
You can refer to my answer https://github.com/facebookresearch/BLINK/issues/106#issuecomment-1014507351 regarding adding new entities and generating embeddings for them.
Thanks!
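To make the idea of "adding new entities and generating embeddings for them" concrete, here is a minimal toy sketch. It is not BLINK's code: the `top_k` helper, the 3-dimensional vectors, and the scores are all made up for illustration, and a real setup would get the embeddings from the trained entity encoder. It only shows why a retriever can never return an entity that is missing from its candidate index, and how appending one embedding changes the result.

```python
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product, used here as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: List[float], index: Dict[str, List[float]], k: int = 1) -> List[str]:
    """Return the k entity titles whose embeddings score highest against the query."""
    scored = sorted(((dot(query, vec), title) for title, vec in index.items()), reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical candidate index built from an old knowledge base snapshot.
# The vectors are invented; they do not come from any real model.
index = {
    "Severe acute respiratory syndrome-related coronavirus": [0.9, 0.1, 0.0],
    "Coombs test": [0.1, 0.9, 0.0],
}

# Hypothetical embedding of the mention "covid 19".
mention = [0.7, 0.0, 0.7]

# The right entity is absent, so the closest available one is returned instead.
before = top_k(mention, index)

# Add the missing entity with its (again invented) embedding and query again.
index["COVID-19"] = [0.6, 0.0, 0.8]
after = top_k(mention, index)

print(before)
print(after)
```

Before the new entry is added, the best match is the SARS-related coronavirus entity; afterwards it is 'COVID-19'. The same reasoning applies to the real index: the entity has to be in the candidate set, with an embedding, before it can be linked.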
This seems really odd. "covid 19" is resolved to several different entities, such as 'Severe acute respiratory syndrome-related coronavirus', 'Indiana vesiculovirus', 'Coombs test', 'Caprine arthritis encephalitis virus', and 'CORC', but never to 'COVID-19', as it is titled on Wikipedia. Is there any way to fix this, given the importance of the entity?