Open Hevia opened 1 year ago
We can enrich the KG with additional knowledge from Wikipedia and Wikispecies. It would help to find methods for automatically identifying mentions of these entities in text.
If we stay on the current route of multiple entity extractors (see the RE issue #12), we also need a ground-truth entity labeling method, and we need to ensure that all entities resolve to whichever Wikipedia labeler we end up using.
Models to eval: https://huggingface.co/RJuro/SciNERTopic
More ground truth sources:
I can recommend entity-fishing or DBpedia Spotlight.
DBpedia Spotlight: this repo holds information on how to build it in a VM. https://github.com/dbpedia-spotlight/spotlight-docker
After deployment, SSH into the VM via the GCP VM button.
Install Docker: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-debian-10
$ sudo apt update
$ sudo apt install apt-transport-https ca-certificates curl gnupg2 software-properties-common
$ curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
$ sudo apt update
$ apt-cache policy docker-ce
$ sudo apt install docker-ce
$ sudo systemctl status docker
https://github.com/dbpedia-spotlight/spotlight-docker
$ docker run -tid --restart unless-stopped --name dbpedia-spotlight.en --mount source=spotlight-model,target=/opt/spotlight -p 2222:80 dbpedia/dbpedia-spotlight spotlight.sh en
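Once the container is up, the service can be queried over HTTP. A minimal sketch, assuming the container is reachable on localhost:2222 (the host port mapped in the `docker run` command above) and serves the `/rest/annotate` endpoint described in the spotlight-docker README. The `annotate` helper, the `SPOTLIGHT_URL` variable, and the 0.35 default confidence are illustrative choices, not part of Spotlight itself:

```shell
# Assumption: Spotlight listens on host port 2222, as mapped by `-p 2222:80`.
SPOTLIGHT_URL="${SPOTLIGHT_URL:-http://localhost:2222/rest/annotate}"

# Hypothetical helper: annotate TEXT [CONFIDENCE]
# Sends the text to Spotlight and asks for a JSON response.
annotate() {
  curl -s "$SPOTLIGHT_URL" \
    --data-urlencode "text=$1" \
    --data "confidence=${2:-0.35}" \
    -H "Accept: application/json"
}
```

For example, `annotate "Panthera leo is native to Africa" 0.5` would send that sentence with a stricter confidence threshold. Raising the threshold should yield fewer but more reliable links, which may matter if these annotations are used as ground truth.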
entity-fishing: the build docs hold information on how to build it in a VM. https://nerd.readthedocs.io/en/latest/build.html
After deployment, SSH into the VM via the GCP VM button.
Install Git and a JDK: https://www.digitalocean.com/community/tutorials/how-to-install-git-on-debian-10 https://linuxize.com/post/install-java-on-debian-10/
These packages are needed to build grobid
$ git clone https://github.com/kermitt2/grobid.git --branch 0.7.1
$ cd grobid
$ ./gradlew clean install
$ git clone https://github.com/kermitt2/grobid-ner.git
$ cd grobid-ner
$ ./gradlew copyModels
$ ./gradlew clean install
$ cd ..
$ cd ..
Install unzip, then install entity-fishing. Background on unzip, downloading with curl, and keeping a process running after the SSH session disconnects: https://linuxize.com/post/how-to-unzip-files-in-linux/ www.compciv.org/recipes/cli/downloading-with-curl/ https://unix.stackexchange.com/questions/479/keep-processes-running-after-ssh-session-disconnects
$ git clone https://github.com/kermitt2/entity-fishing.git
$ cd entity-fishing
$ curl https://science-miner.s3.amazonaws.com/entity-fishing/0.0.5/db-kb.zip --output db-kb.zip
$ curl https://science-miner.s3.amazonaws.com/entity-fishing/0.0.5/db-en.zip --output db-en.zip
$ sudo apt install unzip
$ unzip db-kb.zip -d data/db/
$ unzip db-en.zip -d data/db/
$ ./gradlew clean build
$ nohup ./gradlew run &
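With the service running, a text can be sent for disambiguation against Wikidata/Wikipedia. A minimal sketch, assuming entity-fishing listens on its default port 8090 and accepts a multipart `query` field at `/service/disambiguate`, as described in its REST API docs. The `build_query` and `disambiguate` helpers and the `EF_URL` variable are hypothetical names, and the JSON escaping here only handles backslashes and double quotes (not newlines or control characters):

```shell
# Assumption: default entity-fishing port and endpoint; adjust if configured otherwise.
EF_URL="${EF_URL:-http://localhost:8090/service/disambiguate}"

# Hypothetical helper: build the JSON query for an English text.
# Escapes backslashes and double quotes so the text stays valid JSON.
build_query() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text": "%s", "language": {"lang": "en"}}' "$escaped"
}

# Hypothetical helper: disambiguate TEXT
# --form-string sends the field literally, so characters such as ';' or a
# leading '@' in the text are not interpreted by curl.
disambiguate() {
  curl -s "$EF_URL" --form-string "query=$(build_query "$1")"
}
```

For example, `disambiguate "Panthera leo is native to Africa"` would post that sentence as the query text. If the response carries Wikipedia or Wikidata identifiers for each entity, those could serve as the common target that the other extractors resolve to.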
https://github.com/facebookresearch/BLINK