Closed isspek closed 7 years ago
Hi,
Thanks for using GERBIL :smiley:
GERBIL tries to a) check the validity of the URIs in the gold standard and b) retrieve additional URIs for the entities mentioned. This increases the quality of the evaluation. Unfortunately, it also increases the runtime (at least in the first experiment, since the retrieved information is cached for later reuse), and it needs internet access to dereference entity URIs.
There are different ways to handle that. If you do not want to wait for these steps, you can deactivate them as described here. However, this might lead to lower scores for the benchmarked annotation system.
You can also download Lucene indexes for the DBpedia KB, which can be used locally and are much faster than retrieving the information from the net. When running the start.sh script, you are asked whether you want to download the indexes.
Does this answer your question?
Cheers, Michael
Thank you, Michael, for the explanation. To run the project on our Tomcat server, I packaged the master folder as a WAR, so I don't run the start.sh script. If I download the Lucene indexes and use them locally, should I place them in the gerbil_data folder? In other words, what do I need to change in the project to avoid running the script?
Bests,
Ipek.
Next to the gerbil_data directory that you have somewhere, you should create an indexes directory containing the two indexes. The structure could look like the following (where root is the root directory of the project that contains the gerbil_data directory):
root
 |
 +- gerbil_data
 |
 +- indexes
     |
     +- dbpedia
     |
     +- dbpedia_check
The indexes can be downloaded from http://139.18.2.164/mroeder/gerbil/
Please note that when you use the indexes and deactivate the checks for Wikipedia entities, you shouldn't see any network errors during an experiment. If you still see them, feel free to post them here. If it works after the changes you made and you are happy with the solution, please do not forget to close this issue.
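For a WAR deployment where start.sh is never run, it can help to verify the directory layout described above before starting Tomcat. The following is only a minimal sketch and not part of GERBIL: the class name IndexLayoutCheck and its method are hypothetical, and the directory names are the ones from the structure above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GERBIL.
final class IndexLayoutCheck {

    // Directory names taken from the layout described in this thread.
    static final String[] EXPECTED = {
        "gerbil_data",
        "indexes/dbpedia",
        "indexes/dbpedia_check"
    };

    /** Returns the expected directories that are missing below the given root. */
    static List<String> missingDirectories(Path root) {
        List<String> missing = new ArrayList<>();
        for (String relative : EXPECTED) {
            if (!Files.isDirectory(root.resolve(relative))) {
                missing.add(relative);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the project root is passed as the first argument,
        // otherwise the current working directory is used.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingDirectories(root);
        if (missing.isEmpty()) {
            System.out.println("All expected directories found below " + root.toAbsolutePath());
        } else {
            System.out.println("Missing directories: " + missing);
        }
    }
}
```

The check only confirms that the directories exist. It does not validate that they contain usable Lucene indexes.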
@MichaelRoeder I manually installed the indexes and caches on the machine. Now I have the following issue:
WARN [org.aksw.gerbil.dataset.datahub.DatahubNIFLoader] - <Couldn't get any datasets with the gerbil tag from DataHubIO. Exception: org.springframework.web.client.ResourceAccessException: I/O error on GET request for "http://datahub.io/api/1/rest/tag/gerbil": Connection timed out (Connection timed out); nested exception is java.net.ConnectException: Connection timed out (Connection timed out)>
The machine connects through a proxy server, which is why this exception occurs. To fix it, I need to pass proxy settings. Is that possible via the properties file?
This warning can be ignored. It would occur with internet access as well and it has no influence on your results.
@MichaelRoeder This time I received connection timeout exceptions while executing requests on the SpotClient annotators. The modifications below fixed my problem and may serve as a reference for anyone working behind a proxy server:
1- Updated HttpManagement as in the link
2- Added org.aksw.gerbil.annotator.http.HttpManagement.proxyHost=host and org.aksw.gerbil.annotator.http.HttpManagement.proxyPort=port to gerbil.properties
I also changed the DBpedia Spotlight service URL to org.aksw.gerbil.annotator.impl.spotlight.SpotlightAnnotator.ServieURL=http://model.dbpedia-spotlight.org/en/ . This is not related to the proxy problem, but it is worth noting.
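To illustrate how the two properties from step 2 could be read and turned into a proxy object, here is a minimal sketch using only the JDK. It is not the HttpManagement change referenced in step 1 (that code is not reproduced in this thread). The class name ProxyFromProperties and the host and port values are placeholders, and only the two property keys come from the comment above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.Properties;

// Hypothetical helper, not the actual GERBIL HttpManagement implementation.
final class ProxyFromProperties {

    // Property keys as quoted in the comment above.
    static final String HOST_KEY = "org.aksw.gerbil.annotator.http.HttpManagement.proxyHost";
    static final String PORT_KEY = "org.aksw.gerbil.annotator.http.HttpManagement.proxyPort";

    /** Builds an HTTP proxy from the two keys, or Proxy.NO_PROXY if either is missing. */
    static Proxy proxyFrom(Properties props) {
        String host = props.getProperty(HOST_KEY);
        String port = props.getProperty(PORT_KEY);
        if (host == null || port == null) {
            return Proxy.NO_PROXY;
        }
        // createUnresolved avoids a DNS lookup at configuration time.
        InetSocketAddress address =
                InetSocketAddress.createUnresolved(host.trim(), Integer.parseInt(port.trim()));
        return new Proxy(Proxy.Type.HTTP, address);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values; replace them with the real proxy host and port.
        String example = HOST_KEY + "=proxy.example.org\n" + PORT_KEY + "=3128\n";
        Properties props = new Properties();
        props.load(new StringReader(example));
        System.out.println(proxyFrom(props));
    }
}
```

Falling back to Proxy.NO_PROXY when a key is absent keeps the behaviour unchanged for installations that have direct internet access.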
Thanks for sharing your solution. I created a feature request regarding the need for proxy support (#199).
Does your solution work as expected? It would be nice if you could share it with a pull request so we could merge it back to GERBIL.
Yes, it works for me. I will open a pull request in a few days.
Hi,
I am running D2KB experiments with DBpedia Spotlight and a custom annotator on GERBIL installed on a private server. The experiments take too long and I am receiving connection errors such as the one below:
How should I fix it?
Thanks.