dice-group / gerbil

GERBIL - General Entity annotatoR Benchmark
GNU Affero General Public License v3.0

deny the annotator #431

Closed Lisilin87 closed 6 months ago

Lisilin87 commented 1 year ago

GERBIL rejects my annotator without any error output, so I don't know how to debug it. The web page: image The 1234 port shows: image The 1235 port shows: image

My annotator follows EntQA. If the annotator returns '[]' (an empty list), I get the same web page. How can I solve this?

MichaelRoeder commented 1 year ago

Thank you for using GERBIL to benchmark your approach.

I am not sure whether I understand your problem correctly. How do you know that GERBIL does not accept your annotator? From your screenshots, everything looks fine.

Lisilin87 commented 1 year ago

Sorry, when I saw the '×', I subconsciously thought GERBIL had refused my annotator. Now I can run the experiment successfully, but I still encounter some problems. I run my entity linking system only on AIDA-CONLL test-B, but my annotator still receives nothing after 1 hour, and the 1234 port gives me some warnings: image Is this normal? Is this error the same as #137?

MichaelRoeder commented 1 year ago

I think you are facing two issues at once.

1. CoNLL File Format

The CoNLL files are typically tab-separated files. I assume that your file has been edited by a misconfigured program that added quotation marks " around IRIs that contain a comma, e.g., http://en.wikipedia.org/wiki/Washington,_D.C. would be transformed into "http://en.wikipedia.org/wiki/Washington,_D.C.". GERBIL reads these values but does not expect the quotation marks and, hence, is not able to interpret this information.

However, since I don't know your file, I suggest opening it and checking whether there are quotation marks around some of the IRIs.
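If the file is too large to check by eye, a small script can do the search. The following is only a minimal sketch: it assumes a plain tab-separated text file and flags every field that is wrapped in double quotes and starts with `http://` or `https://`. The function name `find_quoted_iris` and the sample rows are made up for illustration and do not reflect the real column layout of the AIDA/CoNLL file or any GERBIL code.

```python
def find_quoted_iris(lines):
    """Return (line_number, field) pairs for tab-separated fields that
    are wrapped in double quotes and look like an IRI."""
    hits = []
    for line_number, line in enumerate(lines, start=1):
        # Split on tabs only; commas inside an IRI must not split a field.
        for field in line.rstrip("\r\n").split("\t"):
            is_quoted = (
                len(field) >= 2
                and field.startswith('"')
                and field.endswith('"')
            )
            if is_quoted and field[1:-1].startswith(("http://", "https://")):
                hits.append((line_number, field))
    return hits


# Hypothetical sample rows, not taken from the real dataset.
SAMPLE_LINES = [
    'Washington\tB\t"http://en.wikipedia.org/wiki/Washington,_D.C."\n',
    "Berlin\tB\thttp://en.wikipedia.org/wiki/Berlin\n",
]

for line_number, field in find_quoted_iris(SAMPLE_LINES):
    print(f"line {line_number}: {field}")
```

To check an actual file, pass an open file handle instead of the sample list, e.g. `find_quoted_iris(open(path, encoding="utf-8"))`, where `path` points to your local copy of the dataset. An empty result would suggest that quoted IRIs are not the cause of the warnings.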

2. Entity Checking and SameAs Retrieval

I can imagine that you face the same situation as in #137. Depending on the system that you would like to benchmark, I see several ways to go. You can disable the functionality completely. That will make your experiments faster, but you will have to be careful with your results since they rely on an old, partly outdated dataset. Depending on the version of the knowledge base your system relies on, the results might be partly wrong.

You can also disable only single parts of it, e.g., you could create a local DBpedia index and disable calls to the online DBpedia. Or you invest the time and let it run. It will take quite some time, but if you activate all the caches, it should have to do the job only once.

MichaelRoeder commented 6 months ago

Closed after a year of inactivity.