dice-group / GerbilQA-Benchmarking-Template

A template for QA systems to benchmark with GERBIL
GNU Affero General Public License v3.0

AnswerType not being set #6

Closed param-jot closed 6 years ago

param-jot commented 6 years ago

How does this work? Where does "mySystem.lastAnswer()" come from? AnswerType type = AnswerType.valueOf(mySystem.lastAnswer().getType());

TortugaAttack commented 6 years ago

Hi,

It is just an example, as I do not know what your system looks like or how it works internally. Think of it like this:

Answer answer = MySystem.retrieveAnswer(question);
String answerTypeStr = answer.getType(); 
Set<String> answers = answer.getAnswerSet();

The answerTypeStr must be the name of one of the AnswerType enum constants. See https://github.com/dice-group/GerbilQA-Benchmarking-Template/blob/e08b71e3df277a70d08f418056deca6cc3f02634/src/main/java/org/dice/qa/AnswerContainer.java#L19-L40

If your answer is a boolean, it has to be set as AnswerType.BOOLEAN; if it is a URI, it has to be AnswerType.RESOURCE, etc.

Additionally, if you do not know the type, you could check it with regex patterns etc. We do need the type though, so for now there is no way around this. Sorry for the inconvenience. As soon as I have more time I will write a method guessAnswerType(Set<String>) which guesses the answer type, but my schedule is pretty full for the next few weeks.
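A minimal sketch of mapping a system-specific type string to an enum constant, in case it helps. SketchAnswerType is a hypothetical stand-in for the template's AnswerType (only BOOLEAN and RESOURCE are named in this thread; LITERAL is an assumption), and AnswerTypeParser is an illustrative helper, not part of the template.

```java
import java.util.Locale;

// Hypothetical stand-in for the template's AnswerType enum.
// Only BOOLEAN and RESOURCE are confirmed in this thread; LITERAL is assumed.
enum SketchAnswerType { BOOLEAN, RESOURCE, LITERAL }

final class AnswerTypeParser {
    private AnswerTypeParser() {}

    /**
     * Maps a type string such as "boolean" to the matching enum constant.
     * Returns null instead of throwing when the string matches no constant.
     */
    static SketchAnswerType parse(String typeStr) {
        if (typeStr == null) {
            return null;
        }
        try {
            // Enum.valueOf is case-sensitive, so normalise the input first.
            return SketchAnswerType.valueOf(typeStr.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return null;
        }
    }
}
```

Calling valueOf directly throws IllegalArgumentException for an unknown name, which is why the sketch wraps it.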

RicardoUsbeck commented 6 years ago

Does GERBIL really need it? I tried the following file and it just worked fine in GERBIL QA. So maybe you can remove the answertype.

{
    "dataset": {
        "id": "qald-7-test-multilingual"
    },
    "questions": [{
        "id": "2",
        "question": [{
            "language": "en",
            "string": "Are there any castles in the United States?",
            "keywords": "castles, United States"
        }],
        "query": {
            "sparql": "PREFIX dct: <http://purl.org/dc/terms/> PREFIX dbc: <http://dbpedia.org/resource/Category:> ask where {?uri dct:subject dbc:Castles_in_the_United_States}"
        },
        "answers": [{
            "head": {},
            "boolean": true
        }]
    }]
}

TortugaAttack commented 6 years ago

Not for QALD itself, but to distinguish booleans from URIs/literals. Otherwise we would have to guess it, which may lead to complications; systems probably know the datatype anyway. You could also just set it to either BOOLEAN or RESOURCE (though I think it is better to set the correct AnswerType, so that GERBIL's AnswerType experiment type can work in the future).

RicardoUsbeck commented 6 years ago

AT Experiment Type should only give a result if the extended QALD format is used and not in the normal QALD format. If the answer is boolean the JSON substring looks like this:

"answers": [{
            "head": {},
            "boolean": true
        }]

and if it is a resource, it looks like this (according to https://www.w3.org/TR/sparql11-results-json/):

"answers": [{
  "head": { "vars": [ "book" , "title" ]
  } ,
  "results": { 
    "bindings": [
      {
        "book": { "type": "uri" , "value": "http://example.org/book/book6" } ,
        "title": { "type": "literal" , "value": "Harry Potter and the Half-Blood Prince" }
      } 
    ]
  }
}]

Thus you do not need the answerType to distinguish resource and boolean.

TortugaAttack commented 6 years ago

Yes, I do need to know whether the AT is a boolean or a resource/literal to build the QALD ;) Your example shows it perfectly.

Of course the AbstractSystem could guess it, but guessing is problematic. Furthermore, to set the type parameter in the bindings we have to distinguish between at least a literal and a URI.

But if you insist on it, I will make it optional, and if it is not set, the AbstractSystem will guess it.

RicardoUsbeck commented 6 years ago

Yes, you are right, I know you need it but it does not need to be in the answer JSON and thus can be left out.

However, we need a way to determine whether an answer is boolean. Maybe this simple check does the trick? if (set.size() == 1 && (set.contains("true") || set.contains("false")))
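Spelled out as a small helper, the check could look roughly like this. It is only a sketch: it assumes the answers arrive as a Set<String>, and BooleanAnswerCheck and looksBoolean are illustrative names that do not exist in the template.

```java
import java.util.Set;

final class BooleanAnswerCheck {
    private BooleanAnswerCheck() {}

    /**
     * Returns true when the set holds exactly one element and that element
     * is the string "true" or "false" (case-insensitive).
     */
    static boolean looksBoolean(Set<String> answers) {
        if (answers == null || answers.size() != 1) {
            return false;
        }
        // A Set has no get(0), so take the single element via the iterator.
        String only = answers.iterator().next();
        return "true".equalsIgnoreCase(only) || "false".equalsIgnoreCase(only);
    }
}
```

The one case this gets wrong is a literal answer whose only value happens to be the text "true" or "false".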

TortugaAttack commented 6 years ago

Exactly what I would do in the guessing.

This is probably the most accurate we can get. It is wrong only in one very specialized case, which is unlikely to occur, and even then GERBIL would probably handle it correctly.

I will implement the guessAnswerType method sometime this week ;)

RicardoUsbeck commented 6 years ago

Take your time. Another way would be to assume that there is a function buildanswer(Set set, boolean isItReallyBoolean).

Since we provide only a template, we do not need to provide a way of guessing; the user just has to know when to set isItReallyBoolean to true or false, and how to fill the set correctly in each case.

I am still pondering what the easiest way to use it would be.

So far I have seen three implementations, all of which basically commented out the boolean and answertype lines in the main class. (@rrichajalota @param-jot @berberer)
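To make the idea concrete, here is a rough sketch of such a function. Everything in it is an assumption for illustration: the class name QaldAnswerSketch, the variable name "uri" in the head, the decision to emit every non-boolean value with "type": "uri", and the hand-rolled string building (a real implementation would more likely use a JSON library).

```java
import java.util.Set;
import java.util.StringJoiner;

final class QaldAnswerSketch {
    private QaldAnswerSketch() {}

    /** Builds one entry of the "answers" array as a JSON string. */
    static String buildAnswer(Set<String> answers, boolean isItReallyBoolean) {
        if (isItReallyBoolean) {
            if (answers.size() != 1) {
                throw new IllegalArgumentException("a boolean answer needs exactly one value");
            }
            boolean value = Boolean.parseBoolean(answers.iterator().next());
            return "{\"head\":{},\"boolean\":" + value + "}";
        }
        // Assumption: every non-boolean value is treated as a URI bound to ?uri.
        StringJoiner bindings = new StringJoiner(",", "[", "]");
        for (String value : answers) {
            bindings.add("{\"uri\":{\"type\":\"uri\",\"value\":\"" + escape(value) + "\"}}");
        }
        return "{\"head\":{\"vars\":[\"uri\"]},\"results\":{\"bindings\":" + bindings + "}}";
    }

    /** Minimal escaping of backslashes and quotes; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

With this shape the caller never touches an enum, but literals would still need a third case or a second flag.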

TortugaAttack commented 6 years ago

If the answerType is not set, it will now be guessed: either a boolean, a URI, or a literal.

The system provider can still set the AnswerType explicitly for a more accurate result (e.g. if the provider knows the answer is a boolean, just set AnswerType.BOOLEAN), but it is not necessary any more.
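For readers who want a feel for what such a guess can look like, here is a sketch under stated assumptions. It is not the template's actual implementation: GuessedType and AnswerTypeGuesser are hypothetical names, and treating only http(s) strings as resources is a simplification chosen for the example.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical result type; not the template's AnswerType enum.
enum GuessedType { BOOLEAN, RESOURCE, LITERAL }

final class AnswerTypeGuesser {
    // Assumption: resources are http(s) IRIs without whitespace.
    private static final Pattern HTTP_IRI = Pattern.compile("^https?://\\S+$");

    private AnswerTypeGuesser() {}

    static GuessedType guess(Set<String> answers) {
        // A single "true"/"false" value is taken to be a boolean answer.
        if (answers.size() == 1) {
            String only = answers.iterator().next();
            if ("true".equalsIgnoreCase(only) || "false".equalsIgnoreCase(only)) {
                return GuessedType.BOOLEAN;
            }
        }
        // Only if every value looks like an IRI is the answer a resource.
        boolean allIris = !answers.isEmpty();
        for (String value : answers) {
            if (value == null || !HTTP_IRI.matcher(value).matches()) {
                allIris = false;
                break;
            }
        }
        // Anything else, including an empty set, falls back to literal.
        return allIris ? GuessedType.RESOURCE : GuessedType.LITERAL;
    }
}
```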