dice-group / gerbil

GERBIL - General Entity annotatoR Benchmark
GNU Affero General Public License v3.0

[QA] Preimplemented systems got stuck #184

Closed RicardoUsbeck closed 7 years ago

RicardoUsbeck commented 7 years ago

After starting an experiment the system got stuck for more than 20h.

http://gerbil-qa.aksw.org/gerbil/experiment?id=201703060000


TortugaAttack commented 7 years ago

Is it only OKBQA, or are other systems affected as well?

MichaelRoeder commented 7 years ago

Since the Apache HTTP client got stuck, the system does not seem to use the HTTP management implemented in GERBIL. That management would simply close the request from outside if it takes too long.

Please have a look at recent implementations of systems in GERBIL, e.g., FOX:

public class FOXAnnotator extends AbstractHttpBasedAnnotator implements ... {

    ...

    protected Document requestAnnotations(Document document) throws GerbilException {
        Document resultDoc = new DocumentImpl(document.getText(), document.getDocumentURI());
        HttpEntity entity = new StringEntity(new JSONObject().put("input", document.getText()).put("type", "text")
                .put("task", "ner").put("output", "JSON-LD").toString(), ContentType.APPLICATION_JSON);
        // request FOX
        HttpPost request = null;
        try {
            request = createPostRequest(serviceUrl);
        } catch (IllegalArgumentException e) {
            throw new GerbilException("Couldn't create HTTP request.", e, ErrorTypes.UNEXPECTED_EXCEPTION);
        }
        ...
        CloseableHttpResponse response = null;
        try {
            response = sendRequest(request);

            entity = response.getEntity();
            try {
                String content = IOUtils.toString(entity.getContent(),
                        ContentType.APPLICATION_JSON.getCharset().name());
                // parse results
                JSONObject outObj = new JSONObject(content);
                if (outObj.has("@graph")) {

                    JSONArray graph = outObj.getJSONArray("@graph");
                    for (int i = 0; i < graph.length(); i++) {
                        parseType(graph.getJSONObject(i), resultDoc);
                    }
                } else {
                    parseType(outObj, resultDoc);
                }
            } catch (Exception e) {
                LOGGER.error("Couldn't parse the response.", e);
                throw new GerbilException("Couldn't parse the response.", e, ErrorTypes.UNEXPECTED_EXCEPTION);
            }
        } finally {
            if (entity != null) {
                try {
                    EntityUtils.consume(entity);
                } catch (IOException e1) {
                }
            }
            if (response != null) {
                try {
                    response.close();
                } catch (IOException e) {
                }
            }
            closeRequest(request);
        }
        return resultDoc;
    }

    ...
}

As the example shows, the HTTP request is managed through the methods createPostRequest(serviceUrl) or createGetRequest(serviceUrl), sendRequest(request), and closeRequest(request).
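The idea of closing a request "from outside" can be illustrated with a small watchdog. This is only a sketch of the general pattern, not GERBIL's actual HTTP management, whose internals are not shown in this thread. The class `RequestWatchdogSketch` and its methods are hypothetical names, and a plain JDK `Socket` stands in for the HTTP request so that the example has no external dependencies.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: a second thread closes a resource that has been open too long.
public class RequestWatchdogSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "request-watchdog");
                t.setDaemon(true);
                return t;
            });

    /** Schedules the resource to be closed from the watchdog thread after maxMillis. */
    ScheduledFuture<?> watch(Closeable resource, long maxMillis) {
        return scheduler.schedule(() -> {
            try {
                resource.close();
            } catch (IOException ignored) {
                // nothing more to do if closing fails
            }
        }, maxMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }

    /** Returns true if a read that would block forever was ended by the watchdog. */
    static boolean demo(long maxMillis) {
        RequestWatchdogSketch watchdog = new RequestWatchdogSketch();
        // The local server socket never accepts or answers, so the client read blocks.
        try (ServerSocket silent = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", silent.getLocalPort())) {
            ScheduledFuture<?> guard = watchdog.watch(client, maxMillis);
            try {
                client.getInputStream().read(); // no socket timeout is set here
                return false;
            } catch (SocketException closedFromOutside) {
                return true; // the watchdog closed the socket while we were blocked
            } finally {
                guard.cancel(false);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            watchdog.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("closed from outside: " + demo(500));
    }
}
```

The point of the pattern is that the calling thread does not need to cooperate: even a read without any timeout is unblocked once another thread closes the underlying connection.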

MichaelRoeder commented 7 years ago

Please check all systems regarding their usage of the HTTP management (even if it was OKBQA that got stuck this time).

TortugaAttack commented 7 years ago

The problem here is that the preimplemented systems are implemented in qa.systems, which does not use GERBIL as a library and, if I understand correctly, should not. I can therefore try to add a manual timeout, but without using GERBIL I cannot use its HTTP management. :/

TortugaAttack commented 7 years ago

I have a solution: adding a socket timeout with a maximum time (the same as the maximum time of the HTTP management). This works fine; the problem with both OKBQA and SINA comes down to the socket timeouts.
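A minimal sketch of such a socket timeout, assuming nothing about the actual qa.systems code: it uses the JDK's `HttpURLConnection` as a stand-in for whatever client qa.systems uses, and `SocketTimeoutSketch`, `fetchWithTimeout`, and the timeout value are hypothetical. With Apache HttpClient 4.3 or later, the comparable setting would be `RequestConfig.custom().setSocketTimeout(...)`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: bound the time a request may wait for the remote system.
public class SocketTimeoutSketch {

    /** Sends a GET request and fails with SocketTimeoutException instead of hanging. */
    static String fetchWithTimeout(String url, int timeoutMillis) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        con.setConnectTimeout(timeoutMillis); // limit for establishing the connection
        con.setReadTimeout(timeoutMillis);    // limit for each blocking read on the socket
        try (InputStream in = con.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            con.disconnect();
        }
    }

    /** Returns true if a request to a server that never answers ends with a timeout. */
    static boolean timesOutAgainstSilentServer(int timeoutMillis) {
        // The server socket listens but never accepts or responds, imitating a stuck system.
        try (ServerSocket silent = new ServerSocket(0)) {
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            fetchWithTimeout(url, timeoutMillis);
            return false;
        } catch (SocketTimeoutException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + timesOutAgainstSilentServer(500));
    }
}
```

Unlike a watchdog that closes the request from another thread, the timeout is enforced by the socket itself, so it works without depending on GERBIL as a library.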

TortugaAttack commented 7 years ago

Done with socket timeouts (tested) in qa.systems.