Closed cristinacivili-work closed 2 years ago
Hi @cristinacivili-work,
An example request would look like this to trigger a classification on the MAIN branch:
POST http://localhost:8080/snowowl/snomed-ct/v3/classifications
{
"branch": "MAIN",
"reasonerId": "org.semanticweb.elk.elk.reasoner.factory"
}
The equivalent curl command:
curl -X POST "http://localhost:8080/snowowl/snomed-ct/v3/classifications" -H "Content-Type: application/json;charset=UTF-8" -d "{ \"branch\": \"MAIN\", \"reasonerId\": \"org.semanticweb.elk.elk.reasoner.factory\"}"
I hope this works for you. Let me know if you have any further questions.
Cheers, Mark
Hi Mark, thanks for your prompt answer!
The one above looks like my first attempt. Any idea on what could cause a failure?
Hi @cristinacivili-work,
Do you have access to the logs? That might have something that can be used to figure out what's the root cause of the issue.
The logs are located in Snow Owl's serviceability directory <SO_HOME>/serviceability/logs/log.log
.
In case of using the official docker container, use the docker logs <container-name>
command to access the logs.
Regards, Mark
Hi @cristinacivili-work,
Just tried on our latest 7.7.0-SNAPSHOT server, it works fine with those settings. Here are the relevant log lines for reference:
[2020-06-03T13:21:30.946] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa lock Lock acquired for branch 'MAIN' of repository 'snomedStore' (classifying the ontology).
[2020-06-03T13:21:30.967] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active concept IDs using revision searcher
[2020-06-03T13:21:34.250] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active concept IDs using revision searcher [@3.300 s]
[2020-06-03T13:21:34.251] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering concept IDs from stream
[2020-06-03T13:21:34.252] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering concept IDs from stream [@3.302 s]
[2020-06-03T13:21:34.262] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active concept flags (exhaustive) using revision searcher
[2020-06-03T13:21:34.269] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active concept flags (fully defined, exhaustive) using revision searcher [@3.320 s]
[2020-06-03T13:21:34.269] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active stated IS A graph edges using revision searcher
[2020-06-03T13:21:34.312] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active stated IS A graph edges using revision searcher [@3.363 s]
[2020-06-03T13:21:34.312] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active stated non-IS A relationships using revision searcher
[2020-06-03T13:21:34.377] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active stated non-IS A relationships using revision searcher [@3.428 s]
[2020-06-03T13:21:34.377] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active inferred relationships using revision searcher
[2020-06-03T13:21:44.516] INFO eventbus-1592 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":3}}
[2020-06-03T13:21:54.036] INFO eventbus-1651 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":3}}
[2020-06-03T13:22:03.392] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active inferred relationships using revision searcher [@32.44 s]
[2020-06-03T13:22:03.392] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active additional grouped relationships using revision searcher
[2020-06-03T13:22:03.410] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active additional grouped relationships using revision searcher [@32.46 s]
[2020-06-03T13:22:03.411] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering 'never grouped' type IDs using revision searcher
[2020-06-03T13:22:04.035] INFO eventbus-1656 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":18}}
[2020-06-03T13:22:05.097] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering 'never grouped' type IDs using revision searcher [@34.15 s]
[2020-06-03T13:22:05.097] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active stated OWL axioms using revision searcher
[2020-06-03T13:22:13.294] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active stated OWL axioms using revision searcher [@42.34 s]
[2020-06-03T13:22:13.294] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active concept flags (exhaustive, fully defined) using concept ID stream
[2020-06-03T13:22:13.294] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active concept flags (exhaustive, fully definied) using concept ID stream [@42.34 s]
[2020-06-03T13:22:13.295] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active stated IS A graph edges from relationship stream
[2020-06-03T13:22:13.295] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active stated IS A graph edges from relationship stream [@42.35 s]
[2020-06-03T13:22:13.295] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active stated non-IS A relationships using relationship stream
[2020-06-03T13:22:13.295] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active stated non-IS A relationships using relationship stream [@42.35 s]
[2020-06-03T13:22:13.296] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active inferred relationships using relationship stream
[2020-06-03T13:22:13.296] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active inferred relationships using relationship stream [@42.35 s]
[2020-06-03T13:22:13.296] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy >>> Registering active additional grouped relationships using relationship stream
[2020-06-03T13:22:13.296] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy <<< Registering active additional grouped relationships using relationship stream [@42.35 s]
[2020-06-03T13:22:13.513] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa lock Lock released for branch 'MAIN' of repository 'snomedStore' (classifying the ontology).
[2020-06-03T13:22:13.730] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy-inferrer >>> Classification and inferred taxonomy extraction
[2020-06-03T13:22:13.735] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.config.ConfigurationFactory Loading default configuration parameters for class org.semanticweb.elk.reasoner.config.ReasonerConfiguration
[2020-06-03T13:22:13.810] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.Reasoner ELK reasoner was created
[2020-06-03T13:22:13.812] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Loading of Axioms started
[2020-06-03T13:22:13.814] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Loading of Axioms using 12 workers
[2020-06-03T13:22:13.814] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy-inferrer --- Reasoner task: Loading [--.-%]
[2020-06-03T13:22:14.016] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy-inferrer --- Reasoner task: Loading [00.5%]
[2020-06-03T13:22:14.020] INFO eventbus-1622 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":3}}
[2020-06-03T13:22:20.504] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Loading of Axioms took 6690 ms
[2020-06-03T13:22:20.504] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Property Saturation Initialization started
[2020-06-03T13:22:20.504] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Property Saturation Initialization using 12 workers
[2020-06-03T13:22:20.504] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Property Saturation Initialization took 0 ms
[2020-06-03T13:22:20.504] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Reflexive Property Computation started
[2020-06-03T13:22:20.509] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Reflexive Property Computation using 12 workers
[2020-06-03T13:22:20.512] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Reflexive Property Computation took 8 ms
[2020-06-03T13:22:20.513] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Object Property Hierarchy and Composition Computation started
[2020-06-03T13:22:20.517] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Object Property Hierarchy and Composition Computation using 12 workers
[2020-06-03T13:22:20.529] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Object Property Hierarchy and Composition Computation took 16 ms
[2020-06-03T13:22:20.530] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Context Initialization started
[2020-06-03T13:22:20.530] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Context Initialization using 12 workers
[2020-06-03T13:22:20.531] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Context Initialization took 1 ms
[2020-06-03T13:22:20.531] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Consistency Checking started
[2020-06-03T13:22:20.533] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Consistency Checking using 12 workers
[2020-06-03T13:22:20.539] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Consistency Checking took 7 ms
[2020-06-03T13:22:20.539] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Class Taxonomy Computation started
[2020-06-03T13:22:20.545] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.AbstractReasonerStage Class Taxonomy Computation using 12 workers
[2020-06-03T13:22:23.273] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.stages.LoggingStageExecutor Class Taxonomy Computation took 2733 ms
[2020-06-03T13:22:24.020] INFO eventbus-1649 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":3}}
[2020-06-03T13:22:27.349] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa reasoner-taxonomy-inferrer <<< Classification and inferred taxonomy extraction [13.62 s]
[2020-06-03T13:22:27.400] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa org.semanticweb.elk.reasoner.Reasoner ELK reasoner has shut down
[2020-06-03T13:22:27.408] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa c.b.snowowl.snomed.reasoner.normalform.NormalFormGenerator >>> Distribution normal form generation
[2020-06-03T13:22:27.409] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa c.b.snowowl.snomed.reasoner.normalform.NormalFormGenerator --- Initializing node graphs for types: [733928003, 774081006, 738774007, 733930001]
[2020-06-03T13:22:34.058] INFO eventbus-1593 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":2}}
[2020-06-03T13:22:44.027] INFO eventbus-1592 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":3}}
[2020-06-03T13:22:54.040] INFO eventbus-1651 request {"locales":[],"expand":{},"fields":[],"limit":50,"options":{"SORT_BY":[{"ascending":false,"field":"timestamp"}]},"operation":"browse","type":"CommitInfoSearchRequest","repositoryId":"valuesetStore","metrics":{"responseTime":2}}
[2020-06-03T13:23:27.811] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa c.b.snowowl.snomed.reasoner.normalform.NormalFormGenerator --- Use node graphs for hierarchy computation
[2020-06-03T13:23:52.266] INFO eventbus-1656 request {"type":"ServerInfoGetRequest","metrics":{"responseTime":0}}
[2020-06-03T13:24:32.938] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa c.b.snowowl.snomed.reasoner.normalform.NormalFormGenerator <<< Distribution normal form generation [2.092 min]
[2020-06-03T13:24:32.947] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa index.snomedstore Updated 1 classificationtask documents with bulk update with script 'completed'
[2020-06-03T13:24:32.987] INFO Worker-1: a7b515f7-200d-4f11-b4e8-cdad161726aa index.jobs Updated 1 job documents with bulk update with script 'withDone'
Cheers, Mark
Thank you Mark, we will try with the latest version then. I have two related questions:
Thank you for clarifying these features!
Hi @mattecasu,
What is the expected behaviour regarding inferred statements? Are stated and inferred statements disjoint sets, or should we expect the set of inferred statements to contain the stated ones too?
Please refer to the official SNOMED International documentations/papers regarding inferred statement generation: https://confluence.ihtsdotools.org/display/DOCOWL/2.5.+Generating+Necessary+Normal+Form+Relationships+from+the+OWL+Refsets https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2243264/pdf/procamiasymp00002-0666.pdf
Snow Owl generates the NNF from the available stated definitions during the normal form generation phase of the classification. A stated definition of a concept can become more specific and can result in more specified ISA relationships (the concept pushed down in the hierarchy to represent its correct placement), or other relationships can appear on the concept based on the supertypes defined via the ISA relationships.
Should we trigger the classification explicitly to produce inferred statements, or is it something internally scheduled/done at write time?
Snow Owl provides APIs to generate the inferred definitions of the concepts (aka classify the SNOMED CT content) and it does not schedule this periodically or generate the inferred definitions incrementally. The reason for this is that classification heavily depends on the authoring cycle a SNOMED CT Extension Authoring team uses. Some authors classify after each new concept, some have scheduled classification times, some classify only before the release, etc.. So Snow Owl cannot make any assumptions on when it would be a good time to classify, it relies on manual execution triggered by an external source (be it a human or a machine).
I hope this answers your questions.
Cheers, Mark
Good morning Mark, unfortunately we are still experiencing issues with triggering a classification run.
When I start a new classification task on a branch (exactly as you suggested above), the classification is created and then stays in a SCHEDULED state for days. So far we haven't been able to see any of these scheduled tasks change status.
In the DEBUG logs I see a pattern repeating every five minutes where each message starts with [Classification task cleanup] that could be related, but no ERROR log is involved.
This is what I found in the logs:
`
Jul 13, 2020 @ 17:52:36.952 16:52:36.952 [Classification task cleanup] DEBUG org.elasticsearch.client.RestClient - request [POST https://<host>.amazonaws.com/snomedstore-classificationtask/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned [HTTP/1.1 200 OK]
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAddCookies - CookieSpec selected: default
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection leased: [id: http-outgoing-50][route: {s}->https://<host>.amazonaws.com:443][total kept alive: 3; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - Re-using cached 'basic' auth scheme for https://<host>.amazonaws.com:443
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240592] Connection allocated: CPoolProxy{http-outgoing-50 [ACTIVE]}
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set timeout 0
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection request: [route: {s}->https://<host>.amazonaws.com:443][total kept alive: 4; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240592] Request connection for {s}->https://<host>.amazonaws.com:443
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - No credentials for preemptive authentication
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.MainClientExec - [exchange: 240592] start execution
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.elasticsearch.client.RestClient - request [DELETE https://<host>.amazonaws.com/_search/scroll] returned [HTTP/1.1 200 OK]
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set attribute http.nio.exchange-handler
Jul 13, 2020 @ 17:52:36.947 16:52:36.947 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][rw:r][ACTIVE][rw][NOT_HANDSHAKING][0][0][0]: Event set [w]
Jul 13, 2020 @ 17:52:36.944 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][rw:r][ACTIVE][rw][NOT_HANDSHAKING][0][0][0]: Event set [w]
Jul 13, 2020 @ 17:52:36.944 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240591] Connection allocated: CPoolProxy{http-outgoing-50 [ACTIVE]}
Jul 13, 2020 @ 17:52:36.944 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection leased: [id: http-outgoing-50][route: {s}->https://<host>.amazonaws.com:443][total kept alive: 3; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.944 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set attribute http.nio.exchange-handler
Jul 13, 2020 @ 17:52:36.944 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set timeout 0
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection request: [route: {s}->https://<host>.amazonaws.com:443][total kept alive: 4; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240591] Request connection for {s}->https://<host>.amazonaws.com:443
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - No credentials for preemptive authentication
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.elasticsearch.client.RestClient - request [POST https://<host>.amazonaws.com/snomedstore-classificationtask/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&scroll=60s&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned [HTTP/1.1 200 OK]
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAddCookies - CookieSpec selected: default
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - Re-using cached 'basic' auth scheme for https://<host>.amazonaws.com:443
Jul 13, 2020 @ 17:52:36.943 16:52:36.943 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.MainClientExec - [exchange: 240591] start execution
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAddCookies - CookieSpec selected: default
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][rw:r][ACTIVE][rw][NOT_HANDSHAKING][0][0][0]: Event set [w]
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection request: [route: {s}->https://<host>.amazonaws.com:443][total kept alive: 4; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240590] Request connection for {s}->https://<host>.amazonaws.com:443
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.MainClientExec - [exchange: 240590] start execution
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - No credentials for preemptive authentication
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set timeout 0
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.client.InternalHttpAsyncClient - [exchange: 240590] Connection allocated: CPoolProxy{http-outgoing-50 [ACTIVE]}
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection leased: [id: http-outgoing-50][route: {s}->https://<host>.amazonaws.com:443][total kept alive: 3; route allocated: 4 of 10; total allocated: 4 of 30]
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionImpl - http-outgoing-50 <host>:42854<-><host>:443[ACTIVE][r:r][ACTIVE][r][NOT_HANDSHAKING][0][0][0]: Set attribute http.nio.exchange-handler
Jul 13, 2020 @ 17:52:36.939 16:52:36.939 [Classification task cleanup] DEBUG org.apache.http.client.protocol.RequestAuthCache - Re-using cached 'basic' auth scheme for https://<host>.amazonaws.com:443`
Hi @cristinacivili-work,
Could you please try to set the log level to TRACE to see if there are any additional log messages coming from the classification service?
Thanks, Mark
Hi @cmark,
I managed to run a classification job and it finished successfully. However, there is no documentation about how to actually apply the changes found after that. Looking at the code I gathered I have to update the job to SAVED
state. Did that but for the last few hours now it's been stuck in SAVING_IN_PROGRESS
.
Do you know what I can do to check if it actually do something or how long is it actually supposed to take for the saving, is this normal, seems unlikely. Also what exactly should the values of the update request be, I am having trouble figuring out the assigner
one?
Thanks, Ivelina
Hi @cmark,
After experimenting with different values I managed to transition a job to SAVED state with:
this however resulted in loosing a whole lot of concepts and relationships and they are nowhere to be found now.
Can you please describe the whole complete process of classification L
Also this appears to maybe be related to the other issue I raised about the payload limit because what I see in the logs is:
Updated 1 branch documents with bulk update with script 'commit' [2020-09-11 15:58:51.094] [pool-11-thread-2] INFO index.snomedstore - Updated 35703 relationship documents with bulk update with script 'updateRevised' [2020-09-11 15:59:03.954] [pool-11-thread-3] INFO index.snomedstore - Updated 74586 concept documents with bulk update with script 'updateRevised' [2020-09-11 15:59:09.451] [I/O dispatcher 1] ERROR index.snomedstore - Failed bulk request org.elasticsearch.ElasticsearchStatusException: Unable to parse response body
followed by a whole bunch of :
Caused by: org.elasticsearch.client.ResponseException: method [POST], host [https://vpc-es-snowowl-internal-dev-qqh5rgzgvfds7z2igbkkguecn4.eu-west-1.es.amazonaws.com], URI [/snomedstore-concept/_bulk?timeout=1m], status line [HTTP/1.1 413 Request Entity Too Large] {"Message":"Request size exceeded 10485760 bytes"}
and finally :
[2020-09-11 15:59:19.412] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO repository.snomedStore - Changes have been successfully persisted to MAIN/2018-07-31/SNOMEDCT-UK/2019-06-01/SNOMEDCT-BABYLON@1599839851037. [2020-09-11 15:59:19.438] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO lock - Lock released for branch 'MAIN/2018-07-31/SNOMEDCT-UK/2019-06-01/SNOMEDCT-BABYLON' of repository 'snomedStore' (committing changes). [2020-09-11 15:59:20.456] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO index.snomedstore - Updated 1 classificationtask documents with bulk update with script 'saved' [2020-09-11 15:59:20.512] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO lock - Lock released for branch 'MAIN/2018-07-31/SNOMEDCT-UK/2019-06-01/SNOMEDCT-BABYLON' of repository 'snomedStore' (persisting ontology changes). [2020-09-11 15:59:20.552] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO index.jobs - Updated 1 job documents with bulk update with script 'withCompletionLevel' [2020-09-11 15:59:20.576] [Worker-1: 9e70e628-d869-4fd4-a38d-8395604d7483] INFO index.jobs - Updated 1 job documents with bulk update with script 'withDone'
which all seems like maybe the save would have been a success but the payload size limit did not allow the changes to be saved and instead the existing concepts got even messed up as well, but the whole process continued and ended misleadingly "successful"
Thanks
Hi @ivelina-yordanova,
this however resulted in loosing a whole lot of concepts and relationships and they are nowhere to be found now.
I'm sad to hear you have lost data after you have performed a save through the classification endpoint.
This unfortunately originates from the same issue, Snow Owl does not yet fully support Elasticsearch clusters where the HTTP communication request size is limited to a certain amount. Snow Owl is currently limiting all save operations to 10Mb
of bulk action items (or 10_000
bulk action items, whichever threshold reached first). This 10Mb threshold however is only honored by Elasticsearch BulkProcessor and not by other bulk requests (bulk update/delete operations). Those will limit themselves to 10_000
items. I have created a PR (#690) to allow users to configure these values in the snowowl.yml
configuration file, which might help you solve the issue by reducing the values to 5_000
and 5
(MB) respectively.
Can you please describe the whole complete process of classification L running the job after done - review the changes save/apply the changes - how? where are those actually applied?
Process step-by-step is as follows:
POST /classifications
by selecting the target branch and the classifier to use.SCHEDULED
and/or RUNNING
)GET /classifications/:classificationId/equivalent-concepts
- it is recommended (but not required) to resolve those first, then reclassify and continue with the review using the new dataGET /classifications/:classificationId/relationship-changes
- inferred (aka to be added)/redundant (aka to be removed) relationships to apply on conceptsPUT /classifications/:classificationId
request with the following JSON body to save the results{
"assigner": "extension", // there are two known open source implementation, default and extension (recommended for extensions, like yours), default uses the values configured in the request only (see next two values), while extension uses a more specific algorithm to assign namespace and module values to the newly created inferred relationships
"module": "string", // this should be your module
"namespace": "string", // this should be your namespace
"status": "SAVED"
}
SAVING_IN_PROGRESS
to either SAVE_FAILED
or SAVED
.The classifier generates new inferred relationships based on the new hierarchy information generated by the selected reasoner (eg. ELK). The new relationships will alter the inferred hierarchy in a way to match the stated definition of the concepts. They will move to a more appropriate parent if needed and new inferred information will be added and so on to the affected concepts. This information then can be used in ECL and other queries for you analytics and data retrieval needs.
I hope this clarifies the implementation of the classification process.
Let me know if you have any further questions.
Cheers, Mark
Hi all,
I'm closing this due to inactivity. Feel free to reopen if you have any new information or need help.
Cheers, Mark
Hi,
I've been trying to use the classification endpoint in Snow Owl, without any luck so far. Could you please clarify what are the expected values for the reasonerId field?
I know Snow Owl is equipped with ELK; I tried to input both "org.semanticweb.elk.elk.reasoner.factory" and "ELK 0.4.3", returned from the reasoners endpoint as extendionId and name, respectively, but it didn't work. In both cases the classification was scheduled and failed later without any error message.
Thanks in advance for your help!