TheHive-Project / TheHive

TheHive: a Scalable, Open Source and Free Security Incident Response Platform
https://thehive-project.org
GNU Affero General Public License v3.0

[Bug] Problem during migration from 3.3.1 to 3.4 #1920

Open dadokkio opened 3 years ago

dadokkio commented 3 years ago

I was helping to migrate production data from TheHive 3.3.1 to 3.4.

The initial state of TheHive was as follows:

Thehive: 3.3.1-1
Elastic4Play: 1.10.0
Play: 2.6.21
Elastic4s: 5.6.6
ElasticSearch: 5.6.9

Status of the original Elasticsearch index:

>> curl 'localhost:9200/_cat/indices?v'
health   status   index        uuid   pri   rep   docs.count   docs.deleted   store.size   pri.store.size
yellow   open     the_hive_14  xxxx   5     1     49343        141            230.4mb      230.4mb

Following the migration guide, we modified application.conf, replacing search.host with the new search.uri. We tried installing TheHive 3.4 both from source and from the .deb package.

On the first run TheHive asks to update the database, but after we confirm the update it gets stuck and returns the following error:

[screenshots of the error; images not preserved]

A new empty index is created and the old one is closed:

>> curl 'localhost:9200/_cat/indices?v'
health   status   index        uuid   pri   rep   docs.count   docs.deleted   store.size   pri.store.size
         close    the_hive_14
red      open     the_hive_15  yyyy   5     1     0            0              230b          230b
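
A listing like the one above can also be checked programmatically. The following is a minimal sketch, assuming the output keeps the whitespace-separated layout shown here; the function names `parse_cat_indices` and `problem_indices` are made up for illustration and are not part of TheHive or Elasticsearch.

```python
def parse_cat_indices(text):
    """Parse `_cat/indices?v` output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        cells = line.split()
        # A closed index has no health value, so its row starts with the status.
        if cells and cells[0] in ("open", "close"):
            cells.insert(0, "")
        # Pad short rows: the closed index above reports no shard or size figures.
        cells += [""] * (len(header) - len(cells))
        rows.append(dict(zip(header, cells)))
    return rows


def problem_indices(rows):
    """Return the names of indices that are closed or whose health is red."""
    return [r["index"] for r in rows
            if r["status"] != "open" or r["health"] == "red"]


# The listing from this comment, copied as-is.
SAMPLE = """\
health   status   index        uuid   pri   rep   docs.count   docs.deleted   store.size   pri.store.size
         close    the_hive_14
red      open     the_hive_15  yyyy   5     1     0            0              230b          230b
"""

for name in problem_indices(parse_cat_indices(SAMPLE)):
    print("needs attention:", name)
```

Both indices are flagged here: the_hive_14 because it is closed, the_hive_15 because it is red.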

Returning to the TheHive homepage shows a lot of error popups.

What we tried:

What we haven't tried:

Any help or suggestions?

nadouani commented 3 years ago

The second screenshot is showing No configuration setting found for search.uri. Are you sure the config is there?

nadouani commented 3 years ago

There is also a step by step migration doc: https://github.com/TheHive-Project/TheHiveDocs/blob/master/admin/upgrade_to_thehive_3_4_and_es_6_x.md

Useful options: https://github.com/StrangeBeeCorp/Notebooks

dadokkio commented 3 years ago

Yes, the configuration is there; I have no idea why it doesn't work when set manually. We tried different conf files (including one containing just search.uri) but always got the error. We followed the steps in the guide and got as far as "Ensure everything is working.", but it's not.

Some quick questions: is it OK that the_hive_14 is in close status? If I want to retry the migration, is it OK to just remove the_hive_15 and retry from the GUI? Or do we need to bring the_hive_14 back to yellow somehow? Is the index too big? Could that be related to the timeout error?

Slightly off-topic: the repo only has version 3.5, so apt upgrade moves straight to 3.5 and skips 3.4. We had to find an alternative 3.4 deb package.

In any case thanks for the notebook! I'll give that a look.

nadouani commented 3 years ago

I don't know what the "closed" status means. I guess you need it open to be able to migrate from it.
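
For reference, reopening the closed index and removing the empty target index before a retry map onto two Elasticsearch REST calls (`POST /<index>/_open` and `DELETE /<index>`). The sketch below only builds those requests; the base URL, the helper names, and the idea that this is the right recovery sequence are assumptions, not something the migration guide states.

```python
import urllib.request

# Assumption: Elasticsearch listens where the curl calls in this thread point.
ES_URL = "http://localhost:9200"


def open_index_request(index, base_url=ES_URL):
    """Build (but do not send) the request that reopens a closed index."""
    return urllib.request.Request("%s/%s/_open" % (base_url, index), method="POST")


def delete_index_request(index, base_url=ES_URL):
    """Build (but do not send) the request that deletes an index."""
    return urllib.request.Request("%s/%s" % (base_url, index), method="DELETE")


def recovery_plan(source="the_hive_14", target="the_hive_15"):
    """Reopen the old index first, then drop the half-created new one."""
    return [open_index_request(source), delete_index_request(target)]


for req in recovery_plan():
    print(req.get_method(), req.full_url)

# Sending is left commented out because the DELETE is destructive:
# for req in recovery_plan():
#     with urllib.request.urlopen(req, timeout=30) as resp:
#         print(resp.status)
```

Keeping the requests separate from the sending step makes it easy to review what would be executed before touching production data.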

> Is the index too big? Could that be related to the timeout error?

I don't think so

For the packages:

dadokkio commented 3 years ago

OK, we reopened the index and deleted the_hive_15. After restarting TheHive with the proper configuration we see the same behavior: after clicking the "Update Database" button we get "UserMgmtCtrl: java.net.SocketTimeoutException". In this case the old index is still yellow:

>> curl 'localhost:9200/_cat/indices?v'
health   status   index        uuid   pri   rep   docs.count   docs.deleted   store.size   pri.store.size
yellow   open     the_hive_14  xxxx   5     1     49343        141            230.4mb      230.4mb
red      open     the_hive_15  yyyy   5     1

nadouani commented 3 years ago

Yellow means you have a single ES node, which is OK.

> After the "Update Database" button "UserMgmtCtrl: java.net.SocketTimeoutException".

OK, but what happened during the "Update Database" step? Any logs? What timed out? Is TheHive unable to reach ES, or is it something else?

LikaSvoykina commented 3 years ago

> Yellow means you have a single ES node, which is OK.
>
> After the "Update Database" button "UserMgmtCtrl: java.net.SocketTimeoutException".
>
> OK, but what happened during the "Update Database" step? Any logs? What timed out? Is TheHive unable to reach ES, or is it something else?

We don't have any logs. We tried adding keepalive = 1m and connectTimeout = 50000, but it doesn't work; we get the same error. The log file at /var/log/thehive/application.log was not updated.

LikaSvoykina commented 3 years ago

> Is it TheHive not able to reach ES

I think TheHive is able to reach it. How can we check?
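
One way to check reachability from the TheHive host is to request the cluster health endpoint with an explicit timeout. This is a minimal sketch, assuming Elasticsearch is served over plain HTTP at localhost:9200 without authentication; `check_es` and its `fetch` parameter are hypothetical names, and `fetch` exists only so the function can be exercised without a live cluster.

```python
import json
import socket
import urllib.error
import urllib.request


def check_es(base_url="http://localhost:9200", timeout=5.0,
             fetch=urllib.request.urlopen):
    """Return (reachable, detail) for the Elasticsearch cluster health endpoint."""
    url = base_url.rstrip("/") + "/_cluster/health"
    try:
        with fetch(url, timeout=timeout) as resp:
            health = json.loads(resp.read().decode("utf-8"))
    except socket.timeout:
        # The connection was accepted but no answer arrived in time.
        return False, "timed out after %ss" % timeout
    except urllib.error.URLError as exc:
        # Covers refused connections, DNS failures and HTTP error statuses.
        return False, "unreachable: %s" % exc.reason
    return True, "cluster status: %s" % health.get("status", "unknown")


# Usage on the TheHive host:
# print(check_es("http://localhost:9200"))
```

A quick answer with status yellow or green would suggest the network path is fine and that the timeout comes from a slow operation on the Elasticsearch side rather than from connectivity.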

dadokkio commented 3 years ago

Some updates:

We are still receiving an error on the migrate POST request:

2021-04-06 11:16:31,818 [INFO] from org.elastic4play.services.MigrationSrv in application-akka.actor.default-dispatcher-18 - Initiate database migration from version 14 (indexWithMappingTypes)
2021-04-06 11:16:31,819 [INFO] from org.elastic4play.services.MigrationSrv in application-akka.actor.default-dispatcher-18 - Migrate database from version 14, add operations for version 15
2021-04-06 11:17:02,018 [ERROR] from org.elastic4play.services.MigrationSrv in application-akka.actor.default-dispatcher-15 - Migration fail
com.sksamuel.elastic4s.http.JavaClientExceptionWrapper: java.net.SocketTimeoutException
        at com.sksamuel.elastic4s.http.ElasticsearchJavaRestClient$$anon$1.onFailure(ElasticsearchJavaRestClient.scala:65)
        at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:857)
        at org.elasticsearch.client.RestClient$1.retryIfPossible(RestClient.java:595)
        at org.elasticsearch.client.RestClient$1.failed(RestClient.java:573)
        at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:138)
        at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.failed(AbstractClientExchangeHandler.java:419)
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:375)
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92)
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39)
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.SocketTimeoutException: null
        ... 11 common frames omitted
2021-04-06 11:17:02,019 [INFO] from org.elastic4play.ErrorHandler in application-akka.actor.default-dispatcher-15 - POST https://xx.xx.xx.xx:9443/api/maintenance/migrate returned 500
com.sksamuel.elastic4s.http.JavaClientExceptionWrapper: java.net.SocketTimeoutException
        at com.sksamuel.elastic4s.http.ElasticsearchJavaRestClient$$anon$1.onFailure(ElasticsearchJavaRestClient.scala:65)
        at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:857)
        at org.elasticsearch.client.RestClient$1.retryIfPossible(RestClient.java:595)
        at org.elasticsearch.client.RestClient$1.failed(RestClient.java:573)
        at org.apache.http.concurrent.BasicFuture.failed(BasicFuture.java:138)
        at org.apache.http.impl.nio.client.AbstractClientExchangeHandler.failed(AbstractClientExchangeHandler.java:419)
        at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:375)
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92)
        at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39)
        at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213)
        at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
        at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
        at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.SocketTimeoutException: null
        ... 11 common frames omitted
2021-04-06 11:17:02,022 [WARN] from akka.http.impl.engine.http2.Http2ServerDemux in application-akka.actor.default-dispatcher-2 - handleOutgoingEnded received unexpectedly in state Closed. This indicates a bug in Akka HTTP, please report it to the issue tracker.
2021-04-06 11:17:02,382 [ERROR] from org.elastic4play.database.DBConfiguration in application-akka.actor.default-dispatcher-18 - ElasticSearch request failure: POST:/the_hive_15/_search?
StringEntity({"query":{"match":{"relations":{"query":"user"}}},"size":0},Some(application/json))
 => ElasticError(search_phase_execution_exception,all shards failed,None,None,None,List(),None)
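
The timestamps in the log above show how long the migration request waited before failing. The sketch below computes that gap; the two log lines are abbreviated copies of the ones above (the "..." replaces the logger name), and `elapsed_seconds` is a made-up helper. The result is close to 30 seconds, which would be consistent with the default 30-second socket timeout of the Elasticsearch low-level REST client, although nothing in this thread confirms that this is the limit being hit.

```python
from datetime import datetime

# Timestamp layout used by the log lines above: date, time, milliseconds.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start_line, end_line):
    """Seconds between two log lines that begin with a 23-character timestamp."""
    def stamp(line):
        return datetime.strptime(line[:23], LOG_TIME_FORMAT)
    return (stamp(end_line) - stamp(start_line)).total_seconds()


started = "2021-04-06 11:16:31,819 [INFO] ... Migrate database from version 14, add operations for version 15"
failed = "2021-04-06 11:17:02,018 [ERROR] ... Migration fail"

print(round(elapsed_seconds(started, failed), 3))  # 30.199
```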