[Bug] Reindex job continues indefinitely after updating TheHive version.

dominiksr commented 2 years ago

Request Type

Bug

Work Environment

Question	Answer
OS version (server)	Ubuntu
Virtualized Env.	True
TheHive version / git hash	4.1.16
Package Type	DEB
Database	Cassandra 3.11.11
Index type	Elasticsearch 7.16.2
Attachments storage	Local

Problem Description

After upgrading TheHive from 4.1.14 to 4.1.16, the TheHive service never becomes available: the reindex job keeps running indefinitely.

Steps to Reproduce

Upgrade TheHive from version 4.1.14 to 4.1.16.

Complementary information

I analyzed the logs. Neither Elasticsearch nor Cassandra shows any errors after the update. TheHive also does not show any errors but shows these logs endlessly:

(...)
2021-12-20 14:10:36,185 [INFO] from org.janusgraph.graphdb.olap.job.IndexRepairJob in Thread-13 [|] Found index global2
2021-12-20 14:10:36,483 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:37,483 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:38,483 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:39,483 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:40,483 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:41,484 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:42,484 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:43,484 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:44,484 [INFO] from org.thp.scalligraph.models.Database in application-akka.actor.default-dispatcher-21 [|mgmt-05cf443a] Reindex job 4cb84cb2 is running
2021-12-20 14:10:45,264 [INFO] from org.janusgraph.graphdb.olap.job.IndexRepairJob in Thread-13 [|] Found index global2
(...)
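For anyone trying to tell a slow reindex from a stuck one, the following is a minimal sketch (not part of TheHive) that measures how long each reindex job has been reporting "is running" in a log excerpt like the one above. The regular expression is derived only from the log lines quoted in this report; the function names `reindex_jobs` and `long_running` and the 30-minute default threshold are assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# Pattern derived from the log lines quoted in this issue (an assumption:
# other TheHive versions or logback layouts may format lines differently).
REINDEX_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) \[INFO\] .*"
    r"Reindex job (\w+) is running"
)


def reindex_jobs(lines):
    """Return {job_id: (first_seen, last_seen, count)} for every reindex job."""
    jobs = {}
    for line in lines:
        match = REINDEX_RE.match(line.strip())
        if match is None:
            continue  # not a "Reindex job ... is running" line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(microsecond=int(match.group(2)) * 1000)
        job_id = match.group(3)
        first_seen, _, count = jobs.get(job_id, (stamp, stamp, 0))
        jobs[job_id] = (first_seen, stamp, count + 1)
    return jobs


def long_running(jobs, threshold=timedelta(minutes=30)):
    """Return the ids of jobs that have been logging for at least `threshold`."""
    return sorted(
        job_id
        for job_id, (first_seen, last_seen, _) in jobs.items()
        if last_seen - first_seen >= threshold
    )
```

To use it, pass the lines of the TheHive application log (for example `reindex_jobs(open(path))`, where `path` is wherever your installation writes its log) and inspect the result of `long_running`. A job that shows up there is not necessarily broken, since a large database can take a long time to reindex, but it gives a concrete duration to include in a report.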

jpferrero commented 2 years ago

Hi,

This seems similar to #2283. I was able to solve the issue.

dominiksr commented 2 years ago

> Hi,
>
> This seems similar to #2283. I was able to solve the issue.

I am not sure if this is the same problem but I will check your solution. Even if your solution works, this problem should probably be treated as a bug that needs to be fixed somehow.

dominiksr commented 2 years ago

I did everything except change the parameter. While it was trying to do the reindex, I restarted Cassandra and it started working. Unfortunately, it did not survive the reboot. The same logs appeared, so I reverted to the state with version 4.1.14.

Tyrell20 commented 2 years ago

Same error on my RHEL system after upgrading TheHive to 4.1.16.

I need to restart the Cassandra service in order to have TheHive up & running but each time the system is restarted the problem occurs again.

dominiksr commented 2 years ago

> Same error on my RHEL system after upgrading TheHive to 4.1.16.
>
> I need to restart the Cassandra service in order to have TheHive up & running but each time the system is restarted the problem occurs again.

Today I found out that this is probably caused by an issue that affects large data sets. I came across this link by accident, which is a shame because I had already tried to update several times: https://blog.strangebee.com/thehive-4-1-16-is-out/

I'm holding off on updating for the moment.

ysebaf commented 2 years ago

Hi, same error on an Ubuntu Server system with a Lucene index. Upgrading from 4.1.14 to either 4.1.16 or 4.1.17 gives the same issue. After waiting more than 30 minutes for the reindex job (~70k cases, 12 vCPUs and 16 GB of RAM), I had to restore the VM to its pre-update state to get it back online.
