TheHive-Project / TheHive

TheHive: a Scalable, Open Source and Free Security Incident Response Platform
https://thehive-project.org
GNU Affero General Public License v3.0

[Bug] 'Found duplicate entities' while migrating theHive from 3.4.0-1 to 4.1.17-1 #2341

Closed packetvitality closed 2 years ago

packetvitality commented 2 years ago

Request Type

Bug

Work Environment

Question | Answer
OS version (server) | Docker https://hub.docker.com/r/thehiveproject/thehive4
Virtualized Env. | True
Dedicated RAM | 16 GB
vCPU | 8
TheHive version / git hash | 4.1.17-1
Package Type | Docker
Database | Cassandra
Index type | Lucene
Attachments storage | MinIO

Problem Description

I used the migration tool to move from theHive 3.4.0-1 to 4.1.17-1. Upon starting theHive after the migration, I noted many logs indicating duplications, for example:

[info] o.t.t.s.ImpactStatusIntegrityCheckOps [|1012ed57] Found duplicate entities:

[info] o.t.t.s.ObservableTypeIntegrityCheckOps [|1383fe4a] Found duplicate entities:

[info] o.t.t.s.ResolutionStatusIntegrityCheckOps [|4908101d] Found duplicate entities:
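
To see at a glance which integrity checks are reporting duplicates, the log lines above can be tallied per class. This is a minimal sketch, assuming each message keeps the layout shown above (level, then class name, then request id); the function name count_duplicate_checks is hypothetical and not part of theHive.

```shell
#!/bin/sh
# Tally "Found duplicate entities" messages per integrity-check class.
# Assumption: the class name is the second whitespace-separated field,
# as in the log excerpts above.
# Reads the files given as arguments, or stdin when none are given.
count_duplicate_checks() {
  awk 'index($0, "Found duplicate entities") { count[$2]++ }
       END { for (c in count) print count[c], c }' "$@" \
    | sort -rn
}

# Example usage (file and container names taken from this issue):
#   count_duplicate_checks thehive_migration_duplication_logs.txt
#   docker logs thehive 2>&1 | count_duplicate_checks
```

Each output line is a count followed by the class name, highest count first, which makes it easier to compare runs before and after a change to the migration options.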

Within the GUI, I can see the result of these duplicates in the observables. The screenshot below shows duplicate selection options when creating new observables.

image

Steps to Reproduce

1. Start a clean instance of theHive with no data and allow it to come up, create the database schemas, etc.

2. Stop theHive docker container.

3. Start the container without theHive service running by adding the following to the docker-compose file: entrypoint: sleep infinity

4. Enter the container as root: docker exec -it --workdir /root --user root thehive bash

5. Copy the logging configuration file: cp /opt/thehive/conf/logback-migration.xml /etc/thehive/

6. Start the migration tool:

/opt/thehive/bin/migrate \
  --output /etc/thehive/application.conf \
  --main-organisation [org] \
  --es-uri http://[ip]:9200 \
  --es-index the_hive \
  --es-single-type true

I referenced other issues and tried with and without the --es-single-type true option.

I read through similar issues #2331, #2333, and #2334, but I am unsure how to resolve this.

thehive_migration_duplication_logs.txt

packetvitality commented 2 years ago

I retried using the latest version of the docker image: docker pull thehiveproject/thehive4:latest

REPOSITORY                     TAG       IMAGE ID       CREATED         SIZE
thehiveproject/thehive4        latest    7183a3524059   3 days ago      722MB

I am no longer seeing the 'Found duplicate entities' logs, but I do still see duplicate observable options. Screenshot below. image

packetvitality commented 2 years ago

I was able to resolve this by adjusting my options to drop the database while running the migration tool as follows:

/opt/thehive/bin/migrate \
  --drop-database \
  --output /etc/thehive/application.conf \
  --main-organisation [org] \
  --es-uri http://[ip]:9200 \
  --es-index the_hive \
  --es-single-type true

Prior to running the command above, I also had to adjust my docker-compose file to mount the parent directory of the index folder rather than the index folder itself. This allows the migration tool to delete the index folder. I now mount ./vol/thehive/opt/thp/thehive:/opt/thp/thehive instead of ./vol/thehive/index:/opt/thp/thehive/index

Once the tool was finished, I modified the permissions on my host to ensure that all of the files created while running the tool could be accessed when running as the thehive user: chown -R 1000:1000 ./vol/thehive/opt/thp/thehive

The better approach may have been to just run the tool as the thehive user, but I am not sure whether the tool needs to be run as root.
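
To confirm that the chown covered everything, one option is to list any files under the mounted directory that are still owned by a different user. This is a rough sketch, assuming the thehive user inside the container maps to UID 1000 as in the chown command above; the function name list_foreign_files is hypothetical.

```shell
#!/bin/sh
# Print every path under a directory that is NOT owned by the expected UID.
# Assumption: UID 1000 is the thehive user, as in the chown command above.
# Empty output means every file is owned by that UID.
list_foreign_files() {
  dir=$1
  uid=${2:-1000}
  find "$dir" ! -user "$uid" -print
}

# Example usage (path taken from this issue):
#   list_foreign_files ./vol/thehive/opt/thp/thehive 1000
```

If the command prints nothing, no root-owned files were left behind by the migration run.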