[Bug] Hadoop HDFS HA not compatible with TheHive

Request Type

Bug

Work Environment

Question	Answer
OS version (server)	SUSE Linux Enterprise
OS version (client)	Windows 11, ...
Virtualized Env.	False
Dedicated RAM	32 GB
vCPU	8
TheHive version / git hash	4.1.22.1
Package Type	Binary
Database	Cassandra
Index type	Elasticsearch
Attachments storage	HDFS

Problem Description

I want to use HDFS to store attachment data. I have a cluster of 2 servers for TheHive. I configured the HDFS cluster following https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html and set the storage root to "root: hdfs://serverprod-01:8020" on one TheHive node and "root: hdfs://serverprod-02:8020" on the other.

Everything works fine, but what if the Java process of the NameNode on serverprod-01 (namenode1) crashes? TheHive on serverprod-01 will still try to access the storage at "hdfs://serverprod-01:8020", the access will fail, and the storage will no longer be available.

I then configured HDFS High Availability, following https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html or https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

With HA, the core-site.xml file no longer contains a namenode:port address. It contains a logical nameservice instead: hdfs://thehive

When we configure that nameservice in application.conf and restart TheHive...

People work with TheHive 24/7, and other integrations create Cases automatically, so the system needs High Availability for all of its data. TheHive should therefore be able to use a Hadoop nameservice, so that the High Availability of the file system is preserved.

Thanks for your time and your answers.

Steps to Reproduce

Explained in the Problem Description.

Possible Solutions

For Spark (which I think is also a Scala program) I found this solution: https://mungeol-heo.blogspot.com/2016/12/accessing-remote-ha-enabled-hdfs.html and this one: https://itecnote.com/tecnote/apache-spark-how-to-access-hdfs-by-uri-consisting-of-h-a-namenodes-in-spark-which-is-outer-hadoop-cluster/ I found nothing equivalent for TheHive.
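
As far as I understand them, the Spark workarounds above come down to giving the HDFS client the HA properties described in the Hadoop HA guides, so that the logical nameservice can be resolved to the real NameNodes. Below is a minimal sketch in plain Java (no Hadoop dependency) that only assembles those key/value pairs for the setup in this issue. The NameNode IDs nn1 and nn2 and the class and method names are hypothetical, and I am assuming the RPC port stays 8020. Whether TheHive forwards such properties to its HDFS client is exactly what I do not know.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HdfsHaClientProps {

    // Assembles the client-side properties an HDFS client needs to resolve
    // a logical nameservice. nameNodes maps a NameNode ID to its host:port.
    static Map<String, String> haClientProperties(String nameservice,
                                                  Map<String, String> nameNodes) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);
        props.put("dfs.ha.namenodes." + nameservice,
                String.join(",", nameNodes.keySet()));
        for (Map.Entry<String, String> nn : nameNodes.entrySet()) {
            props.put("dfs.namenode.rpc-address." + nameservice + "." + nn.getKey(),
                    nn.getValue());
        }
        // Standard failover proxy provider shipped with Hadoop.
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical NameNode IDs; hosts are the two servers from this issue.
        Map<String, String> nameNodes = new LinkedHashMap<>();
        nameNodes.put("nn1", "serverprod-01:8020");
        nameNodes.put("nn2", "serverprod-02:8020");
        haClientProperties("thehive", nameNodes)
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In a Hadoop client these pairs would normally come from hdfs-site.xml and core-site.xml on the classpath, or be applied one by one with Configuration.set(key, value). With them in place, a root of hdfs://thehive could in principle be used instead of a fixed namenode:port.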
