QualiMaster / qm-issues


External Service on LUH cluster #63

Open npav opened 7 years ago

npav commented 7 years ago

@antoine-tran So, as mentioned in previous e-mails/issue posts, the external service is a jar file that needs to be running at the same time as the pipelines, and acts as an intermediary between pipelines and stakeholder applications. The applications send commands to the external service, which translates them into QM-pipeline commands and communicates directly with the infrastructure to execute them. The sinks of the pipelines also connect to the external service and send results to it. The external service then chooses which of those results to forward to each stakeholder application, depending on active subscriptions, etc.
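For illustration, a minimal Java sketch of the subscription-based forwarding just described; all class and method names here are hypothetical, not the actual QM-ExternalService API:

import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical names throughout; illustrates the forwarding idea only.
interface StakeholderConnection {
    void send(String result);
}

class ResultForwarder {
    // pipeline name -> stakeholder applications subscribed to its results
    private final Map<String, Set<StakeholderConnection>> subscriptions = new ConcurrentHashMap<>();

    void subscribe(String pipeline, StakeholderConnection app) {
        subscriptions.computeIfAbsent(pipeline, p -> new CopyOnWriteArraySet<>()).add(app);
    }

    // called when a pipeline sink delivers a result to the external service;
    // forwards it only to applications with an active subscription
    void onSinkResult(String pipeline, String result) {
        for (StakeholderConnection app : subscriptions.getOrDefault(pipeline, Set.of())) {
            app.send(result);
        }
    }
}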

In order for the external service to run, it needs two "dependencies", namely the qm.infrastructure.cfg infrastructure configuration and the external-service.properties file.

Some details about both files:

All in all, I understand that your cluster is "hidden" behind a firewall, yet I am not sure it is viable to have the external service running on a machine detached from the cluster. Maybe you should discuss it with Holger a bit, as he has a clearer view of how the cluster is organized and how the interactions with the DML are done.

antoine-tran commented 7 years ago

@npav So before we get into the deeper discussion, I just want to know: is the external service also needed for the other pipelines, or just for TransferPip? Because if it causes too much hassle, we can let FocusPip (and TimeTravel, if the JVM issue is fixed) run on the LUH cluster and TransferPip run at TSI.

eichelbe commented 7 years ago

As there is no NFS and the laptop running the tunnel (as far as I understand) is not part of the cluster, we will have to duplicate the files and keep them more or less in sync, in particular for the settings of the external service. So please take the qm.infrastructure.cfg from the cluster (/home/storm/qmscripts) but do not modify it on the cluster. If required, you may of course change the local location of the external properties file. And please pass me back the external properties file.

eichelbe commented 7 years ago

I fear that all pipelines that shall be accessible to Stephan need the external service. And as far as I understood Claudia, she wants to have the LUH cluster as a backup if something serious happens to the TSI ones, some form of incident or so (you know).

antoine-tran commented 7 years ago

@npav @eichelbe: Okay then. Will the files (or the values therein) change while the pipelines are running? If not, I can write a sync script and run it once before each component starts.

eichelbe commented 7 years ago

If you have access to the cluster from your machine, then a sync script would be nice. I received the external services file and will place it on the cluster. No problem if values change, we just have to re-distribute the file, as there is no NFS ;)

eichelbe commented 7 years ago

Changed and distributed...

eichelbe commented 7 years ago

No changes at runtime as long as your components do not change this file; the infrastructure only reads it ;)

npav commented 7 years ago

What about passing the external-service.properties file to the external service? This is currently done via the DML. Since there will be no DML on Tuan's machine, should I modify the source and add another optional argument? Or, as another approach, the service could first check the location where the jar is being executed and only fall back to the DML if the file is not found there. That way, Tuan only needs to have the file in the same directory as the jar (which makes sense on his machine anyway).
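A minimal sketch of that fallback order, with loadFromDml() as a hypothetical stand-in for the existing DML-based mechanism:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

class ExternalServiceConfig {

    // Illustrative only: try the directory the jar is executed from first,
    // then fall back to the DML-based lookup.
    static Properties load() throws IOException {
        File local = new File("external-service.properties");
        if (local.exists()) {
            Properties props = new Properties();
            try (FileInputStream in = new FileInputStream(local)) {
                props.load(in);
            }
            return props;
        }
        return loadFromDml();
    }

    // hypothetical placeholder for the current DML mechanism, not a real API
    static Properties loadFromDml() {
        return new Properties();
    }
}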

antoine-tran commented 7 years ago

@npav I already did it. That is why we had the external service running :)

antoine-tran commented 7 years ago

@eichelbe Holger, could you point me to some documentation, or even better, explain very briefly where in the Adaptation layer I can find the code that dispatches commands to the pipeline component?

I have the feeling that, without modifying ClientEndpoint (or at least adding a new method to it), the external service will not be able to send commands to the cluster. The current code relies heavily on the IP address of the adaptationHost, which is public at TSI but not in our cluster.

eichelbe commented 7 years ago

We shall discuss this first. Who is using the ClientEndpoint: QM-IConf (which should be able to connect to the cluster via plink, i.e., a tunnel), or the external service running on your machine, which shall have access to the cluster?

antoine-tran commented 7 years ago

So, the external service (the DataHandler classes in QM-ExternalService) creates a ClientEndpoint instance and calls its schedule() method to dispatch the message (to the event bus, I guess).
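Roughly, that dispatch path looks like the fragment below; the ClientEndpoint constructor signature and the message variable are assumptions, not verified against the code base:

// assumed fragment, not verified QM API; InetAddress is java.net.InetAddress
InetAddress adaptationHost = InetAddress.getByName("hadoop2.kbs.uni-hannover.de");
int adaptationPort = 7012; // cf. the port discussion further down
ClientEndpoint endpoint = new ClientEndpoint(adaptationHost, adaptationPort);
endpoint.schedule(message); // hands the command over to the server side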

eichelbe commented 7 years ago

Right, this is basically ok. Can you directly access master02/10.10.0.2 from your machine?

antoine-tran commented 7 years ago

@eichelbe Not sure what you mean, but I can only access hadoop2 / hadoop3 (master02 in your terms?) via ssh. If I want to issue commands to HDFS, I use passwordless ssh and pipe the commands through.

Is this what you wanted to know, or have I understood you wrong?
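For illustration, the same idea driven from Java; user, host, and the HDFS command are only examples, assuming passwordless ssh for the storm user is in place:

import java.io.IOException;

class RemoteHdfsCommand {
    public static void main(String[] args) throws IOException, InterruptedException {
        // run an HDFS command on hadoop2 over passwordless ssh and show its output locally
        Process p = new ProcessBuilder("ssh", "storm@hadoop2", "hdfs dfs -ls /")
                .inheritIO()
                .start();
        p.waitFor();
    }
}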

eichelbe commented 7 years ago

While port 8080 goes through, it seems that the specific port we need is not accessible from zerberus either. Some firewall seems to block this request. What was your idea?

antoine-tran commented 7 years ago

I think I have figured out how it works now. The code I wanted to look at is ServerEndpoint, which creates the TCP server on the cluster and dispatches all commands (sockets) from the external service to the pipeline. Did I understand that right?

So my idea was this: I set up the external service on my machine, listening on port 80 for the consumer and redirecting to hadoop2. Now I need to specify the adaptation port (14001, as already set in the .cfg file) and the adaptation host, which is currently empty, since the external service is detached from the cluster.

I will try "hadoop2.kbs.uni-hannover.de" as the adaptation host and see how it works.

eichelbe commented 7 years ago

Right, the ServerEndpoint is handling the sockets. Is this some kind of bi-directional forwarding? Ok, that could work.

Hmm... why port 14001? master02 is listening on 7012.

antoine-tran commented 7 years ago

I just saw it in qm.infrastructure.cfg; maybe this is the setting for the TSI cluster and I need to change it to 7012 for our cluster.

eichelbe commented 7 years ago

On LUH no specific port was configured; the infrastructure is just using its built-in default ;) Might be that you took the TSI cfg file somehow...
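A minimal sketch of such a built-in default; the key name "adaptation.port" is an assumption, not necessarily the key used in qm.infrastructure.cfg:

import java.util.Properties;

class PortDefaults {
    // fall back to the built-in default (7012, the port master02 listens on)
    // when the cfg file does not set one
    static int adaptationPort(Properties cfg) {
        return Integer.parseInt(cfg.getProperty("adaptation.port", "7012"));
    }
}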

antoine-tran commented 7 years ago

You're right, that was my mistake. This could (hopefully) also explain the other bug with the missing HBase configuration value.

eichelbe commented 7 years ago

We are all not free of mistakes ;) Crossing my fingers that we get the two issues done soon.

eichelbe commented 7 years ago

@antoine-tran May I kill the TransferPip/restart the infrastructure?

antoine-tran commented 7 years ago

@npav Hi Nick, when testing the external service with the PriorityFinancialPip, I saw this line of code in PriorityDataSinkForFinancialAndTwitter (in hy-priority-data-sink-3.1-SNAPSHOT), line 35:

static {
    DataManagementConfiguration.configure(new File("/var/nfs/qm/qm.infrastructure.cfg"));
}

Could this be the cause of the error about the missing configuration file in our cluster?

npav commented 7 years ago

@antoine-tran Hi Tuan. No, the configuration in your cluster is passed to the workers differently, so even if the sink does not find this file no problem is caused (the exception is internally consumed and ignored).
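For illustration, the behavior described here would look roughly as follows; a sketch only, not the actual DataManagementConfiguration source:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

class ConfigureSketch {
    private static final Properties PROPS = new Properties();

    static void configure(File file) {
        try (FileInputStream in = new FileInputStream(file)) {
            PROPS.load(in);
        } catch (IOException e) {
            // intentionally consumed: if the file is missing (as on the
            // LUH cluster), defaults / externally passed settings apply
        }
    }
}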