radical-cybertools / radical.benchmark

Use RCT to benchmark HTC application on HPC resources
MIT License

Deploy MongoDB at ORNL via OpenShift #1

Open mturilli opened 7 years ago

mturilli commented 7 years ago

Deployed a test image (ephemeral) via OpenShift. I did the following:

Endpoint as reported by the OpenShift dashboard and by oc (the OpenShift CLI), run from the DTN I had SSHed into:

mongodb://radical:<pswd>@mongodb/htcbenchmark (as reported by openshift)
$ oc status
In project radical-benchmark on server https://openshift.ccs.ornl.gov:8443

svc/mongodb - 172.31.252.169:27017
  dc/mongodb deploys openshift/mongodb:2.6
    deployment #1 deployed 10 minutes ago - 1 pod
[Screenshot of the OpenShift dashboard, 2017-09-13, 11:00:31 AM, passwords removed]

mturilli commented 7 years ago

Added merzky1 with the admin role to the radical-benchmark project in OpenShift. This should allow Andre to access the project, use the oc command to create a port forward, and deploy new containers within that project.

mturilli commented 7 years ago

This does not work because at the moment we cannot access the container/mongodb server from Titan's headnode and compute nodes. I wrote to Jason (the person in charge of OpenShift at ORNL) asking for help.

mturilli commented 7 years ago

Installed MongoDB on a DTN node by downloading the generic Linux binaries from https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.6.12.tgz

Unfortunately, port 27017 on the DTNs is also filtered from both Titan's headnode and compute nodes.
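A quick way to confirm this kind of filtering from a login or compute node is a plain TCP probe. The following is a minimal sketch, not part of the original report: the function name `probe` and the host `dtn.example.org` are hypothetical placeholders. A filtered port usually shows up as a timeout, while a reachable host with no listener shows up as a refused connection.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connection attempt as open, refused, timeout, or error."""
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing listens on that port.
        return "refused"
    except socket.timeout:
        # No answer at all: typical of a firewall silently dropping packets.
        return "timeout"
    except OSError:
        # DNS failure, unreachable network, and similar problems.
        return "error"


if __name__ == "__main__":
    # Hypothetical host name; replace with the DTN running mongod.
    print(probe("dtn.example.org", 27017))
```

Running the same probe against a port that is known to pass the firewall gives a baseline to compare against.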

mturilli commented 7 years ago

================================================================================
 Getting Started (RP version 0.46.2)

new session: [rp.session.titan-ext7.mturilli1.017428.0000]                     \
database   : [mongodb://lgn:pswd@mongodb-radical-benchmark.apps.ccs.ornl.gov:80/htcbenchmark]
        err
Traceback (most recent call last):
  File "./00_getting_started.py", line 36, in <module>
    session = rp.Session()
  File "/autofs/nccs-svm1_home1/mturilli1/ve/test/lib/python2.7/site-packages/radical/pilot/session.py", line 264, in __init__
    % (dburl, ex))
RuntimeError: Couldn't create new session (database URL 'mongodb://radical:2r4d1c4l@mongodb-radical-benchmark.apps.ccs.ornl.gov:80/htcbenchmark' incorrect?): ids don't match -1462843969 808465440


I googled the exception and found that it is usually thrown when a process reads the answer to a request made by another process. Any ideas?

mturilli commented 7 years ago

I had a chat with @itomaldonado about how this could be caused by header rewriting and sent Jason an email explaining the issue, asking for guidance.
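One way to probe this hypothesis is to decode the second number in the error as raw bytes. The sketch below assumes that the driver compares its own request id with the `responseTo` field of the reply, that is, bytes 8 to 11 of the 16-byte MongoDB wire-protocol header (four little-endian int32 values: messageLength, requestID, responseTo, opCode). The HTTP status line is a hypothetical reply used for illustration, not one captured from the router.

```python
import struct

# Second id from the "ids don't match" error in the traceback.
REPORTED_ID = 808465440


def as_ascii(int32_value):
    """Return the four little-endian bytes of a signed 32-bit integer."""
    return struct.pack("<i", int32_value)


def parse_as_mongo_header(reply):
    """Interpret the first 16 bytes of a reply as a MongoDB message header."""
    length, request_id, response_to, opcode = struct.unpack("<iiii", reply[:16])
    return {"messageLength": length, "requestID": request_id,
            "responseTo": response_to, "opCode": opcode}


# Hypothetical reply from an HTTP router that received non-HTTP traffic.
http_reply = b"HTTP/1.1 400 Bad Request\r\n"

if __name__ == "__main__":
    print(as_ascii(REPORTED_ID))
    print(parse_as_mongo_header(http_reply)["responseTo"])
```

The reported id decodes to the ASCII bytes ` 400`, which is exactly what bytes 8 to 11 of an HTTP status line such as `HTTP/1.1 400 Bad Request` contain. This is consistent with an HTTP router on port 80 answering the MongoDB request, although it does not prove it.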

andre-merzky commented 7 years ago

Alas, I don't think I have seen that specific error before. Let's see what Jason answers. Thanks for involving him.

mturilli commented 7 years ago

Jason changed our service over to a Type: NodePort and moved the actual port assignment to a port that is allowed through the firewall from Titan to the OpenShift Dev cluster.

The MongoDB endpoint is mongodb://lgn:pswd@openshift.ccs.ornl.gov:30008/htcbenchmark
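For reference, a minimal sketch of how such an endpoint can be split into its parts and handed to RP. It assumes that RP of this era reads the database URL from the `RADICAL_PILOT_DBURL` environment variable; the helper `describe_endpoint` is hypothetical, and `lgn:pswd` are the redacted placeholders from the endpoint above.

```python
import os
from urllib.parse import urlsplit

# Endpoint as given above, with redacted credentials.
DBURL = "mongodb://lgn:pswd@openshift.ccs.ornl.gov:30008/htcbenchmark"


def describe_endpoint(dburl):
    """Split a MongoDB URL into host, port, user, and database name."""
    parts = urlsplit(dburl)
    return {
        "host": parts.hostname,
        # 27017 is MongoDB's default port when none is given.
        "port": parts.port or 27017,
        "user": parts.username,
        "database": parts.path.lstrip("/"),
    }


# Assumption: RP picks the URL up from this variable when a session is created.
os.environ["RADICAL_PILOT_DBURL"] = DBURL

if __name__ == "__main__":
    print(describe_endpoint(DBURL))
```

The parsed host and port can be checked for reachability from Titan before a session is started.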

Testing.

mturilli commented 7 years ago

Test successful:

$ ./00_getting_started.py ornl.titan_aprun

================================================================================
 Getting Started (RP version 0.47)                                              
================================================================================

new session: [rp.session.titan-ext6.mturilli1.017428.0003]                     \
database   : [mongodb://lgn:pswd@mongodb-radical-benchmark.apps.ccs.ornl.gov:30008/htcbenchmark]
        ok
read config                                                                   ok

--------------------------------------------------------------------------------
submit pilots                                                                   

create pilot manager                                                          ok
create pilot description [ornl.titan_aprun:64]                                ok
submit 1 pilot(s)
        .                                                                     ok

--------------------------------------------------------------------------------
submit units                                                                    

create unit manager                                                           ok
add 1 pilot(s)                                                                ok
create 5 unit description(s)
        .....                                                                 ok
submit 5 unit(s)
        .....                                                                 ok

--------------------------------------------------------------------------------
gather results                                                                  

wait for 5 unit(s)
        +++++                                                                 ok

--------------------------------------------------------------------------------
finalize                                                                        

closing session rp.session.titan-ext6.mturilli1.017428.0003                    \
close unit manager                                                            ok
close pilot manager                                                            \
wait for 1 pilot(s)
                                                                         timeout
                                                                              ok
+ rp.session.titan-ext6.mturilli1.017428.0003 (json)
+ pilot.0000 (profiles)
+ pilot.0000 (logfiles)
session lifetime: 135.5s                                                      ok

--------------------------------------------------------------------------------

mturilli commented 7 years ago

Based on Andre's report, we seem to be hitting a bottleneck with the mongodb deployed at ORNL. I tried to:

I wrote Jason asking whether we can: