jeff1evesque / mongodb-cluster

puppet module to provision mongodb cluster

Create 'shard.pp' to provision mongodb shard(s) #5

Open jeff1evesque opened 7 years ago

jeff1evesque commented 7 years ago

We need to create shard.pp, which will contain the following logic:

jeff1evesque commented 7 years ago

426e608: the current code generates the following logs, immediately after vagrant up:

vagrant@galileo:~$ cat /var/log/mongodb/mongod.log
2017-03-29T07:51:08.262-0400 [initandlisten] MongoDB starting : pid=2814 port=27017 dbpath=/var/lib/mongodb 64-bit host=galileo
2017-03-29T07:51:08.262-0400 [initandlisten] db version v2.6.12
2017-03-29T07:51:08.262-0400 [initandlisten] git version: d73c92b1c85703828b55c2916a5dd4ad46535f6a
2017-03-29T07:51:08.262-0400 [initandlisten] build info: Linux build5.ny.cbi.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2017-03-29T07:51:08.262-0400 [initandlisten] allocator: tcmalloc
2017-03-29T07:51:08.262-0400 [initandlisten] options: { config: "/etc/mongod.conf", net: { bindIp: "127.0.0.1" }, storage: { dbPath: "/var/lib/mongodb" }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
2017-03-29T07:51:08.412-0400 [initandlisten] journal dir=/var/lib/mongodb/journal
2017-03-29T07:51:08.412-0400 [initandlisten] recover : no journal files present, no recovery needed
2017-03-29T07:51:09.101-0400 [initandlisten] preallocateIsFaster=true 7.22
2017-03-29T07:51:12.108-0400 [initandlisten] preallocateIsFaster=true 4.64
2017-03-29T07:51:12.241-0400 [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
2017-03-29T07:51:12.241-0400 [signalProcessingThread] now exiting
2017-03-29T07:51:12.241-0400 [signalProcessingThread] dbexit:
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: going to close listening sockets...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: going to flush diaglog...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: going to close sockets...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: waiting for fs preallocator...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: lock for final commit...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: final commit...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] shutdown: closing all files...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] closeAllFiles() finished
2017-03-29T07:51:12.241-0400 [signalProcessingThread] journalCleanup...
2017-03-29T07:51:12.241-0400 [signalProcessingThread] removeJournalFiles
2017-03-29T07:51:12.245-0400 [signalProcessingThread] shutdown: removing fs lock...
2017-03-29T07:51:12.245-0400 [signalProcessingThread] dbexit: really exiting now

Yet, we have the following configurations defined:

vagrant@galileo:~$ cat /etc/mongod.conf
# mongodb.conf - generated from Puppet

#where to log
logpath=/var/log/mongodb/mongodb.log
logappend=true
# Set this option to configure the mongod or mongos process to bind to and
# listen for connections from applications on this address.
# You may concatenate a list of comma separated values to bind mongod to multiple IP addresses.
bind_ip = 0.0.0.0
port = 27019
dbpath=/data/db
# location of pidfile
pidfilepath=/var/run/mongod.pid
# Turn on/off security.  Off is currently the default
noauth=true
# Verbose logging output.
verbose = true
# Is the mongod instance a configuration server
configsvr = true
# Configure ReplicaSet membership
replSet = csrs
# Use a smaller default data file size.
smallfiles = true

jeff1evesque commented 7 years ago

Given the following yaml configuration:

mongodb_node:
    storage:
        engine: mmapv1
        dbPath:
            - '/data'
            - '/data/db'
        journal:
            enabled: true
        mmapv1:
            smallFiles: true
    systemLog:
        verbosity: 1
        destination: file
        logAppend: true
        systemLogPath: '/var/log/mongodb/mongod.log'
    net:
        port: 27019
        bindIp: 192.168.0.36
    processManagement:
        fork: true
        pidfilepath: '/var/run/mongodb/mongod.pid'
    replication:
        replSetName: 'csrs'
    sharding:
        clusterRole: 'configsvr'

Our committed code above produces the following /etc/mongod.conf:

## mongodb.conf, this file is enforced by puppet.

## where and how to store data.
storage:
  dbPath: /data/db
  journal:
    enabled: true
  engine: mmapv1

  mmapv1:
    smallFiles: true

## where to write logging data.
systemLog:
  verbosity: 1
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

## network interfaces
net:
  port: 27019
  bindIp: 192.168.0.36

## mongodb process
processManagement:
  fork: true
  pidfilepath: true

## replication
replication:
  replSetName: csrs

## sharding
sharding:
  clusterRole: configsvr

jeff1evesque commented 7 years ago

We can pass arguments to the mongo command as follows:

vagrant@galileo:~$ mongo --eval 'sh.enableSharding("svm_dataset")'
MongoDB shell version v3.4.3
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.3
2017-04-01T21:47:53.318-0400 E QUERY    [thread1] Error: not connected to a mongos :
sh._checkMongos@src/mongo/shell/utils_sh.js:8:15
sh._adminCommand@src/mongo/shell/utils_sh.js:18:9
sh.enableSharding@src/mongo/shell/utils_sh.js:98:12
@(shell eval):1:1

Note: the above command failed because we must be connected to a mongos associated with the target sharded cluster. Here the shell connected to the default 127.0.0.1:27017, which is not a mongos.
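A minimal sketch of pointing the same command at a mongos instead of the default local address. The host `mongos.example.com` and port `27017` are placeholders, not values taken from this cluster; substitute the address of a running mongos.

```shell
#!/bin/sh
# Run a sharding helper against a mongos router rather than a plain mongod.
# MONGOS_HOST and MONGOS_PORT are hypothetical placeholders; set them to a
# mongos that belongs to the target sharded cluster.
MONGOS_HOST="${MONGOS_HOST:-mongos.example.com}"
MONGOS_PORT="${MONGOS_PORT:-27017}"

enable_sharding() {
    # $1: name of the database to enable sharding on
    mongo --host "$MONGOS_HOST" --port "$MONGOS_PORT" \
        --eval "sh.enableSharding(\"$1\")"
}
```

For the database used above, this would be invoked as `enable_sharding svm_dataset`.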

jeff1evesque commented 7 years ago

We need to follow the "deploy shard cluster" instructions, to create our corresponding sharded mongodb replicaset instances.

jeff1evesque commented 7 years ago

When we try to start our mongod with our custom configuration, we get the following error:

vagrant@galileo:~$ mongod --config /etc/mongod.conf
Unrecognized option: processManagement.pidfilepath
try 'mongod --help' for more information
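MongoDB's YAML configuration keys are case sensitive, and the documented name for this option is `processManagement.pidFilePath`. Assuming the lowercase spelling is what triggers the error above, the sketch below prints a corrected copy of the file. The function name `fix_pidfile_key` is made up for illustration, and this is not necessarily how the module itself was fixed.

```shell
#!/bin/sh
# Print a copy of a YAML-style mongod.conf with the lowercase key
# 'pidfilepath' renamed to 'pidFilePath', preserving indentation.
# Assumption: the key spelling is the cause of "Unrecognized option".
fix_pidfile_key() {
    # $1: path to the configuration file (the file itself is not modified)
    sed 's/^\([[:space:]]*\)pidfilepath:/\1pidFilePath:/' "$1"
}
```

Note that the generated file shown earlier also sets the value to `true` rather than a path, which this sketch does not address.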

jeff1evesque commented 7 years ago

bd2e311: we are now able to start a mongod process:

## fails when sudoless
vagrant@galileo:~$ mongod --config /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 1926
ERROR: child process failed, exited with error number 1
## succeeds with sudo
vagrant@galileo:~$ sudo mongod --config /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 1930

child process started successfully, parent exiting
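The thread does not say why the sudoless run fails. One plausible cause, stated here only as an assumption, is that the vagrant user cannot write to the data, log, or pid directories named in the configuration. A quick way to test that hypothesis (the helper name `report_unwritable` is made up for illustration):

```shell
#!/bin/sh
# Print each given path that the current user cannot write to.
# Paths that do not exist are reported as well.
# Prints nothing when every path is writable.
report_unwritable() {
    for path in "$@"; do
        if [ ! -w "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

With the paths from the configuration above, this could be run as `report_unwritable /data/db /var/log/mongodb /var/run/mongodb`.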

jeff1evesque commented 7 years ago

We were able to connect to galileo.mongodb.com, but failed to connect to our other two config servers:

vagrant@galileo:/etc/init$ sudo mongo --host copernicus.mongodb.com --port 27017
MongoDB shell version v3.4.3
connecting to: mongodb://copernicus.mongodb.com:27017/
2017-04-03T22:08:47.446-0400 E QUERY    [thread1] Error: network error while attempting to run command 'isMaster' on host 'copernicus.mongodb.com:27017'  :
connect@src/mongo/shell/mongo.js:237:13
@(connect):1:6
exception: connect failed
vagrant@galileo:/etc/init$ sudo mongo --host kepler.mongodb.com --port 27017
MongoDB shell version v3.4.3
connecting to: mongodb://kepler.mongodb.com:27017/
2017-04-03T22:08:58.706-0400 E QUERY    [thread1] Error: network error while attempting to run command 'isMaster' on host 'kepler.mongodb.com:27017'  :
connect@src/mongo/shell/mongo.js:237:13
@(connect):1:6
exception: connect failed
vagrant@galileo:/etc/init$ sudo mongo --host galileo.mongodb.com --port 27017
MongoDB shell version v3.4.3
connecting to: mongodb://galileo.mongodb.com:27017/
MongoDB server version: 3.4.3
Server has startup warnings:
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten]
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten]
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten]
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten]
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2017-04-03T21:16:01.155-0400 I CONTROL  [initandlisten]
>

The difference between the three environments was that galileo was created with 1024MB, while the other two environments were created with 384MB.

jeff1evesque commented 7 years ago

We still need to perform the following:

jeff1evesque commented 7 years ago

We could add an additional onlyif property to the following snippet in shard.pp:

    ## add shards to the cluster
    $replset.each |String $type, $set| {
        ## the config server replica set ('csrs') is not added as a shard
        if ($type != 'csrs') {
            $set.each |String $host| {
                exec { "add-${type}-${host}":
                    command  => "mongo --host ${initiate_ip} --port ${initiate_port} --eval 'sh.addShard(\"${type}/${host}.mongodb.com:27018\");'",
                    ## every command listed here must exit 0 for the exec to run
                    onlyif   => [
                        "mongo --host ${initiate_ip} --port ${initiate_port} --quiet --eval 'quit();'",
                        "mongo --host ${initiate_ip} --port ${initiate_port} --quiet --eval 'sh.status();'",
                    ],
                    path     => '/usr/bin',
                }
            }
        }
    }

Specifically, we'll need to parse the result of the sh.status() command, via some variation of vagrant@kepler:~$ sudo mongo --host 192.168.0.36 --port 27019 --eval 'sh.status();'. This can be done via bash, by parsing the contents between two delimiters:

For example, given the following output, we can check for content between the above delimiters:

mongos> sh.status()
--- Sharding Status --- 
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("548eb941260fb6e98e17d275")
}
shards:
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }

jeff1evesque commented 7 years ago

We could test some variation of sed:

start="test1"; end="test2"; sed -n "/$start/,/$end/{/$start/b;/$end/b;p}" filename
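
As a sketch of how that sed idea might apply to the sh.status() output, assuming the intended delimiters are the `shards:` and `databases:` lines and that each registered shard appears between them with an `"_id" : "<name>"` field (the function names are made up for illustration):

```shell
#!/bin/sh
# Print the lines of sh.status() output that fall strictly between the
# 'shards:' and 'databases:' markers (the markers themselves are dropped).
# Assumption: these two markers are the intended delimiters.
shards_section() {
    sed -n '/^[[:space:]]*shards:/,/^[[:space:]]*databases:/{/shards:/d;/databases:/d;p;}'
}

# Exit 0 when the named replica set is listed as a shard, non-zero otherwise.
# $1: replica set name; sh.status() output is read from stdin.
has_shard() {
    shards_section | grep -q "\"_id\" : \"$1\""
}
```

In shard.pp, a check of this kind could back an `unless` attribute on the exec, so that sh.addShard only runs when the replica set is not yet listed. With the sample output above, `shards_section` prints nothing, which means no shards have been added yet.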