Closed by balston 2 years ago
Downloaded the MathLM 12.2 installer and the installers for Mathematica 12.2 for Linux, Mac and Windows.
The new LM software will not run on RedHat 6.x, so I'm doing a test install using my RedHat 8 test license server VM. We will have to install directly to the new UCL-wide RedHat 8 license server. Firewall rules are still being set up for the new server, so there may be a delay in getting this installed.
I've finally got the 12.2 license manager running on my RedHat 8 license server VM after a bit of fun with the new (to me) systemd boot time stuff, a change from the previous license server's SysV init scripts.
Next week I will test if the LM can serve licenses to Mathematica 12.2 clients.
OK, so I rebooted my test license server and the Mathematica LM didn't start up correctly. However, if I run the:
start-math
command as ccsplma it does start. I will need to investigate this on Monday.
The root preparation section:
cd /usr/local
mkdir Wolfram
chown ccsplma:ccsplma@ad.ucl.ac.uk Wolfram/
chmod o-rx Wolfram
cd /var/log
touch lm_mathematica.log
chown ccsplma:ccsplma@ad.ucl.ac.uk lm_mathematica.log
Open Firewall Ports:
firewall-cmd --zone=public --permanent --add-port 16286/tcp
firewall-cmd --reload
add to Puppet config
has now been done on the new RH8 license server. Next week I will install the LM ready for live testing.
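To confirm the firewall change took effect after the reload, the open-port list can be checked. This is a minimal sketch, not part of the original procedure: `port_is_open` is a hypothetical helper name, and it assumes the port was added as a single entry (a range such as 16286-16287/tcp would not match).

```shell
#!/bin/bash
# port_is_open PORTSPEC PORTLIST
# Succeeds if PORTSPEC (e.g. 16286/tcp) appears as a whole entry in PORTLIST,
# the space-separated output of `firewall-cmd --list-ports`.
port_is_open() {
    local spec="$1" list="$2" entry
    for entry in $list; do
        if [ "$entry" = "$spec" ]; then
            return 0
        fi
    done
    return 1
}

# Only query the firewall when firewall-cmd is available (run as root).
if command -v firewall-cmd >/dev/null 2>&1; then
    ports="$(firewall-cmd --zone=public --list-ports 2>/dev/null || true)"
    if port_is_open "16286/tcp" "$ports"; then
        echo "16286/tcp is open in the public zone"
    else
        echo "16286/tcp is NOT open in the public zone"
    fi
fi
```

Matching whole entries rather than substrings avoids a false positive on something like 116286/tcp.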
I'm running the MathLM installer on the RH8 license server ...
The license manager has been installed. Note: installation instructions will be in the Change Request. Now to see if it will start up and serve licenses to Mathematica clients.
LM started as ccsplma using:
cd /usr/local/Wolfram/sbin
./start-math
Appears to be running - from the log:
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "MathLM 12.2 executable launched" "/usr/local/Wolfram/MathLM-12.2/mathlm" -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Verbosity level specified" "1" -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Logging verbosity level specified" "3" -
Online help is available at
http://reference.wolfram.com/network
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Binding IPv6 socket" "Success. Socket 16287 taken." -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Binding IPv4 socket" "Success. Socket 16286 taken." -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Hostname" "lic-rhel01.ad.ucl.ac.uk" -
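Rather than reading the log by eye, the two socket-binding lines can be checked with grep. A rough sketch, assuming the LM writes to the /var/log/lm_mathematica.log file created during root preparation; `lm_sockets_bound` is a hypothetical helper name, and `LM_LOG` is a hypothetical override variable.

```shell
#!/bin/bash
# lm_sockets_bound LOGFILE
# Succeeds if LOGFILE contains a successful IPv4 and a successful IPv6
# socket binding, in the format shown in the log extract above.
lm_sockets_bound() {
    local log="$1"
    grep -q '"Binding IPv4 socket" "Success\.' "$log" &&
        grep -q '"Binding IPv6 socket" "Success\.' "$log"
}

# Assumed log location; override with LM_LOG if the LM logs elsewhere.
LOG="${LM_LOG:-/var/log/lm_mathematica.log}"
if [ -r "$LOG" ]; then
    if lm_sockets_bound "$LOG"; then
        echo "MathLM bound both sockets"
    else
        echo "No successful socket bindings found in $LOG"
    fi
fi
```

Note that the log accumulates across restarts, so a match may come from an earlier launch; checking the timestamps is still worthwhile.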
Client tests:
Note: to change to use a different license server, update the user's mathpass file in:
to:
!something.ucl.ac.uk
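A small sketch of that mathpass update as a script. The file location is not given in the note above, so it has to be passed in explicitly; `set_license_server` is a hypothetical helper name, and the sketch assumes the file should hold only the single `!hostname` line shown.

```shell
#!/bin/bash
# set_license_server MATHPASS_FILE HOSTNAME
# Backs up any existing mathpass file, then writes a single "!HOSTNAME" line.
set_license_server() {
    local file="$1" host="$2"
    if [ -z "$file" ] || [ -z "$host" ]; then
        echo "usage: set_license_server MATHPASS_FILE HOSTNAME" >&2
        return 2
    fi
    # Keep a copy of the previous contents so the change is easy to undo.
    if [ -f "$file" ]; then
        cp -p "$file" "$file.bak"
    fi
    printf '!%s\n' "$host" > "$file"
}

# Example (path is a placeholder, not the real location):
#   set_license_server /path/to/mathpass something.ucl.ac.uk
```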
All ready apart from the boot time stuff which I'll do tomorrow.
Using Rohith's sample unit file:
# /etc/systemd/system/dbora.service
[Unit]
Description=DBora Startup Script
After=network.target
Before=shutdown.target reboot.target halt.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/ksh /home/oracle/com/system_startall.com
ExecStop=/bin/ksh /home/oracle/com/system_stopall.com
User=oracle
[Install]
WantedBy=multi-user.target
I've attempted to create one for the Mathematica License Manager:
# Mathematica licence manager service unit file.
[Unit]
Description=Mathematica License Manager boot start up
After=network.target
Before=shutdown.target reboot.target halt.target
[Service]
User=ccsplma
Type=forking
ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh start
ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop
Restart=on-failure
RestartSec=90
[Install]
WantedBy=multi-user.target graphical.target
But the LM doesn't start up after a reboot. Needs further investigation ...
OK I've rebooted my test license server VM this morning and checked /var/log/messages to find the following:
Feb 10 11:10:34 localhost systemd[1]: Starting Mathematica License Manager boot start up...
Feb 10 11:10:34 localhost systemd[1]: Starting CUPS Scheduler...
Feb 10 11:10:34 localhost bash[1591]: localhost - ccsplma [10/Feb/2021:11:10:34] "MathLM 12.2 executable launched" "/usr/local/Wolfram/MathLM-12.2/mathlm" -
Feb 10 11:10:34 localhost bash[1591]: localhost - ccsplma [10/Feb/2021:11:10:34] "Logging verbosity level specified" "3" -
Feb 10 11:10:34 localhost bash[1591]: Online help is available at
Feb 10 11:10:34 localhost bash[1591]: http://reference.wolfram.com/network
Feb 10 11:10:34 localhost bash[1591]: Mathematica_LM
Feb 10 11:10:34 localhost systemd[1]: Started Mathematica License Manager boot start up.
Feb 10 11:10:34 localhost systemd[1]: mathlm.service: Succeeded.
so it looks like it has started up successfully. But running check-math says:
check-math: Mathematica licence manager is not running!
and:
systemctl status mathlm.service
● mathlm.service - Mathematica License Manager boot start up
Loaded: loaded (/etc/systemd/system/mathlm.service; enabled; vendor preset: >
Active: inactive (dead) since Wed 2021-02-10 11:10:34 GMT; 26min ago
Process: 1658 ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop >
Process: 1591 ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh star>
Main PID: 1656 (code=exited, status=0/SUCCESS)
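The journal reporting "Started" while the unit is actually dead is easy to miss, so it can help to pull the state out of the status output directly. A minimal sketch; `active_state` is a hypothetical helper name and the parsing assumes the "Active: ... since ..." layout shown above.

```shell
#!/bin/bash
# active_state
# Reads `systemctl status` output on stdin and prints the text of the
# "Active:" line up to " since", e.g. "inactive (dead)".
active_state() {
    sed -n '/^[[:space:]]*Active:/{s/^[[:space:]]*Active:[[:space:]]*//;s/[[:space:]]since.*$//;p;q;}'
}

# Only query systemd where systemctl exists.
if command -v systemctl >/dev/null 2>&1; then
    state="$(systemctl status mathlm.service --no-pager 2>/dev/null | active_state || true)"
    echo "mathlm.service state: ${state:-unknown}"
fi
```

For a plain yes/no answer in a script, `systemctl is-active --quiet mathlm.service` gives the same information through its exit status.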
I've done some reading of the RedHat docs about systemd and changed the type and simplified my unit file to:
# Mathematica licence manager service unit file.
[Unit]
Description=Mathematica License Manager boot start up
After=network.target
Before=shutdown.target reboot.target halt.target
[Service]
User=ccsplma
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh start
ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop
[Install]
WantedBy=multi-user.target graphical.target
This time after a reboot check-math returns:
check-math: Mathematica licence manager running.
and:
systemctl status mathlm.service
● mathlm.service - Mathematica License Manager boot start up
Loaded: loaded (/etc/systemd/system/mathlm.service; enabled; vendor preset: >
Active: active (exited) since Wed 2021-02-10 12:07:48 GMT; 38min ago
Process: 1576 ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh star>
Main PID: 1576 (code=exited, status=0/SUCCESS)
Tasks: 1 (limit: 11353)
Memory: 13.2M
CGroup: /system.slice/mathlm.service
└─1658 /usr/local/Wolfram/MathLM-12.2/mathlm -pwfile /usr/local/Wolf>
I now need to test that it is really able to serve licences.
Yay! Licenses are being served.
I'm now going to get HIS to install the MathLM unit file on the new license server and test if it starts correctly after a re-boot.
The new MathLM is now installed and running on the RH8 server. On Wednesday we will test that the LM will start up correctly after a server re-boot.
To aid testing I need an installation of Mathematica 12.2 on Myriad (actually any of the clusters would do) so I've updated the build script and it's now running from ccspapp:
cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install
I've had a session with HIS this afternoon to test whether rebooting the new license server starts up MathLM correctly. It all worked OK, including serving licenses to Mathematica 12.2 running on a Myriad compute node.
This also confirms that the datacentre firewall change was implemented correctly. I will now raise a CR to migrate the live LM to the new server with a suggested implementation day for Networks next Tuesday.
Change Request CR00008502 raised for implementation on Tuesday 2nd March. Currently pending ISD Change Management approval.
The CR has been approved for installation on Tuesday 2nd March. A warning email has been sent to IT Managers.
The License Manager migration has been successfully completed.
IT Managers have been emailed and the CR status has been updated to implemented.
Just waiting and watching the LM logs for 24 hours before updating CR status to successful.
I've realised that my quick install of Mathematica 12.2 on Myriad to test the new LM was not done correctly. To fix it I've had to update some extra files in build_scripts:
Running installer again:
cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install
That finished OK so running:
./mathematica-tunnel_install
No errors so I can test a parallel job but that will have to wait until tomorrow.
Parallel test job runs successfully on Myriad with 12 MPI procs = 11 remote kernels.
Will now install on Kathleen so I can do an even bigger test.
Not surprisingly given the filestore issues on Kathleen at present, uncompressing the installer and running it is taking ages ...
I ended up killing the install last night. Started again this morning and it is doing something in:
/shared/ucl/apps/Mathematica/installers/.11195/Unix/Files/Layout.M-LINUX-L
Currently a growing tar archive is being written:
ls -lh
total 436M
-rw-r--r-- 1 ccspapp ccsp 465M Mar 5 12:45 contents.tar.xz
-rw-r--r-- 1 ccspapp ccsp 552 Dec 12 22:38 info
Progress is very slow. Just now:
ls -lh
total 1.1G
-rw-r--r-- 1 ccspapp ccsp 1.2G Mar 5 17:01 contents.tar.xz
-rw-r--r-- 1 ccspapp ccsp 552 Dec 12 22:38 info
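To put a number on "very slow", the two listings can be turned into a rough write rate. A back-of-envelope sketch: `growth_rate_mb_per_min` is a hypothetical helper name, and 1.2G is taken as roughly 1229M, which is approximate because `ls -lh` rounds sizes.

```shell
#!/bin/bash
# growth_rate_mb_per_min START_MB END_MB MINUTES
# Prints the average growth in MB per minute, to one decimal place.
growth_rate_mb_per_min() {
    awk -v a="$1" -v b="$2" -v m="$3" 'BEGIN { printf "%.1f\n", (b - a) / m }'
}

# 465M at 12:45 and about 1229M at 17:01, i.e. 256 minutes apart.
growth_rate_mb_per_min 465 1229 256   # prints 3.0
```

About 3 MB per minute for a multi-gigabyte archive is consistent with the filestore problems on Kathleen.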
The install on Kathleen eventually failed again. I'm going instead to do the install on a compute node.
Attempting to install from a job on Kathleen.
First attempt failed - ran out of wallclock time - I only gave the job an hour. It had got much further than any of the attempts on the login nodes. I've re-submitted the job with 12 hours - should be plenty of time.
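For reference, a sketch of how such an install job could be generated with the wallclock limit as a parameter. It assumes the cluster scheduler is Grid Engine (the qstat output format used on these clusters suggests so) with the standard `h_rt` resource; `write_install_job` and the job name are hypothetical, and any memory or other resource requests the site needs are left out.

```shell
#!/bin/bash
# write_install_job OUTFILE HOURS
# Writes a Grid Engine job script that runs the Mathematica 12.2 build
# script with a wallclock limit of HOURS hours.
write_install_job() {
    local out="$1" hours="$2"
    cat > "$out" <<EOF
#!/bin/bash -l
# Hypothetical job script: run the Mathematica 12.2 installer on a compute node.
#\$ -N math122_install
#\$ -l h_rt=${hours}:0:0
cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install
EOF
}

# Example: generate a 12 hour job, then submit it by hand with qsub.
#   write_install_job math122_install.sh 12
#   qsub math122_install.sh
```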
Worked I think! I'll now run a test job tomorrow with 79 remote kernels and 1 control.
Test job has been submitted on Kathleen:
job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
147685  2.01395 Math_job_R ccaabaa      qw    03/15/2021 17:24:06                                   80
Test job has worked. 79 remote kernel licenses issued. No errors in job.
Mathematica 12.2 installed and tested on:
Mathematica 12.2 will soon be released and we will need to upgrade both the LM software and mathpass file at this point.