UCL-RITS / rcps-buildscripts

Scripts to automate package builds on RC Platforms
MIT License

Mathematica: Upgrade UCL License Manager for Mathematica 12.2 and Install 12.2 #395

Closed balston closed 2 years ago

balston commented 3 years ago

Mathematica 12.2 will soon be released, at which point we will need to upgrade both the LM software and the mathpass file.

balston commented 3 years ago

Downloaded the MathLM 12.2 installer and the installers for Mathematica 12.2 for Linux, Mac and Windows.

balston commented 3 years ago

The new LM software will not run on RedHat 6.x, so I'm doing a test install using my RedHat 8 test license server VM. For the live service we will have to install directly to the new UCL-wide RedHat 8 license server. Firewall rules are still being set up for the new server, so there may be a delay in getting this installed.

balston commented 3 years ago

I've finally got the 12.2 license manager running on my RedHat 8 license server VM after a bit of fun with the new (to me) systemd boot time stuff, which is a change from the previous license server's SysV init scripts.

Next week I will test whether the LM can serve licenses to Mathematica 12.2 clients.

balston commented 3 years ago

OK so I re-booted my test license server and the Mathematica LM didn't start up correctly. However, if I run the:

start-math

command as ccsplma it does start. I will need to investigate this on Monday.

balston commented 3 years ago

The root preparation section:

cd /usr/local
mkdir Wolfram
chown ccsplma:ccsplma@ad.ucl.ac.uk Wolfram/
chmod o-rx Wolfram

cd /var/log
touch lm_mathematica.log
chown ccsplma:ccsplma@ad.ucl.ac.uk lm_mathematica.log

Open Firewall Ports:


firewall-cmd --zone=public --permanent --add-port 16286/tcp
firewall-cmd --reload

add to Puppet config

The root preparation section above has now been done on the new RH8 license server. Next week I will install the LM ready for live testing.

balston commented 3 years ago

I'm running the MathLM installer on the RH8 license server ...

balston commented 3 years ago

The license manager has been installed. Note: installation instructions will be in the Change Request. Now to see if it will start up and serve licenses to Mathematica clients.

balston commented 3 years ago

LM started as ccspapp using:

cd /usr/local/Wolfram/sbin
./start-math

Appears to be running - from the log:

lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "MathLM 12.2 executable launched" "/usr/local/Wolfram/MathLM-12.2/mathlm" -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Verbosity level specified" "1" -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Logging verbosity level specified" "3" -

Online help is available at
http://reference.wolfram.com/network

lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Binding IPv6 socket" "Success.  Socket 16287 taken." -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Binding IPv4 socket" "Success.  Socket 16286 taken." -
lic-rhel01.ad.ucl.ac.uk - ccsplma@ad.ucl.ac.uk [08/Feb/2021:12:03:30] "Hostname" "lic-rhel01.ad.ucl.ac.uk" -
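
A minimal sketch of how this start-up check could be scripted rather than read by eye. It only looks for the two "Binding ... socket" success messages shown in the log above. The function name is hypothetical, and the log path in the usage comment assumes the LM writes to the file created during the root preparation step.

```shell
#!/bin/sh
# Sketch: succeed only if a MathLM log reports that both listening sockets were bound.
# mathlm_sockets_bound is a hypothetical helper; the matched strings are
# copied from the log output above.
mathlm_sockets_bound() {
    logfile=$1
    grep -q '"Binding IPv4 socket" "Success' "$logfile" &&
    grep -q '"Binding IPv6 socket" "Success' "$logfile"
}

# Example (assumed log location):
# mathlm_sockets_bound /var/log/lm_mathematica.log && echo "sockets bound"
```

The function returns non-zero if either message is missing, so it can be used directly in an `if` or with `&&`.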

balston commented 3 years ago

Client tests:

balston commented 3 years ago

Note: to change to use a different license server, update the user's mathpass file in:

to:

!something.ucl.ac.uk
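
A small sketch of that update as a shell helper. The helper name is hypothetical and the mathpass location is deliberately left as an argument, since the path is not given above. It assumes the file should contain only the single network-license entry, so it overwrites any existing contents.

```shell
#!/bin/sh
# Sketch: point a mathpass file at a license server.
# set_mathpass_server is a hypothetical helper; the caller supplies both
# the mathpass path and the server name.
set_mathpass_server() {
    mathpass_file=$1
    server=$2
    # The entry is the server name prefixed with "!", as in the note above.
    printf '!%s\n' "$server" > "$mathpass_file"
}

# Example with placeholder values:
# set_mathpass_server /path/to/mathpass something.ucl.ac.uk
```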

balston commented 3 years ago

All ready apart from the boot time stuff which I'll do tomorrow.

balston commented 3 years ago

Using Rohith's sample unit file:

# /etc/systemd/system/dbora.service
[Unit]
Description=DBora Startup Script
After=network.target
Before=shutdown.target reboot.target halt.target
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/ksh /home/oracle/com/system_startall.com
ExecStop=/bin/ksh /home/oracle/com/system_stopall.com
User=oracle
[Install]
WantedBy=multi-user.target

I've attempted to create one for the Mathematica License Manager:

#      Mathematica licence manager service unit file.

[Unit]
Description=Mathematica License Manager boot start up
After=network.target
Before=shutdown.target reboot.target halt.target

[Service]
User=ccsplma
Type=forking
ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh start
ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop
Restart=on-failure
RestartSec=90

[Install]
WantedBy=multi-user.target graphical.target

But the LM doesn't start up after a reboot. Needs further investigation ...

balston commented 3 years ago

OK I've rebooted my test license server VM this morning and checked /var/log/messages to find the following:

Feb 10 11:10:34 localhost systemd[1]: Starting Mathematica License Manager boot start up...
Feb 10 11:10:34 localhost systemd[1]: Starting CUPS Scheduler...

Feb 10 11:10:34 localhost bash[1591]: localhost - ccsplma [10/Feb/2021:11:10:34] "MathLM 12.2 executable launched" "/usr/local/Wolfram/MathLM-12.2/mathlm" -
Feb 10 11:10:34 localhost bash[1591]: localhost - ccsplma [10/Feb/2021:11:10:34] "Logging verbosity level specified" "3" -
Feb 10 11:10:34 localhost bash[1591]: Online help is available at
Feb 10 11:10:34 localhost bash[1591]: http://reference.wolfram.com/network
Feb 10 11:10:34 localhost bash[1591]: Mathematica_LM
Feb 10 11:10:34 localhost systemd[1]: Started Mathematica License Manager boot start up.
Feb 10 11:10:34 localhost systemd[1]: mathlm.service: Succeeded.

so it looks like it has started up successfully. But running check-math says:

check-math: Mathematica licence manager is not running!

and:

systemctl status mathlm.service
● mathlm.service - Mathematica License Manager boot start up
   Loaded: loaded (/etc/systemd/system/mathlm.service; enabled; vendor preset: >
   Active: inactive (dead) since Wed 2021-02-10 11:10:34 GMT; 26min ago
  Process: 1658 ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop >
  Process: 1591 ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh star>
 Main PID: 1656 (code=exited, status=0/SUCCESS)

balston commented 3 years ago

I've done some reading of the RedHat docs about systemd, changed the service Type from forking to oneshot with RemainAfterExit=yes, dropped the Restart settings and simplified my unit file to:

#      Mathematica licence manager service unit file.

[Unit]
Description=Mathematica License Manager boot start up
After=network.target
Before=shutdown.target reboot.target halt.target

[Service]
User=ccsplma
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh start
ExecStop=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh stop

[Install]
WantedBy=multi-user.target graphical.target

This time after a reboot check-math returns:

check-math: Mathematica licence manager running.

and:

systemctl status mathlm.service
● mathlm.service - Mathematica License Manager boot start up
   Loaded: loaded (/etc/systemd/system/mathlm.service; enabled; vendor preset: >
   Active: active (exited) since Wed 2021-02-10 12:07:48 GMT; 38min ago
  Process: 1576 ExecStart=/bin/bash /usr/local/Wolfram/sbin/boot-script.sh star>
 Main PID: 1576 (code=exited, status=0/SUCCESS)
    Tasks: 1 (limit: 11353)
   Memory: 13.2M
   CGroup: /system.slice/mathlm.service
           └─1658 /usr/local/Wolfram/MathLM-12.2/mathlm -pwfile /usr/local/Wolf>

I now need to test that it is really able to serve licences.
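
The difference between the two attempts shows up in the "Active:" line of the status output: "inactive (dead)" for the first unit file and "active (exited)" for this one, where RemainAfterExit=yes keeps the unit active after the start script has exited. A minimal sketch for pulling that state out of the status text, assuming the output layout shown above (the function name is hypothetical):

```shell
#!/bin/sh
# Sketch: print the state from the "Active:" line of `systemctl status`
# output read on stdin, e.g. "active (exited)" or "inactive (dead)".
# active_state is a hypothetical helper.
active_state() {
    awk '$1 == "Active:" { print $2, $3; exit }'
}

# Example:
# systemctl status mathlm.service | active_state
```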

balston commented 3 years ago

Yay! Licenses are being served.

balston commented 3 years ago

I'm now going to get HIS to install the MathLM unit file on the new license server and test if it starts correctly after a re-boot.
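
For reference, a sketch of the usual steps for installing a unit file, assuming it goes in /etc/systemd/system as on the test VM. The two helper functions are hypothetical, the commands need root, and the exact procedure HIS will follow is not recorded here. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: install and enable a systemd unit file.
# run and install_unit are hypothetical helpers; with DRY_RUN=1 the
# commands are printed rather than executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_unit() {
    unit_src=$1
    unit_name=$(basename "$unit_src")
    run cp "$unit_src" "/etc/systemd/system/$unit_name"
    run systemctl daemon-reload          # make systemd re-read unit files
    run systemctl enable "$unit_name"    # start the unit at boot
}

# Example:
# DRY_RUN=1 install_unit ./mathlm.service
```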

balston commented 3 years ago

The new MathLM is now installed and running on the RH8 server. On Wednesday we will test that the LM will start up correctly after a server re-boot.

To aid testing I need an installation of Mathematica 12.2 on Myriad (actually any of the clusters would do) so I've updated the build script and it's now running from ccspapp:

cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install

balston commented 3 years ago

I've had a session with HIS this afternoon to test whether rebooting the new license server starts up MathLM correctly. It all worked OK, including serving licenses to Mathematica 12.2 running on a Myriad compute node.

This also confirms that the datacentre firewall change was implemented correctly. I will now raise a CR to migrate the live LM to the new server with a suggested implementation day for Networks next Tuesday.

balston commented 3 years ago

Change Request CR00008502 raised for implementation on Tuesday 2nd March. Currently pending ISD Change Management approval.

balston commented 3 years ago

The CR has been approved for installation on Tuesday 2nd March. A warning email has been sent to IT Managers.

balston commented 3 years ago

The License Manager migration has been successfully completed.

IT Managers have been emailed and the CR status has been updated to implemented.

balston commented 3 years ago

Just waiting and watching the LM logs for 24 hours before updating CR status to successful.

balston commented 3 years ago

I've realised that my quick install of Mathematica 12.2 on Myriad to test the new LM was not done correctly. To fix it I've had to update some extra files in build_scripts:

Running installer again:

cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install

balston commented 3 years ago

That finished OK so running:

./mathematica-tunnel_install

balston commented 3 years ago

No errors so I can test a parallel job but that will have to wait until tomorrow.

balston commented 3 years ago

Parallel test job runs successfully on Myriad with 12 MPI procs = 11 remote kernels.

Will now install on Kathleen so I can do an even bigger test.

balston commented 3 years ago

Not surprisingly given the filestore issues on Kathleen at present, uncompressing the installer and running it is taking ages ...

balston commented 3 years ago

I ended up killing the install last night. Started again this morning and it is doing something in:

/shared/ucl/apps/Mathematica/installers/.11195/Unix/Files/Layout.M-LINUX-L

Currently a growing tar archive is being written:

ls -lh
total 436M
-rw-r--r-- 1 ccspapp ccsp 465M Mar  5 12:45 contents.tar.xz
-rw-r--r-- 1 ccspapp ccsp  552 Dec 12 22:38 info

balston commented 3 years ago

Progress is very slow. Just now:

ls -lh
total 1.1G
-rw-r--r-- 1 ccspapp ccsp 1.2G Mar  5 17:01 contents.tar.xz
-rw-r--r-- 1 ccspapp ccsp  552 Dec 12 22:38 info

balston commented 3 years ago

The install on Kathleen eventually failed again. I'm going instead to do the install on a compute node.

balston commented 3 years ago

Attempting to install from a job on Kathleen.

balston commented 3 years ago

First attempt failed - ran out of wallclock time - I only gave the job an hour. It had got much further than any of the attempts on the login nodes. I've re-submitted the job with 12 hours - should be plenty of time.
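
A rough sketch of what such an install job might look like with the wallclock limit as a parameter. The job script actually used is not shown in this issue, so the Grid Engine directives, the `bash -l` shebang, the job name and the helper function are all assumptions. Only the cd and install lines come from the commands used earlier in this thread.

```shell
#!/bin/sh
# Sketch: write a Grid Engine job script that runs the Mathematica install
# with a given wallclock limit (h_rt). make_install_job and the job name
# are hypothetical.
make_install_job() {
    wallclock=$1
    cat <<EOF
#!/bin/bash -l
#\$ -N Mathematica_install
#\$ -l h_rt=$wallclock
cd /shared/ucl/apps/build_scripts
./mathematica-12.2_install
EOF
}

# Example: a 12 hour limit, as used for the re-submitted job.
# make_install_job 12:0:0 > install_job.sh && qsub install_job.sh
```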

balston commented 3 years ago

Worked I think! I'll now run a test job tomorrow with 79 remote kernels and 1 control.

balston commented 3 years ago

Test job has been submitted on Kathleen:

job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 147685 2.01395 Math_job_R ccaabaa      qw    03/15/2021 17:24:06                                   80

balston commented 3 years ago

Test job has worked. 79 remote kernel licenses issued. No errors in job.
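
The kernel counts in both tests follow the same arithmetic: one slot runs the controlling kernel and the rest run remote kernels. A trivial sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Sketch: number of remote kernels for a job with the given slot count,
# assuming exactly one slot is used by the controlling kernel.
remote_kernels() {
    slots=$1
    echo $(( slots - 1 ))
}

# remote_kernels 12   # Myriad test: 12 MPI procs
# remote_kernels 80   # Kathleen test: 80 slots
```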

Mathematica 12.2 installed and tested on: