Is your feature request related to a problem? Please describe.
Since the migration to Alma9, we have been seeing occasional vm_kill events and condor_schedd restarts. We discussed these with the SI team, and Marco M. suggested increasing the production WMAgent HTCondor spool area, which is currently sized at 8GB.
Describe the solution you'd like
Follow up with the VoC and gradually increase the /mnt/ramdisk partition area from 8GB to 12GB. Nodes that are not in use can be modified right away, while those that are active will have to wait until we can stop services.
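To help decide which nodes still need the change, a small check like the one below could be run on each agent. This is only a sketch: the helper names `spool_usage` and `needs_resize` are hypothetical, and it assumes the spool area is mounted at /mnt/ramdisk as described above. It only reports the current size and usage; it does not resize anything.

```python
import shutil

GIB = 1024 ** 3


def spool_usage(path="/mnt/ramdisk"):
    """Return (total GiB, used GiB, used fraction) for the given mount point."""
    usage = shutil.disk_usage(path)
    fraction = usage.used / usage.total if usage.total else 0.0
    return usage.total / GIB, usage.used / GIB, fraction


def needs_resize(total_gib, target_gib=12.0):
    """True if the partition is still smaller than the target size."""
    return total_gib < target_gib


if __name__ == "__main__":
    total, used, fraction = spool_usage()
    print(f"/mnt/ramdisk: {used:.1f} of {total:.1f} GiB used ({fraction:.0%})")
    if needs_resize(total):
        print("Partition is below the 12 GiB target; schedule a resize.")
```

Note that a tmpfs mount may report a total slightly different from its nominal size, so the 12 GiB threshold here is approximate.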
Describe alternatives you've considered
None
Additional context
The most recent condor_schedd restart and vm_kill occurred on Oct/22/2024, on vocms0282.
Impact of the new feature
WMAgent