DIRACGrid / DIRAC

DIRAC Grid
http://diracgrid.org
GNU General Public License v3.0

Badly handled error in SiteDirector #7338

Open chrisburr opened 10 months ago

chrisburr commented 10 months ago

One of the LHCb SiteDirectors had a separate issue which then resulted in a crash in DIRAC:

Traceback (most recent call last):
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/Core/Base/AgentModule.py", line 310, in am_secureCall
    result = functor(*args)
             ^^^^^^^^^^^^^^
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/WorkloadManagementSystem/Agent/SiteDirector.py", line 331, in execute
    result = self.submitPilots()
             ^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/WorkloadManagementSystem/Agent/SiteDirector.py", line 457, in submitPilots
    res = self._submitPilotsToQueue(pilotsToSubmit, ce, queueName)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/WorkloadManagementSystem/Agent/SiteDirector.py", line 748, in _submitPilotsToQueue
    submitResult = ce.submitJob(executable, "", pilotsToSubmit)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/Resources/Computing/SSHComputingElement.py", line 550, in submitJob
    result = self._submitJobToHost(submitFile, numberOfJobs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/dirac/versions/v11.0.24-1700482381/Linux-x86_64/lib/python3.11/site-packages/DIRAC/Resources/Computing/SSHComputingElement.py", line 597, in _submitJobToHost
    if result["Status"] != 0:
       ~~~~~~^^^^^^^^^^
TypeError: string indices must be integers, not 'str'

where result was not a dict but the following string, a traceback from the remote execute_batch script:

Traceback (most recent call last):
  File "/home/dirac/execute_batch", line 354, in <module>
    result = getattr(batch, method)(**inputDict)
  File "/home/dirac/execute_batch", line 102, in submitJob
    jdlFile.write(
  File "/usr/lib64/python3.9/tempfile.py", line 478, in func_wrapper
    return func(*args, **kwargs)
TypeError: a bytes-like object is required, not 'str'
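
The remote error is consistent with a str being written to a temporary file opened in binary mode, which is the default ("w+b") for tempfile.NamedTemporaryFile. The following is a minimal sketch of that failure and one possible fix, not the actual execute_batch code; the function names and the ".jdl" suffix are hypothetical.

```python
import tempfile


def write_str_in_binary_mode(text):
    """Reproduce the failure: NamedTemporaryFile defaults to mode "w+b"."""
    with tempfile.NamedTemporaryFile() as tmp_file:
        try:
            tmp_file.write(text)  # str written to a binary file object
        except TypeError as exc:
            return str(exc)
    return None


def write_jdl_text(text):
    """Possible fix: open the temporary file in text mode and read it back."""
    # suffix=".jdl" is only illustrative (assumption, not from the traceback)
    with tempfile.NamedTemporaryFile(mode="w+", suffix=".jdl") as jdl_file:
        jdl_file.write(text)
        jdl_file.flush()
        jdl_file.seek(0)
        return jdl_file.read()
```

Encoding the string before writing (text.encode()) would be an alternative if the file has to stay in binary mode.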

chrisburr commented 10 months ago

Apart from the bug addressed in https://github.com/DIRACGrid/DIRAC/pull/7340, there is a second one: result might not be a dict when the check result["Status"] != 0 is evaluated.
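
One way to avoid the crash would be to validate the type of result before indexing it. The following is a minimal sketch under that assumption and not the actual SSHComputingElement code; the helper name validate_batch_result and the (ok, message) return convention are hypothetical.

```python
def validate_batch_result(result):
    """Return (ok, message) for a value expected to be a dict with a "Status" key."""
    if not isinstance(result, dict):
        # For example, the raw traceback text printed by the remote script
        preview = str(result)[:200]
        return False, f"Unexpected result type {type(result).__name__}: {preview}"
    if "Status" not in result:
        return False, "Result has no 'Status' key"
    if result["Status"] != 0:
        return False, f"Submission failed with status {result['Status']}"
    return True, ""
```

With such a check, a string result produces an error message that can be logged and returned to the caller instead of raising a TypeError in the agent cycle.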