stfc / ral-htcondor-tools

Scripts and stuff used with HTCondor at RAL

Direct CMS AAA proxy addresses via local proxy #4

Closed jrha closed 5 years ago

jrha commented 5 years ago

This should ensure that WN requests can never access Echo data via AAA. The hope is that AAA will then be overloaded with requests less often, and so be more stable.

The underlying problem is:

  1. RAL WNs running CMS jobs either cannot access a file in Echo, or the connection is dropped while the job is running.
  2. The WN asks the AAA service if the file is available.
  3. The UK AAA-redirector confirms that RAL has the file.
  4. The file is then accessed via AAA.
  5. The AAA service becomes overloaded and starts failing SAM tests, affecting our availability in the eyes of CMS. When AAA is very busy, jobs running on the WNs are also likely to fail, which is inevitable under that load.
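
The loop above can be illustrated with a toy model. This is only a sketch of the intended effect, not of the mechanism in this PR: `Site`, `redirector_lookup`, `fallback_read` and the `exclude_local` flag are all hypothetical names made up for illustration, and the real redirector does not work this simply.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical stand-in for a site and its AAA gateway."""
    name: str
    files: set
    aaa_requests: int = 0  # number of reads served through this site's AAA


def redirector_lookup(sites, lfn):
    """Return the first site that reports holding the file, or None."""
    for site in sites:
        if lfn in site.files:
            return site
    return None


def fallback_read(wn_site, sites, lfn, exclude_local=False):
    """Model steps 2-4: a WN at wn_site falls back to AAA after a local failure.

    With exclude_local=True the WN's own site is never offered as a source,
    which is the outcome this change is aiming for (assumption: the real
    change achieves this by other means).
    """
    candidates = [s for s in sites if not (exclude_local and s is wn_site)]
    target = redirector_lookup(candidates, lfn)
    if target is None:
        return None
    target.aaa_requests += 1  # the read lands on the chosen site's AAA
    return target.name
```

With two sites that both hold a file, calling `fallback_read` from a RAL WN without `exclude_local` sends the read straight back to RAL's AAA and increments its load counter. With `exclude_local=True` the read goes to the other site, and RAL's AAA sees no extra load.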

Also see RT#229757.