Open StevenCTimm opened 4 hours ago
So there's a problem communicating with the new OSG factory. I'll see if anyone's watching the OSG Slack over the weekend, and open a ticket.
The condor_status -any output from gfactory-1.osg-htc.org shows that the host is up but the factory on it is not.
OSG ticket 77867 is filed; see the URL above.
Job IDs and error messages were requested, if available.
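For collecting the error messages, here is a minimal sketch that pulls ERROR entries out of frontend log lines. It assumes every entry follows the `[timestamp] LEVEL: module:lineno: message` shape seen in the dunegpfrontend01 excerpt in this thread; that shape is inferred from one log line, not from the GlideinWMS documentation, and `LogEntry` and `iter_errors` are hypothetical names.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Assumed entry shape, inferred from a single log line in this thread:
#   [YYYY-MM-DD HH:MM:SS,mmm] LEVEL: module:lineno: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+):\s+"
    r"(?P<module>\w+):(?P<lineno>\d+):\s+(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    """One parsed log entry (hypothetical helper type)."""
    timestamp: str
    level: str
    module: str
    lineno: int
    message: str


def iter_errors(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield only the entries whose level is ERROR.

    Lines that do not match the assumed shape (for example the
    continuation lines of a traceback) are skipped.
    """
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match["level"] == "ERROR":
            yield LogEntry(
                match["timestamp"],
                match["level"],
                match["module"],
                int(match["lineno"]),
                match["message"],
            )
```

Passing an open log file to `iter_errors` would give one `LogEntry` per ERROR line. Traceback continuation lines are dropped in this sketch, so the full traceback still has to be copied from the log by hand.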
This is from dunegpfrontend01:

[2024-10-13 15:15:08,524] ERROR: glideinFrontendElement:1886: Failed to talk to factory_pool for global info: Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/glideinwms/lib/condorMonitor.py", line 695, in fetch_using_bindings
    results = collector.query(adtype, constraint, attrs)
  File "/usr/lib64/python3.9/site-packages/htcondor/_lock.py", line 70, in wrapper
    rv = func(*args, **kwargs)
htcondor.HTCondorIOError: Failed communication with collector.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/glideinwms/frontend/glideinFrontendElement.py", line 1877, in query_globals
    factory_globals_dict = glideinFrontendInterface.findGlobals(
  File "/usr/lib/python3.9/site-packages/glideinwms/frontend/glideinFrontendInterface.py", line 169, in findGlobals
    status.load(status_constraint)
  File "/usr/lib/python3.9/site-packages/glideinwms/lib/condorMonitor.py", line 576, in load
    self.stored_data = self.fetch(constraint, format_list)
  File "/usr/lib/python3.9/site-packages/glideinwms/lib/condorMonitor.py", line 675, in fetch
    return CondorQuery.fetch(self, constraint=constraint, format_list=format_list)
  File "/usr/lib/python3.9/site-packages/glideinwms/lib/condorMonitor.py", line 506, in fetch
    raise QueryError(err_str) from ex