arXiv / arxiv-browse

Flask app for article abstract and listing pages
MIT License

[ARXIVCE-2462] When the tex compilation request is stuck for 30 minutes, send email to the support team. NOTE - the systemd service is already running on the sync node. #684

Closed ntai-arxiv closed 3 months ago

ntai-arxiv commented 3 months ago

I still think it would be better to know which web node it fails on rather than just a generic message saying it failed. Although Charles wanted to discuss whether this should be going to us first. Let's do that before merging.

Each ping attempt (a PDF request) goes to a different web node (web5 to web9), so by the time 30 minutes have passed, every web node has been tried and has failed.

https://console.cloud.google.com/logs/query;query=labels.log_type%3D%22arxiv_sync2gcp_log%22%0ASEARCH%2528%22nack%20message%22%2529;summaryFields=:false:32:beginning;cursorTimestamp=2024-08-05T14:02:10.120124999Z;duration=PT1H?referrer=search&project=arxiv-production

Each log entry has a "count" field that ranges from 0 to 4, which corresponds to web5 through web9.
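For anyone reading those logs, here is a minimal sketch of turning the "count" value back into a node name. It assumes the mapping is in order (0 is web5, 4 is web9), which is what the range above suggests. `WEB_NODES` and `node_for_count` are hypothetical names for illustration, not code from this PR.

```python
# Hypothetical helper: translate the "count" field of a log entry into a web node name.
# Assumption: count 0..4 maps in order to web5..web9.
WEB_NODES = ["web5", "web6", "web7", "web8", "web9"]


def node_for_count(count: int) -> str:
    """Return the web node name for a given "count" value (0-4)."""
    if not 0 <= count < len(WEB_NODES):
        raise ValueError(f"count must be between 0 and {len(WEB_NODES) - 1}, got {count}")
    return WEB_NODES[count]


if __name__ == "__main__":
    # Print the full mapping under the ordering assumption above.
    for count in range(len(WEB_NODES)):
        print(count, node_for_count(count))
```

If the notification email were to include the node name, something like this could be applied to the "count" of the failing entry, provided the ordering assumption holds.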