arXiv / arxiv-browse

Flask app for article abstract and listing pages
MIT License

[ARXIVCE-1756] Grand unified sync-to-gcp #701

Closed ntai-arxiv closed 2 months ago

ntai-arxiv commented 2 months ago

This is intended to replace two services: one syncs the files, and the other requests the PDF from the webnode.


I need to deploy this and then monitor the logs to confirm that the existing sync cron jobs have nothing left to do.

ntai-arxiv commented 2 months ago

The 3 scripts under sync_prod_to_gcp are a little confusing. There is a systemd config just for submissions_to_gcp, so it seems that this is the only one in use now. Could the sync_published ones be moved to their own dir? Could all the other scripts be moved to their own dirs?

submissions_to_gcp.py uses sync_published_to_gcp.py's ensure_FOO. And, for now, the cron job is still using sync_published_to_gcp.py/sync_published.sh. If anything, webnode_pdf_request.py is definitely obsolete.

I will think about how to consolidate the 3 services. However, I'd like to keep the files as they are for now, since the deployment relies on these Python and shell scripts.

Once the submissions_to_gcp service works without a hitch, we can start the consolidation, beginning with stopping the 6 cron jobs. Until then, I would rather not stir the pot too much.