opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Sync eQTL Catalogue SuSie outputs to GCP #3213

Closed d0choa closed 4 months ago

d0choa commented 5 months ago

@addramir and @kauralasoo confirmed the eQTL SuSie outputs are ready for ingestion.

This is the first of a series of actions required for the ingestion, harmonisation and processing of eQTL SuSie outputs in gentropy.

The action here is to set up a datamover job that synchronizes all the contents of http://ftp.ebi.ac.uk/pub/databases/spot/eQTL/ into a Google Cloud location. This setup mimics the GWAS Catalog synchronization, which was implemented manually as follows:

1- Log in to the node:

ssh codon-slurm-login

2- Get the script. Currently in this PR: https://github.com/mbdebian/gwas-summary-stats/pull/5

3- Launch script:

sbatch simple_data_mover.sh

4- Monitor progress:

watch "tail /nfs/production/opentargets/lsf/logs/ot_gwascat_gcp_rsync-47150371.err"
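
Steps 3 and 4 can be tied together so the log path does not have to be typed by hand. The following is only a sketch: the log directory and job name are copied from the example in step 4, the `<job-name>-<job-id>.err` pattern is inferred from that single file name, and `err_log_path` is a hypothetical helper.

```shell
# Log location and job name as they appear in step 4 above.
LOG_DIR="/nfs/production/opentargets/lsf/logs"
JOB_NAME="ot_gwascat_gcp_rsync"

# Hypothetical helper: build the stderr log path for a given Slurm job ID.
# The <job-name>-<job-id>.err pattern is an assumption inferred from step 4.
err_log_path() {
    echo "${LOG_DIR}/${JOB_NAME}-$1.err"
}

# On the cluster (requires Slurm) the two steps could be combined like this;
# `sbatch --parsable` prints the job ID (followed by ";cluster" on
# multi-cluster setups) instead of the usual submission message:
#   job_id="$(sbatch --parsable simple_data_mover.sh)"
#   watch "tail $(err_log_path "$job_id")"

# Reproduce the path used in step 4.
err_log_path 47150371
```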

It would be ideal to set up both processes as cron jobs within Slurm so that they require minimal intervention from the team.
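
A data mover job for the eQTL Catalogue sync could look roughly like the sketch below. It is not the script from the PR referenced above. The partition name, resource requests, local source path and destination bucket are all assumptions made for illustration, and `DRY_RUN` defaults to 1 so that nothing is transferred until it is explicitly disabled.

```shell
#!/bin/bash
# Sketch only; every #SBATCH value below is an assumption, not a known setting.
#SBATCH --job-name=ot_eqtl_gcp_rsync
#SBATCH --partition=datamover
#SBATCH --time=12:00:00
#SBATCH --mem=4G
#SBATCH --error=%x-%j.err

set -euo pipefail

# Hypothetical defaults: a local mount of the FTP area and a placeholder bucket.
SRC="${SRC:-/path/to/ftp/pub/databases/spot/eQTL}"
DEST="${DEST:-gs://example-bucket/eQTL}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = run it

# Recursively sync a source directory to a bucket location with gsutil.
sync_to_gcp() {
    local src="$1" dest="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] gsutil -m rsync -r ${src} ${dest}"
    else
        gsutil -m rsync -r "${src}" "${dest}"
    fi
}

sync_to_gcp "$SRC" "$DEST"
```

Submitting it with `DRY_RUN=0 sbatch <script>` would perform the actual transfer, since sbatch exports the submission environment to the job by default.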

d0choa commented 4 months ago

As for the GWAS Catalog, I provisionally added it as a cron job to my personal account:

#SCRON -t 1
#SCRON --mem=1
#SCRON -J gwas_catalog_rsync_cron
# min hour day-of-month month day-of-week command
30 7 * * 1 sbatch /homes/ochoa/gwas-summary-stats/simple_data_mover.sh

Currently testing whether it works. Scheduled for Mondays at 7:30 (day-of-week field 1). Since the eQTL Catalogue release process is not incremental at the moment, we will keep that sync manual for now.
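
To cross-check the schedule, the day-of-week field of the cron line above can be decoded with a small helper. This is a sketch with a hypothetical function name, `cron_weekday`, and it only handles a single numeric value in that field. In cron syntax, which Slurm's scrontab follows, both 0 and 7 denote Sunday.

```shell
# Hypothetical helper: name the weekday encoded in a five-field cron line.
cron_weekday() {
    local names=(Sunday Monday Tuesday Wednesday Thursday Friday Saturday)
    local min hour dom mon dow rest
    # Split the line into the five schedule fields plus the command.
    read -r min hour dom mon dow rest <<< "$1"
    case "$dow" in
        [0-7]) echo "${names[$((dow % 7))]}" ;;   # 0 and 7 both map to Sunday
        *)     echo "unhandled: $dow" ;;          # ranges, lists, names, '*'
    esac
}

cron_weekday '30 7 * * 1 sbatch /homes/ochoa/gwas-summary-stats/simple_data_mover.sh'
# prints "Monday"
```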