clingen-data-model / clinvar-ingest


Run all prior v2 VCV XML files and copy existing v2 RCV XML entries into the `clinvar_ingest.processing_history` table so they can be paired #250

Open theferrit32 opened 2 weeks ago

theferrit32 commented 2 weeks ago

https://github.com/clingen-data-model/clinvar-ingest/issues/248#issuecomment-2486579391

theferrit32 commented 3 hours ago

Backup processing_history

create table `clingen-dev.clinvar_ingest_backup_20241203.processing_history`
as select * from `clingen-dev.clinvar_ingest.processing_history`;
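
A minimal sketch of how the backup statement above could be generated for any date, assuming the `<dataset>_backup_<YYYYMMDD>` dataset naming seen here is the convention being followed. The helper names (`backup_table_id`, `backup_statement`, `row_count_check`) are hypothetical and not part of clinvar-ingest. It also assumes the backup dataset already exists, since `create table` does not create the dataset.

```python
from datetime import date


def backup_table_id(project: str, dataset: str, table: str, day: date) -> str:
    """Build the backup table ID: <project>.<dataset>_backup_<YYYYMMDD>.<table>."""
    return f"{project}.{dataset}_backup_{day:%Y%m%d}.{table}"


def backup_statement(project: str, dataset: str, table: str, day: date) -> str:
    """Return a CREATE TABLE ... AS SELECT statement that copies the table."""
    source = f"{project}.{dataset}.{table}"
    target = backup_table_id(project, dataset, table, day)
    return f"create table `{target}` as select * from `{source}`;"


def row_count_check(project: str, dataset: str, table: str, day: date) -> str:
    """Return a query comparing row counts of the source and the backup.

    Hypothetical sanity check to run before dropping the source table.
    """
    source = f"{project}.{dataset}.{table}"
    target = backup_table_id(project, dataset, table, day)
    return (
        f"select (select count(*) from `{source}`) as source_rows, "
        f"(select count(*) from `{target}`) as backup_rows;"
    )


if __name__ == "__main__":
    args = ("clingen-dev", "clinvar_ingest", "processing_history", date(2024, 12, 3))
    print(backup_statement(*args))
    print(row_count_check(*args))
```

Running the count check before the drop step gives a cheap confirmation that the backup is complete.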

Delete processing_history

drop table `clingen-dev.clinvar_ingest.processing_history`;
drop view `clingen-dev.clinvar_ingest.processing_history_pairs`;

(Both will be re-created automatically on the next run of either the clinvar-ingest or the bq-ingest workflow.)

Test on a single VCV and RCV release

I'm picking a release we've already run.

VCV and RCV release date: 2024-10-27

VCV ftp watcher record:

[{"Name":"ClinVarVCVRelease_2024-1027.xml.gz","Size":4120656527,"Released":"2024-10-28 06:11:11","Last Modified":"2024-10-28 06:11:11","Directory":"\/pub\/clinvar\/xml\/weekly_release","Host":"https:\/\/ftp.ncbi.nlm.nih.gov","Release Date":"2024-10-27"}]

RCV ftp watcher record:

[{"Name":"ClinVarRCVRelease_2024-1027.xml.gz","Size":4561188864,"Released":"2024-10-28 06:11:13","Last Modified":"2024-10-28 06:11:13","Directory":"\/pub\/clinvar\/xml\/RCV_release\/weekly_release","Host":"https:\/\/ftp.ncbi.nlm.nih.gov","Release Date":"2024-10-27"}]
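
A small sketch for sanity-checking the two ftp watcher records above before running them. It assumes that a VCV and an RCV record belong together when their `Release Date` fields match, and that the download URL is `Host` + `Directory` + `/` + `Name`. Both are assumptions made for illustration; the actual `processing_history_pairs` view may pair on other columns. The function names are hypothetical.

```python
import json

# Raw strings keep the JSON "\/" escapes intact; json.loads decodes them to "/".
VCV_RECORD = r'''[{"Name":"ClinVarVCVRelease_2024-1027.xml.gz","Size":4120656527,"Released":"2024-10-28 06:11:11","Last Modified":"2024-10-28 06:11:11","Directory":"\/pub\/clinvar\/xml\/weekly_release","Host":"https:\/\/ftp.ncbi.nlm.nih.gov","Release Date":"2024-10-27"}]'''
RCV_RECORD = r'''[{"Name":"ClinVarRCVRelease_2024-1027.xml.gz","Size":4561188864,"Released":"2024-10-28 06:11:13","Last Modified":"2024-10-28 06:11:13","Directory":"\/pub\/clinvar\/xml\/RCV_release\/weekly_release","Host":"https:\/\/ftp.ncbi.nlm.nih.gov","Release Date":"2024-10-27"}]'''


def load_record(raw: str) -> dict:
    """Parse a watcher payload and return its single record."""
    records = json.loads(raw)
    if len(records) != 1:
        raise ValueError(f"expected exactly one record, got {len(records)}")
    return records[0]


def is_pair(vcv: dict, rcv: dict) -> bool:
    """Assumed pairing rule: both files share the same release date."""
    return vcv["Release Date"] == rcv["Release Date"]


def file_url(record: dict) -> str:
    """Assumed URL layout: host, then directory, then file name."""
    return f'{record["Host"]}{record["Directory"]}/{record["Name"]}'


if __name__ == "__main__":
    vcv = load_record(VCV_RECORD)
    rcv = load_record(RCV_RECORD)
    print("paired:", is_pair(vcv, rcv))
    print(file_url(vcv))
    print(file_url(rcv))
```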