aws-samples / data-lake-as-code

Data Lake as Code, featuring ChEMBL and OpenTargets
MIT No Attribution

OpenTargets: some data in the tables is missing #26

Closed by yuki04160 1 year ago

yuki04160 commented 1 year ago

Hi,

I used OpenTargets CloudFormation to create an AWS Glue catalog and queried data using Athena. However, I recently noticed that some data in the tables is missing, such as in the searchdisease, searchdrug, searchtarget, and molecule tables.

I'm certain the data was there previously, but I'm not sure why it disappeared or whether this only affects my deployment. After checking, I think the data may have been removed from the source (https://platform.opentargets.org/downloads/data) and, as a result, from the associated Glue tables as well.
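For anyone who wants to confirm the same symptom, here is a minimal sketch that builds a single Athena query reporting the row count of each affected table. The database name `opentargets` is an assumption and may differ in your Glue catalog; `row_count_query` is a hypothetical helper, not part of the repository.

```python
# Tables reported as missing data in this issue.
TABLES = ["searchdisease", "searchdrug", "searchtarget", "molecule"]


def row_count_query(database: str, tables: list[str]) -> str:
    """Build one SQL statement that returns a row count per table.

    Identifiers are double-quoted, which is how Athena quotes
    database and table names in SELECT statements.
    """
    selects = [
        f"SELECT '{t}' AS table_name, COUNT(*) AS row_count "
        f'FROM "{database}"."{t}"'
        for t in tables
    ]
    return "\nUNION ALL\n".join(selects)


if __name__ == "__main__":
    # "opentargets" is an assumed database name; replace it with yours.
    print(row_count_query("opentargets", TABLES))
```

Pasting the printed statement into the Athena console should show at a glance which tables come back with zero rows.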

Could you please check the issue? Thank you!

Best, Yuki

paulu-aws commented 1 year ago

The OpenTargets source data grew beyond the storage capacity of the baseline EC2 instance, so only part of the data was moved. We've increased the capacity for that host, and the tables are now updating as expected. Sorry for the delay in addressing this.
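As an illustration of how a partial copy like this could be caught earlier, the sketch below checks free disk space before a transfer starts and fails loudly instead of moving only what fits. It is not the code used in this repository; `has_room_for`, `ensure_room_for`, and the 10% headroom are assumptions made for the example.

```python
import shutil


def has_room_for(path: str, required_bytes: int, headroom: float = 0.10) -> bool:
    """Return True if the filesystem holding `path` has enough free space
    for `required_bytes` plus a fractional safety margin (`headroom`)."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes * (1 + headroom)


def ensure_room_for(path: str, required_bytes: int) -> None:
    """Raise before the copy starts rather than silently copying a subset."""
    if not has_room_for(path, required_bytes):
        free = shutil.disk_usage(path).free
        raise RuntimeError(
            f"Not enough space at {path!r}: need {required_bytes} bytes "
            f"plus headroom, only {free} bytes free"
        )
```

Calling `ensure_room_for` with the total size of the source dataset before each sync would turn a quiet partial load into an explicit failure that is easier to notice.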