Following is a brief introduction on how to build the DNA methylation database schema and tables. The implementation of these procedures may require further discussions and modifications, to coordinate with the development of DNA methylation API plot and table endpoints.
Add a new DNA methylation database schema to store all DNA methylation tables, to avoid table name collisions with existing tables in bulk_expression schema. The DNA methylation schema can have any appropriate name, and the following description uses dna_methylation as an example schema name. Following are the steps to add in a new dna_methylation schema:
CREATE SCHEMA ${DNA_METHYLATION_SCHEMA};
GRANT SELECT ON ALL TABLES IN SCHEMA ${DNA_METHYLATION_SCHEMA} TO ${DB_USERNAME};
ALTER DEFAULT PRIVILEGES IN SCHEMA ${DNA_METHYLATION_SCHEMA} GRANT SELECT ON TABLES TO ${DB_USERNAME};
GRANT USAGE ON SCHEMA ${DNA_METHYLATION_SCHEMA} TO ${DB_USERNAME};
GRANT SELECT ON ALL TABLES IN SCHEMA ${DNA_METHYLATION_SCHEMA} TO ${DB_USERNAME};
ALTER DEFAULT PRIVILEGES IN SCHEMA ${DNA_METHYLATION_SCHEMA} GRANT SELECT ON TABLES TO ${DB_USERNAME};
Add new DNA methylation tables to store DNA methylation results that are required to generate plots and tables for MTP. The DNA methylation tables can have any appropriate names, and the following description uses dna_methylation_summary as an example table name. Following are the steps to add a new dna_methylation_summary table:
# dna_methylation_summary_tbl is assumed to be the table for loading into the database.
#
# dna_methylation_summary_tbl does not have to be the full table, and it can only have zero or a few lines.
db_write_table(
dplyr::slice(dna_methylation_summary_tbl, 0), conn,
tolower(db_env_vars$DNA_METHYLATION_SCHEMA),
tolower(db_env_vars$DNA_METHYLATION_SUMMARY_TBL))
Upload the prepared CSV file to a S3 bucket as new OpenPedCan-analysis data release.
The prepared CSV file will be downloaded to OpenPedCan-analysis/data/ by db/build_db.sh.
If prepared in db/build_tools/build_db.R:
Upload a result file to a S3 bucket as new OpenPedCan-analysis data release. The result file can be in any format, which needs further preparation before loading into the database.
The result file will be downloaded to OpenPedCan-analysis/data/ by db/build_db.sh.
Load the result file in db/build_tools/build_db.R.
Prepare the table for database loading, such as filtering, renaming columns, and reordering columns.
Build DNA methylation database schema and tables using results generated from https://github.com/PediatricOpenTargets/OpenPedCan-analysis/pull/165. According to @ewafula, the results in https://github.com/PediatricOpenTargets/OpenPedCan-analysis/pull/165 are still under preparation.
Following is a brief introduction on how to build the DNA methylation database schema and tables. The implementation of these procedures may require further discussions and modifications, to coordinate with the development of DNA methylation API plot and table endpoints.
Add a new DNA methylation database schema to store all DNA methylation tables, to avoid table name collisions with existing tables in
bulk_expression
schema. The DNA methylation schema can have any appropriate name, and the following description usesdna_methylation
as an example schema name. Following are the steps to add in a newdna_methylation
schema:Add
DNA_METHYLATION_SCHEMA=dna_methylation
in../OpenPedCan-api-secrets/common_db.env
.Add
ENV DNA_METHYLATION_SCHEMA="dna_methylation"
inDockerfile
.Add the following code in
db/init_db.sh
:Add
DNA_METHYLATION_SCHEMA = "DNA_METHYLATION_SCHEMA"
indb/r_interfaces/db_env_vars.R
.Add the following code in
db/load_db.sh
:Add
--schema="$DNA_METHYLATION_SCHEMA"
indb/build_tools/build_db_docker_cmd.sh
to includeDNA_METHYLATION_SCHEMA
in the database dump.Add new DNA methylation tables to store DNA methylation results that are required to generate plots and tables for MTP. The DNA methylation tables can have any appropriate names, and the following description uses
dna_methylation_summary
as an example table name. Following are the steps to add a newdna_methylation_summary
table:Add
DNA_METHYLATION_SUMMARY_TBL=dna_methylation_summary
in../OpenPedCan-api-secrets/common_db.env
.Add
ENV DNA_METHYLATION_SUMMARY_TBL="dna_methylation_summary"
inDockerfile
.Add
DNA_METHYLATION_SUMMARY_TBL = "DNA_METHYLATION_SUMMARY_TBL"
indb/r_interfaces/db_env_vars.R
.Create an empty table in
db/build_tools/build_db.R
:Prepare a CSV file for loading into the database. The CSV files can either be prepared in https://github.com/PediatricOpenTargets/OpenPedCan-analysis/pull/165 or
db/build_tools/build_db.R
.OpenPedCan-analysis/data/
bydb/build_db.sh
.db/build_tools/build_db.R
:OpenPedCan-analysis/data/
bydb/build_db.sh
.db/build_tools/build_db.R
.db/build_tools/build_db.R
.Add the following code in
db/build_tools/build_db_docker_cmd.sh
to load the prepared CSV file to the database:Add the following code in
db/build_tools/build_db_docker_cmd.sh
to add a table index on the column that will be queried:The updated database building procedure can be tested using
./db/build_db.sh
.cc @afarrel @chinwallaa