PediatricOpenTargets / OpenPedCan-api


📫Implement `/tpm/gene-disease-gtex/json` and `/tpm/gene-disease-gtex/plot` API endpoints #12

Closed logstar closed 3 years ago

logstar commented 3 years ago

Pull Request Template

Description

This PR updates https://github.com/PediatricOpenTargets/OpenPedCan-api/pull/11 so that the changes can be deployed on the dev server at https://openpedcan-api-dev.d3b.io/__docs__/. The dev server only deploys PRs from the main PediatricOpenTargets/OpenPedCan-api repository.

Implemented the HTTP GET methods of the /tpm/gene-disease-gtex/json and /tpm/gene-disease-gtex/plot endpoints. These two endpoints handle HTTP requests for the OpenPedCan-analysis cancer_group and gtex_subgroup boxplot and summary table, according to the API specifications in https://nih.box.com/s/5cq2jwi6bhg0mgnowad3e6e4i60hwbnr.

When a new version of the OpenPedCan-api server is deployed, all genes and all samples are loaded; the specific numbers are listed below. Together they occupy a constant 7.5 GB of memory. HTTP requests are handled sequentially, and each request temporarily uses an extra ~0.3 GB of memory. The dev server has a 10 GB memory limit.

---------------------------------
 2021-08-31 16:46:50
 Primary tumor all-cohorts independent n samples:  1948
 Primary tumor each-cohort independent n samples:  1963
 GTEx all n samples:  17382
 Number of genes:  38939
---------------------------------
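
As a quick sanity check on the memory figures in this description, the arithmetic can be sketched as follows. This only restates the numbers given above; the variable names are illustrative, the ~0.3 GB per-request figure is approximate, and 1 GB is treated as 1000 MB purely to keep the integers round.

```shell
#!/usr/bin/env bash
# Memory budget sketch using the figures from the PR description.
# Assumption: 1 GB = 1000 MB, chosen only for round integer arithmetic.
resident_mb=7500      # data held in memory for the lifetime of the server
per_request_mb=300    # approximate temporary memory used by one request
limit_mb=10000        # dev server memory limit

# Requests are handled sequentially, so at most one request is in flight.
peak_mb=$((resident_mb + per_request_mb))
headroom_mb=$((limit_mb - peak_mb))

echo "peak: ${peak_mb} MB, headroom: ${headroom_mb} MB"
```

Under these assumptions the peak stays below the dev server limit, which is consistent with handling one request at a time.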

The /tpm/gene-all-cancer/json and /tpm/gene-all-cancer/plot endpoints are placeholders, which will be implemented next. More development action items are described in the "API Development roadmap" section of README.md.

How Has This Been Tested?

Test Configuration:

Working directory is the git repository root directory, i.e. the directory
that contains the .git directory of the repository.

ubuntu 18
docker 19.03
curl 7.78
sha256sum 8.28 # shasum for Mac OS, but not tested for any version.
R 4.1
R package readr 1.4.0
R package jsonlite 1.7.2
R package lintr 2.0.1
# working directory is the project directory (the directory that contains .git of this git repo)
#
# git should check out this PR branch
bash tests/run_r_lintr.sh

docker build --no-cache -t open-ped-can-api .
docker run --rm -p 8082:80 -e DEBUG=1 open-ped-can-api

bash tests/curl_test_endpoints.sh

For more details on test options and resources, see "Test run OpenPedCan-api server locally" section of README.md.
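
For reviewers who want to probe a single endpoint by hand, here is a minimal sketch that reports http_code, content_type and time_total for one request. It is not the contents of tests/curl_test_endpoints.sh: the function names build_url and probe, and the BASE_URL and RUN_PROBE variables, are hypothetical. It assumes the server is listening on localhost port 8082, as in the docker run command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the repository's tests/curl_test_endpoints.sh.
base_url="${BASE_URL:-http://localhost:8082}"

build_url() {
  # $1 = endpoint path, $2 = ensemblId, $3 = optional efoId
  local url="${base_url}$1?ensemblId=$2"
  if [ -n "${3:-}" ]; then
    url="${url}&efoId=$3"
  fi
  printf '%s\n' "$url"
}

probe() {
  # Send one GET request, discard the body, and print three summary fields.
  local url="$1"
  printf 'GET %s\n' "$url"
  curl -s -o /dev/null \
    -w 'http_code: %{http_code}\ncontent_type: %{content_type}\ntime_total: %{time_total} seconds\n' \
    "$url"
}

# Only contact the server when explicitly requested, e.g. RUN_PROBE=1.
if [ "${RUN_PROBE:-0}" = "1" ]; then
  probe "$(build_url /tpm/gene-disease-gtex/json ENSG00000213420 EFO_0000621)"
fi
```

Running it with RUN_PROBE=1 while the container is up sends one request; without that variable it only defines the helpers.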

Terminal returns:

$ ./tests/curl_test_endpoints.sh 

# ... 20 blank lines to separate from previous commands

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000213420&efoId=EFO_0000621
http_code: 200
content_type: application/json
time_total: 1.162424 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000213420&efoId=EFO_0000621
http_code: 200
content_type: image/png
time_total: 2.420127 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000213420&efoId=EFO_0005543
http_code: 200
content_type: application/json
time_total: 0.784063 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000213420&efoId=EFO_0005543
http_code: 200
content_type: image/png
time_total: 1.827895 seconds

GET http://localhost:8082/tpm/gene-all-cancer/json?ensemblId=ENSG00000213420
http_code: 200
content_type: application/json
time_total: 0.004517 seconds

GET http://localhost:8082/tpm/gene-all-cancer/plot?ensemblId=ENSG00000213420
http_code: 200
content_type: image/png
time_total: 0.366855 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000157764&efoId=EFO_0000621
http_code: 200
content_type: application/json
time_total: 0.598361 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000157764&efoId=EFO_0000621
http_code: 200
content_type: image/png
time_total: 1.917991 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000157764&efoId=EFO_0005543
http_code: 200
content_type: application/json
time_total: 0.716320 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000157764&efoId=EFO_0005543
http_code: 200
content_type: image/png
time_total: 2.043048 seconds

GET http://localhost:8082/tpm/gene-all-cancer/json?ensemblId=ENSG00000157764
http_code: 200
content_type: application/json
time_total: 0.004528 seconds

GET http://localhost:8082/tpm/gene-all-cancer/plot?ensemblId=ENSG00000157764
http_code: 200
content_type: image/png
time_total: 0.391903 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000273032&efoId=EFO_0000621
http_code: 200
content_type: application/json
time_total: 0.842756 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000273032&efoId=EFO_0000621
http_code: 200
content_type: image/png
time_total: 2.232956 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000273032&efoId=EFO_0005543
http_code: 200
content_type: application/json
time_total: 0.862582 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000273032&efoId=EFO_0005543
http_code: 200
content_type: image/png
time_total: 2.001169 seconds

GET http://localhost:8082/tpm/gene-all-cancer/json?ensemblId=ENSG00000273032
http_code: 200
content_type: application/json
time_total: 0.004363 seconds

GET http://localhost:8082/tpm/gene-all-cancer/plot?ensemblId=ENSG00000273032
http_code: 200
content_type: image/png
time_total: 0.370939 seconds
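
To compare endpoints at a glance, output in this format can be summarized per endpoint path. The following is a rough sketch, assuming the log keeps the GET / http_code / time_total line layout shown above. The function name summarize_log is hypothetical, and the sample input is the first two gene-disease-gtex request pairs for ENSG00000213420 copied from this output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mean time_total per endpoint path, plus a count of
# responses whose http_code is not 200.
summarize_log() {
  awk '
    /^GET / {
      split($2, parts, "?")                  # drop the query string
      endpoint = parts[1]
      sub("^[a-z]+://[^/]*", "", endpoint)   # drop scheme, host and port
    }
    /^http_code: /  { if ($2 != 200) bad++ }
    /^time_total: / { sum[endpoint] += $2; count[endpoint]++ }
    END {
      for (e in sum)
        printf "%s mean_time_total=%.3f n=%d\n", e, sum[e] / count[e], count[e]
      printf "non_200=%d\n", bad
    }
  ' | LC_ALL=C sort
}

summary="$(summarize_log <<'EOF'
GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000213420&efoId=EFO_0000621
http_code: 200
content_type: application/json
time_total: 1.162424 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000213420&efoId=EFO_0000621
http_code: 200
content_type: image/png
time_total: 2.420127 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/json?ensemblId=ENSG00000213420&efoId=EFO_0005543
http_code: 200
content_type: application/json
time_total: 0.784063 seconds

GET http://localhost:8082/tpm/gene-disease-gtex/plot?ensemblId=ENSG00000213420&efoId=EFO_0005543
http_code: 200
content_type: image/png
time_total: 1.827895 seconds
EOF
)"
echo "$summary"
```

The same function could be fed the full script output through a pipe, for example `./tests/curl_test_endpoints.sh | summarize_log`.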

Note for reviewers

This PR may not need to be split into multiple stepwise PRs, because all files are required to test-run it.

There are also only 915 lines of implementation code, and nearly half of them are comments and assertions.

$ wc -l ./*/*.R
  306 ./db/tpm_data_lists.R
   29 ./src/add_gene_tpm_box_group.R
   81 ./src/get_gene_tpm_boxplot.R
   49 ./src/get_gene_tpm_boxplot_summary_tbl.R
   60 ./src/get_gene_tpm_boxplot_tbl.R
  235 ./src/get_gene_tpm_tbl.R
   40 ./src/ggplot2_publication_theme.R
  115 ./src/plumber.R
  915 total
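
One rough way to check the share of comment lines is to count lines whose first non-blank character is `#`. This is a sketch only: the function name count_comment_lines is hypothetical, the toy R file below is not from the repository, and assertion lines such as stopifnot() calls are not counted by this approach.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count comment-only lines in an R source file.
count_comment_lines() {
  # Matches lines whose first non-blank character is '#'.
  grep -c -E '^[[:space:]]*#' "$1"
}

# Toy input written to a temporary file; not taken from the repository.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
# Toy R file for illustration
add_one <- function(x) {
  # assertion-style check
  stopifnot(is.numeric(x))
  x + 1
}
EOF

total="$(wc -l < "$tmp" | tr -d ' ')"
comments="$(count_comment_lines "$tmp")"
echo "comment lines: ${comments} of ${total}"
rm -f "$tmp"
```

Applied to the repository, the same function could be called in a loop over `./*/*.R` to get per-file counts alongside the `wc -l` totals.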

blackdenc commented 3 years ago

Dev environment looks stable, ready to release to QA and PRD. Ready to merge from a DevOps perspective.

logstar commented 3 years ago

@taylordm @jharenza @chinwallaa @komalsrathi @jonkiky @blackdenc - This PR will be merged soon without a comprehensive review of the code and results, so that @blackdenc can make the development/deployment environment more stable, i.e. less likely to be down, and so that new PRs can be deployed properly. This will greatly speed up subsequent API development.

However, this PR can still be reviewed. If you have any questions, suggestions, or comments, feel free to leave them as comments or reviews in this PR. I will revise accordingly.