This repo contains all the code used to create and maintain the CPIC data model. This powers the API and the database.
If you're looking to use the REST API or get a copy of the database, go read the documentation.
If you want more information about using the API or database, read the documentation.
If you found a bug or need to discuss something, submit an issue (requires a GitHub account).
If you want to get a copy of the raw data or code, check the releases.
You probably don't need to read the rest of this.
This section and the next apply only if you want to build the database from scratch. If you're importing a pre-built database export or using the API, you don't need to do any of this. However, if you're interested in seeing an example of how to work with the database in Java code, follow along.
This project assumes you're running a Postgres 11+ database for loading/querying data.
Configuration happens with environment variables. Here's what needs to be set:
Each has a default: localhost, cpic, and cpic (the last could also be cpic_staging).
For local development you won't need to specify these. Set them if you're running in a different environment like the production or staging servers.
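As a rough sketch of how a wrapper script might fall back to a default when a variable is unset, the snippet below uses shell parameter expansion. The name EXAMPLE_DB_HOST and the helper with_default are hypothetical and used only for illustration; they are not the names this project reads.

```shell
# with_default VALUE DEFAULT: print VALUE, or DEFAULT when VALUE is unset or empty
with_default() {
  printf '%s\n' "${1:-$2}"
}

# EXAMPLE_DB_HOST is a hypothetical variable name, not one defined by this project.
db_host="$(with_default "${EXAMPLE_DB_HOST:-}" localhost)"
echo "connecting to host: ${db_host}"
```

With nothing set, the helper prints the default; exporting the variable before running the script overrides it.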
Some steps below require a compiled version (jar) of this project. Use Gradle to build the jar file:
./gradlew jar
or, if you're on Windows:
gradlew.bat jar
This will place a compiled "fat" jar (includes all dependencies) in the build/libs directory.
If you have an export of the database you do not need to do this. The export has all structure and data already. This section is for creating a bootstrap, mostly-empty version of the database.
This project uses Flyway to set up the DB. Schema definition files are in the src/resources/db/migration directory.
Run the following to build the db:
java -cp build/libs/CpicData.jar org.cpicpgx.db.FlywayMigrate
There are multiple entity-specific data files, each with its own importer class. The entry points to load gene-specific data are in the org.cpicpgx.importer package. Check the javadocs on the individual importer classes for command-line parameters.
To load all data at once, use the DataImport class. This takes a -d parameter that is a directory with the following sub-folders containing Excel files:
Then put that jar on the classpath and run the org.cpicpgx.DataImport class:
java -cp build/libs/CpicData.jar org.cpicpgx.DataImport -d <PATH_TO_DATA_DIRECTORY>
To export file artifacts of compiled data in the database, use the DataArtifactArchive class. It expects a command-line argument of a directory to write to. By default, it will write to a subdirectory with a datestamped name. Inside that folder will be subfolders for the different types of exported data.
java -cp build/libs/CpicData.jar org.cpicpgx.DataArtifactArchive -d <PATH_TO_EXISTING_DIRECTORY>
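Since the placeholder suggests the target directory must already exist, a small wrapper can create it before running the export. This is a minimal sketch: the helper ensure_dir, the OUT_DIR variable, and the cpic-artifacts location are hypothetical choices, and the java command is left as a comment so the snippet only creates the directory.

```shell
# ensure_dir DIR: create DIR (and any missing parents), then print its path
ensure_dir() {
  mkdir -p "$1" && printf '%s\n' "$1"
}

# Hypothetical output location; any writable directory works.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/cpic-artifacts}"
ensure_dir "$OUT_DIR"

# Then, from the repo root, run the export against that directory:
# java -cp build/libs/CpicData.jar org.cpicpgx.DataArtifactArchive -d "$OUT_DIR"
```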
This system relies on postgrest to run the API. The executable can be downloaded from the postgrest website or installed through a package manager. To run the API you can use the make target:
make api
This assumes two things:
postgrest is in your $PATH
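One way to confirm the $PATH assumption before running the make target is a quick lookup with the shell's command -v builtin. The helper name have_cmd below is hypothetical; the sketch only reports what it finds and does not start the API.

```shell
# have_cmd NAME: succeed if NAME resolves to a command on $PATH
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd postgrest; then
  echo "postgrest found at: $(command -v postgrest)"
else
  echo "postgrest not found on PATH; install it before running make api"
fi
```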
To check for dependencies that require updates due to registered vulnerabilities:
./gradlew dependencyCheckAnalyze
You'll see terminal output after a couple of minutes and an HTML report will be generated in build/reports.
To check for all dependency updates:
./gradlew dependencyUpdates -DoutputFormatter=html
You'll see terminal output and an HTML report will be in build/dependencyUpdates.
To update the Gradle wrapper for the project:
./gradlew wrapper --gradle-version <new version>