PacificBiosciences / HiFi-human-WGS-WDL

BSD 3-Clause Clear License

DeepVariant Model #139

Closed Giddoo closed 3 months ago

Giddoo commented 3 months ago

I am trying to run the WDL workflow on a PacBio Revio test sample, and I would like to include the PacBio HiFi model even though it is optional. How do I access the files (data, data index, and metadata) for this model?

williamrowell commented 3 months ago

The default model packaged directly in the DeepVariant docker image is the best available DeepVariant v1.5 PacBio model for human WGS. If you delete the "humanwgs.deepvariant_model" key/value in the templates, the default model will be used. It is very rarely necessary to provide a custom model and override the default PacBio model for DeepVariant v1.5.
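As a rough illustration of the "delete the key" step, here is a minimal Python sketch that strips the `humanwgs.deepvariant_model` entry from an inputs JSON so the default model is used. The function names and file paths are hypothetical and not part of the workflow; it only assumes the inputs file is a flat JSON object keyed by input name, as in the templates.

```python
import json
from pathlib import Path

# Key named in the comment above; everything else here is illustrative.
MODEL_KEY = "humanwgs.deepvariant_model"


def drop_custom_model(inputs: dict) -> dict:
    """Return a copy of the inputs without the custom DeepVariant model entry."""
    return {key: value for key, value in inputs.items() if key != MODEL_KEY}


def rewrite_inputs(template_path: str, output_path: str) -> None:
    """Read an inputs JSON, remove the model entry, and write the result."""
    inputs = json.loads(Path(template_path).read_text())
    cleaned = drop_custom_model(inputs)
    Path(output_path).write_text(json.dumps(cleaned, indent=2) + "\n")
```

Calling `rewrite_inputs("template.inputs.json", "inputs.json")` (both paths are placeholders) would leave every other input untouched and only remove the model override.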

Any models for newer versions of DeepVariant (e.g. v1.6.*) are incompatible with this workflow release because of changes to the DeepVariant command calls and model structure. The only reason to provide a custom model is for testing new models. No further support will be provided for custom DeepVariant models in this repo.

The structure of a DeepVariant v1.5 model directory in the Docker image is:

> ls -lah /opt/models/pacbio/
total 366M
drwxr-xr-x 2 root root  148 Feb 25  2023 .
drwxr-xr-x 7 root root  145 Feb 25  2023 ..
-rw-r--r-- 1 root root 333M Feb 24  2023 model.ckpt.data-00000-of-00001
-rw-r--r-- 1 root root   86 Feb 24  2023 model.ckpt.example_info.json
-rw-r--r-- 1 root root  19K Feb 24  2023 model.ckpt.index
-rw-r--r-- 1 root root  33M Feb 24  2023 model.ckpt.meta

The three files to reference are the data, index, and meta files. An example of the input struct:

"humanwgs.deepvariant_model": {
  "model": {
    "data": "/path/on/hpc/host/to/model.ckpt.data-00000-of-00001",
    "data_index": "/path/on/hpc/host/to/model.ckpt.index"
  },
  "metadata": "/path/on/hpc/host/to/model.ckpt.meta"
}
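If you do need to test a custom model, a small helper can build this struct from a model directory and catch missing files before the workflow starts. This is only a sketch: the function name and the `prefix` parameter are hypothetical, and it assumes the directory holds the three files under the same names as in the listing above.

```python
import json
from pathlib import Path


def deepvariant_model_struct(model_dir: str, prefix: str = "model.ckpt") -> dict:
    """Build the humanwgs.deepvariant_model input from a model directory."""
    base = Path(model_dir)
    files = {
        "data": base / f"{prefix}.data-00000-of-00001",
        "data_index": base / f"{prefix}.index",
        "metadata": base / f"{prefix}.meta",
    }
    # Fail early if any of the three referenced files is absent.
    missing = [str(path) for path in files.values() if not path.is_file()]
    if missing:
        raise FileNotFoundError("Missing model files: " + ", ".join(missing))
    return {
        "humanwgs.deepvariant_model": {
            "model": {
                "data": str(files["data"]),
                "data_index": str(files["data_index"]),
            },
            "metadata": str(files["metadata"]),
        }
    }
```

For example, `print(json.dumps(deepvariant_model_struct("/path/on/hpc/host/to"), indent=2))` would print a fragment shaped like the one above, which can then be merged into the inputs JSON.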

This section of the DeepVariant v1.5.0 Dockerfile puts the models into the Docker container: https://github.com/google/deepvariant/blob/ab068c4588a02e2167051bd9e74c0c9579462b51/Dockerfile#L168-L201