hms-dbmi / chromoscope

Interactive multiscale visualization for structural variation in human genomes
https://chromoscope.bio/
MIT License

chore: add unit tests for python pkg #99

Closed manzt closed 1 year ago

manzt commented 1 year ago

@sehilyi - on the Python side, it might be worth looking at https://pydantic.dev/ to validate the config objects passed in.
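To make the suggestion concrete, here is a minimal sketch of what validation could look like. `SampleConfig` and `invalid_fields` are hypothetical names for illustration, the model is a trimmed-down stand-in rather than the package's real config type, and the URLs are placeholders.

```python
from typing import Literal

from pydantic import BaseModel, HttpUrl, ValidationError


class SampleConfig(BaseModel):
    # Hypothetical, trimmed-down model for illustration only.
    id: str
    assembly: Literal["hg38", "hg19"]
    sv: HttpUrl


def invalid_fields(raw: dict) -> set:
    """Return the names of the fields that fail validation (empty if valid)."""
    try:
        SampleConfig(**raw)
    except ValidationError as err:
        # Each error entry carries a "loc" tuple whose first item is the field name.
        return {str(e["loc"][0]) for e in err.errors()}
    return set()


good = {"id": "sample-1", "assembly": "hg38", "sv": "https://example.com/sv"}
bad = {"assembly": "hg18", "sv": "not-a-url"}  # missing id, bad assembly, bad URL

print(invalid_fields(good))
print(invalid_fields(bad))
```

The point is that a malformed config fails early with a per-field error report instead of surfacing later in the visualization.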

manzt commented 1 year ago

Yup, maybe have a look at servir's release workflow. You'll need to publish manually once locally (hatch run build && hatch publish). Then on PyPI you can create an API token for this repo and add it as a secret, so that releases are published from tagged commits.

manzt commented 1 year ago

Oh, the other thing to mention about pydantic is that you can generate JSON schema from the models.

from typing import Literal, Union

from pydantic import BaseModel, Field, HttpUrl

class Config(BaseModel):
    id: str
    cancer: str
    assembly: Literal["hg38", "hg19"]
    sv: HttpUrl
    cnv: HttpUrl = Field(title="CNV", description="A URL of the CNV text file (.txt).")
    drivers: Union[HttpUrl, None] = Field(
        default=None,
        title="Drivers",
        description="A URL of a file that contains drivers (.txt).",
    )
    vcf: Union[HttpUrl, None] = Field(
        default=None,
        title="VCF",
        description="A URL of the point mutation file (.vcf).",
    )
    vcfIndex: Union[HttpUrl, None] = Field(
        default=None,
        title="VCF Index",
        description="A URL of the point mutation index file (.tbi).",
    )
    vcf2: Union[HttpUrl, None] = Field(
        default=None,
        title="VCF2",
        description="A URL of the indel file (.vcf).",
    )
    vcf2Index: Union[HttpUrl, None] = Field(
        default=None,
        title="VCF2 Index",
        description="A URL of the indel index file (.tbi).",
    )
    bam: Union[HttpUrl, None] = Field(
        default=None,
        title="BAM",
        description="A URL of the BAM file (.bam)."
    )
    bamIndex: Union[HttpUrl, None] = Field(
        default=None,
        title="BAM Index",
        description="A URL of the BAM index file (.bai)."
    )
    note: Union[str, None] = Field(
        default=None,
        title="Note",
        description="A textual annotation.",
    )

if __name__ == "__main__":
    print(Config.schema_json(indent=2))

{
  "title": "Config",
  "type": "object",
  "properties": {
    "id": {
      "title": "Id",
      "type": "string"
    },
    "cancer": {
      "title": "Cancer",
      "type": "string"
    },
    "assembly": {
      "title": "Assembly",
      "enum": [
        "hg38",
        "hg19"
      ],
      "type": "string"
    },
    "sv": {
      "title": "Sv",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "cnv": {
      "title": "CNV",
      "description": "A URL of the CNV text file (.txt).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "drivers": {
      "title": "Drivers",
      "description": "A URL of a file that contains drivers (.txt).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "vcf": {
      "title": "VCF",
      "description": "A URL of the point mutation file (.vcf).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "vcfIndex": {
      "title": "VCF Index",
      "description": "A URL of the point mutation index file (.tbi).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "vcf2": {
      "title": "VCF2",
      "description": "A URL of the indel file (.vcf).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "vcf2Index": {
      "title": "VCF2 Index",
      "description": "A URL of the indel index file (.tbi).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "bam": {
      "title": "BAM",
      "description": "A URL of the BAM file (.bam).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "bamIndex": {
      "title": "BAM Index",
      "description": "A URL of the BAM index file (.bai).",
      "minLength": 1,
      "maxLength": 2083,
      "format": "uri",
      "type": "string"
    },
    "note": {
      "title": "Note",
      "description": "A textual annotation.",
      "type": "string"
    }
  },
  "required": [
    "id",
    "cancer",
    "assembly",
    "sv",
    "cnv"
  ]
}

This means you get auto-validation/parsing on the Python side:

from chromoscope import Config

unknown_config = [ ... ]

parsed = [Config(**item) for item in unknown_config]
parsed # now know the config is parsed correctly

and could equally generate validators for the JS side from the JSON schema.
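To give a feel for what a schema-derived validator checks, here is a rough stdlib-only sketch in Python that applies just the `required` and `enum` rules from a trimmed copy of the schema above. `check` is a hypothetical helper written for this example; a real setup would more likely hand the full schema to an existing JSON Schema validator on the JS side.

```python
# Trimmed copy of the generated schema: only the keys this sketch inspects.
schema = {
    "properties": {
        "assembly": {"enum": ["hg38", "hg19"], "type": "string"},
    },
    "required": ["id", "cancer", "assembly", "sv", "cnv"],
}


def check(item: dict, schema: dict) -> list:
    """Return a list of problems found in `item` (empty if none).

    Only the "required" and "enum" keywords are handled here; types,
    formats and length limits are deliberately ignored.
    """
    problems = []
    for key in schema["required"]:
        if key not in item:
            problems.append(f"missing required field: {key}")
    for key, rules in schema["properties"].items():
        if key in item and "enum" in rules and item[key] not in rules["enum"]:
            problems.append(f"{key} must be one of {rules['enum']}")
    return problems


item = {"id": "a", "cancer": "b", "assembly": "hg18", "sv": "x"}
for problem in check(item, schema):
    print(problem)
```

Because the rules live in the schema rather than in the checking code, the same JSON file can drive validation in both languages.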