kipoi / kipoi-veff

Variant effect prediction plugin for Kipoi
https://kipoi.org/veff-docs
MIT License

indel support: dataloader-utility: MutationDatasetMixin #3

Open krrome opened 6 years ago

krrome commented 6 years ago

A mixin for kipoi.data.*Dataset classes that defines a function returning the model input for both alleles (and, optionally, their reverse complements).

The difference with the MutationDatasetMixin methods is that the key inputs is replaced by the keys inputs_ref and inputs_alt (optionally also inputs_ref_rc and inputs_alt_rc). These all share an identical structure, and their data corresponds to the reference and alternative allele (optionally also reverse-complemented) versions of the model input data.
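To make "identical structure" concrete, here is a minimal sketch of a check that two input objects have the same nesting, keys and lengths while their leaf values may differ. The helper name same_structure is hypothetical and not part of kipoi or kipoi-veff; it only assumes that inputs are built from dicts, lists/tuples and leaf values.

```python
def same_structure(a, b):
    """Return True if a and b have the same nesting, keys and lengths.

    Hypothetical helper for illustration; leaf values are allowed to differ,
    only their types must match.
    """
    if type(a) is not type(b):
        return False
    if isinstance(a, dict):
        # Same set of keys, and the same structure under every key.
        return a.keys() == b.keys() and all(same_structure(a[k], b[k]) for k in a)
    if isinstance(a, (list, tuple)):
        # Same length, and the same structure element by element.
        return len(a) == len(b) and all(same_structure(x, y) for x, y in zip(a, b))
    # Leaf: the types already match, the values are expected to differ.
    return True
```

A dataloader test could call this on inputs_ref against each of the other inputs_* entries.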

So, in addition to the current Dataloader output schema:

{ 
    "inputs": <some_obj>, 
    "targets": <some_obj>, 
    "metadata": {...}
}

there will be a method returning dictionaries of the form:

{ 
    "inputs_ref": <some_obj>, 
    "inputs_alt": <some_obj>, 
    "inputs_ref_rc": <some_obj>, 
    "inputs_alt_rc": <some_obj>, 
    "targets": <some_obj>, 
    "metadata": {...}
}

All relationships between inputs and metadata etc. have to hold identically for the newly defined inputs_* keys.
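A rough sketch of how such a mixin could look, to make the proposal easier to discuss. Everything here is an assumption for illustration: the method name get_mutated_item, the hooks apply_variant and reverse_complement_inputs, the (pos, ref, alt) variant tuple and the toy dataset are all hypothetical, and nothing in it is the actual kipoi or kipoi-veff API. The toy dataset is a plain class that serves raw sequence strings, so no kipoi import is needed.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


class MutationDatasetMixin:
    """Hypothetical sketch of the proposed mixin (not the real implementation).

    The host class must provide __getitem__ returning the usual
    inputs / targets / metadata dict, plus the two hooks
    apply_variant() and reverse_complement_inputs().
    """

    def get_mutated_item(self, idx, variant, with_rc=False):
        item = self[idx]
        ref = item["inputs"]
        alt = self.apply_variant(ref, item["metadata"], variant)
        out = {
            "inputs_ref": ref,
            "inputs_alt": alt,
            "targets": item.get("targets"),
            "metadata": item["metadata"],  # unchanged, so it still matches inputs_*
        }
        if with_rc:
            out["inputs_ref_rc"] = self.reverse_complement_inputs(ref)
            out["inputs_alt_rc"] = self.reverse_complement_inputs(alt)
        return out


class ToySeqDataset(MutationDatasetMixin):
    """Toy host dataset: one sequence string per item, start stored in metadata."""

    def __init__(self, seqs, starts):
        self.seqs, self.starts = seqs, starts

    def __len__(self):
        return len(self.seqs)

    def __getitem__(self, idx):
        return {
            "inputs": {"seq": self.seqs[idx]},
            "targets": None,
            "metadata": {"start": self.starts[idx]},
        }

    def apply_variant(self, inputs, metadata, variant):
        pos, ref, alt = variant  # pos uses the same coordinates as metadata["start"]
        off = pos - metadata["start"]
        seq = inputs["seq"]
        if seq[off:off + len(ref)] != ref:
            raise ValueError("reference allele does not match the sequence")
        return {"seq": seq[:off] + alt + seq[off + len(ref):]}

    def reverse_complement_inputs(self, inputs):
        return {"seq": reverse_complement(inputs["seq"])}
```

With ds = ToySeqDataset(["ACGTAC"], [100]), calling ds.get_mutated_item(0, (102, "G", "T"), with_rc=True) returns all four inputs_* keys next to the untouched targets and metadata. For indels the toy apply_variant simply lets the sequence length change; how a real dataloader should pad or trim in that case is left open here.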