ML-Bioinfo-CEITEC / genomic_benchmarks

Benchmarks for classification of genomic sequences
Apache License 2.0
107 stars 14 forks

Add the possibility to create a custom dataset #16

Closed: simecek closed this 2 years ago

simecek commented 2 years ago

Given one or more BED files, provide a function that converts them into an interval-type dataset (i.e., converts each BED file into a gzipped CSV file and performs a train/test split). Optionally, randomly generate negative controls.
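A rough sketch of what such a function could look like, using only the Python standard library. Everything named here is hypothetical and not part of the package's API: `bed_to_interval_dataset`, its parameters, the `id, region, start, end, strand` column names, and the `<out_dir>/<split>/<class>.csv.gz` layout are all assumptions made for illustration. Each BED file is treated as one class, and negative controls are drawn as random intervals with the same chromosome and length as the positives, without checking for overlap with them.

```python
import csv
import gzip
import random
from pathlib import Path


def read_bed(path):
    """Read chrom, start, end (and strand, if present) from a BED file."""
    rows = []
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and BED header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            strand = fields[5] if len(fields) > 5 else "+"  # assumed default
            rows.append((fields[0], int(fields[1]), int(fields[2]), strand))
    return rows


def random_negatives(rows, chrom_sizes, rng):
    """Draw one random interval per row: same chromosome, same length.

    Note: no check is made that a negative avoids the positive intervals.
    """
    negatives = []
    for chrom, start, end, strand in rows:
        length = end - start
        new_start = rng.randint(0, chrom_sizes[chrom] - length)
        negatives.append((chrom, new_start, new_start + length, strand))
    return negatives


def write_split(rows, class_name, out_dir, test_fraction, rng):
    """Shuffle rows and write <out_dir>/{train,test}/<class_name>.csv.gz."""
    rows = list(rows)
    rng.shuffle(rows)
    n_test = int(round(len(rows) * test_fraction))
    splits = {"test": rows[:n_test], "train": rows[n_test:]}
    for split, split_rows in splits.items():
        target = Path(out_dir) / split / f"{class_name}.csv.gz"
        target.parent.mkdir(parents=True, exist_ok=True)
        with gzip.open(target, "wt", newline="") as handle:
            writer = csv.writer(handle)
            # Column names are an assumption, not the package's confirmed schema.
            writer.writerow(["id", "region", "start", "end", "strand"])
            for i, row in enumerate(split_rows):
                writer.writerow([i, *row])


def bed_to_interval_dataset(bed_files, out_dir, test_fraction=0.2,
                            chrom_sizes=None, seed=42):
    """Convert {class_name: bed_path} into gzipped CSVs with a train/test split.

    If chrom_sizes ({chrom: length}) is given, also write a "negative" class.
    """
    rng = random.Random(seed)
    all_rows = []
    for class_name, path in bed_files.items():
        rows = read_bed(path)
        all_rows.extend(rows)
        write_split(rows, class_name, out_dir, test_fraction, rng)
    if chrom_sizes is not None:
        negatives = random_negatives(all_rows, chrom_sizes, rng)
        write_split(negatives, "negative", out_dir, test_fraction, rng)
```

A call might look like `bed_to_interval_dataset({"peaks": "peaks.bed"}, "my_dataset", chrom_sizes={"chr1": 248956422})`, where the file name and output directory are placeholders. Passing chromosome sizes explicitly keeps the sketch independent of any particular reference genome; a real implementation would likely also want to exclude negatives that overlap the positive intervals.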