ML-Bioinfo-CEITEC / genomic_benchmarks

Benchmarks for classification of genomic sequences
Apache License 2.0

Human ochr ensembl regulatory dataset #8

Closed: katarinagresova closed this issue 2 years ago

simecek commented 2 years ago

Check the sequence length distribution and train a basic CNN (to be sure the labels can be predicted).
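For the length check, a minimal sketch could look like the following. It assumes the sequences are already loaded as plain Python strings; the function name `length_summary` and the toy input are hypothetical and are not part of the genomic_benchmarks package.

```python
from collections import Counter
from statistics import mean, median


def length_summary(sequences):
    """Return basic statistics of the sequence lengths (hypothetical helper)."""
    lengths = [len(seq) for seq in sequences]
    if not lengths:
        raise ValueError("no sequences given")
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        # Maps each observed length to the number of sequences with that length.
        "histogram": dict(sorted(Counter(lengths).items())),
    }


# Toy input for illustration only, not real data from the dataset.
toy_sequences = ["ACGT", "ACGTAC", "AC", "ACGT"]
summary = length_summary(toy_sequences)
print(summary)
```

A very wide spread between `min` and `max` would suggest that padding or truncation choices could matter for the CNN.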

katarinagresova commented 2 years ago

The CNN is unable to learn. We will not use this dataset.

katarinagresova commented 2 years ago

There might be a problem with the PyTorch code.

katarinagresova commented 2 years ago

Sorry for so many changed files. I needed to rebase onto main to get the code for the experiments. Files important for this review are:

I was not able to make the PyTorch CNN learn anything from this data, but the TF model reached 73% accuracy.
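One way to judge whether a model has learned anything is to compare its accuracy with a majority-class baseline. The sketch below assumes the labels are available as a plain list; the function name `majority_baseline_accuracy` and the toy labels are hypothetical and do not reflect the real class balance of this dataset.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class (hypothetical helper)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels given")
    return max(counts.values()) / len(labels)


# Toy labels for illustration only, not the real class distribution.
toy_labels = [0, 0, 0, 1, 1, 2]
baseline = majority_baseline_accuracy(toy_labels)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

A model whose accuracy stays at this baseline is likely predicting a single class, which would match the "unable to learn" behaviour described for the PyTorch CNN.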