EnsemblGSOC / Ensembl-Repeat-Identification

A deep learning repository for predicting the location and type of repeat sequences in genomes.

dataset statistics #29

Closed williamstark01 closed 2 years ago

williamstark01 commented 2 years ago

A Jupyter Notebook with some statistics from the hg38 hits dataset. It can be extended as needed.

williamstark01 commented 2 years ago

Attaching an HTML export of the notebook for easy viewing: dataset_statistics.html.zip

yangtcai commented 2 years ago

LGTM! I will add some repeat analysis to the notebook later :D