EnsemblGSOC / Ensembl-Repeat-Identification

A deep learning repository for predicting the location and type of repeat sequences in a genome.

Filter the no repeats fragments and add test stage #28

Closed yangtcai closed 2 years ago

yangtcai commented 2 years ago

I split the dataset into train and test sets, so we can see how the model performs on unseen sequences :D
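
For readers following along, a train/test split of this kind could look roughly like the sketch below. It is only an illustration, not the code from this PR: the function `split_train_test`, the column names `chromosome`, `start`, `end`, the 20% test fraction, and the toy data are all assumptions.

```python
import pandas as pd


def split_train_test(segments: pd.DataFrame, test_fraction: float = 0.2, seed: int = 0):
    """Randomly hold out a fraction of rows as a test set (hypothetical helper)."""
    # Sample the test rows reproducibly, then keep every other row for training.
    test = segments.sample(frac=test_fraction, random_state=seed)
    train = segments.drop(index=test.index)
    return train, test


# Toy data: ten 2000 bp segments on one chromosome (made-up values).
segments = pd.DataFrame(
    {
        "chromosome": ["chr1"] * 10,
        "start": [i * 2000 for i in range(10)],
        "end": [(i + 1) * 2000 for i in range(10)],
    }
)

train, test = split_train_test(segments)
print(len(train), len(test))
```

Fixing `random_state` keeps the held-out sequences the same across runs, which makes test results comparable between experiments.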

yangtcai commented 2 years ago

BTW, pandas is really great: filtering out the subsequences with no repeats takes only 5 min on a laptop, so it makes sense!!!!
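
As a rough illustration of what such a filter might look like in pandas (a sketch under assumptions, not the implementation in this PR): the helper `filter_segments_with_repeats`, the column names, and the half-open `[start, end)` interval convention are all hypothetical.

```python
import pandas as pd


def filter_segments_with_repeats(segments: pd.DataFrame, repeats: pd.DataFrame) -> pd.DataFrame:
    """Keep only segments that overlap at least one annotated repeat (hypothetical helper)."""
    # Pair every segment with every repeat on the same chromosome.
    # Note: this many-to-many merge can get large on real genomes.
    pairs = segments.reset_index().merge(
        repeats, on="chromosome", suffixes=("", "_repeat")
    )
    # Two half-open intervals overlap when each starts before the other ends.
    overlaps = (pairs["start_repeat"] < pairs["end"]) & (pairs["end_repeat"] > pairs["start"])
    keep = pairs.loc[overlaps, "index"].unique()
    return segments.loc[segments.index.isin(keep)]


# Toy data (made-up coordinates): the middle segment contains no repeat.
segments = pd.DataFrame(
    {"chromosome": ["chr1"] * 3, "start": [0, 2000, 4000], "end": [2000, 4000, 6000]}
)
repeats = pd.DataFrame(
    {"chromosome": ["chr1", "chr1"], "start": [100, 4500], "end": [300, 4700]}
)

kept = filter_segments_with_repeats(segments, repeats)
print(kept)
```

The whole comparison runs as vectorized column operations rather than a Python loop over segments, which is a large part of why pandas tends to be fast for this kind of filtering.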

williamstark01 commented 2 years ago

> BTW, pandas is really great: filtering out the subsequences with no repeats takes only 5 min on a laptop, so it makes sense!!!!

Yep, it's ultra optimized, very fast in most cases.