jmschrei / tfmodisco-lite

A lite implementation of tfmodisco, a motif discovery algorithm for genomics experiments.
MIT License

works with variable length sequences #46

Open akmorrow13 opened 8 months ago

akmorrow13 commented 8 months ago

This code accounts for cases where the sequences in the one-hot encodings have variable lengths. In these cases, we don't include seqlets that don't overlap any bp of a sequence. Note that this will still include seqlets that overhang the end of the sequence (i.e., a seqlet ATGCNNNNNNNN would be valid).
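To illustrate the rule described above, here is a minimal sketch, not the code in this PR. It assumes shorter sequences are padded to a common length with all-zero rows (shown as `N`), and the helpers `one_hot_encode` and `overlaps_any_base` are hypothetical names made up for this example rather than part of tfmodisco-lite.

```python
import numpy as np


def one_hot_encode(seq, alphabet="ACGT"):
    """Encode a string as a (length, 4) array.

    Assumption: any character outside the alphabet (e.g. 'N' padding)
    becomes an all-zero row.
    """
    out = np.zeros((len(seq), len(alphabet)), dtype=np.int8)
    for i, ch in enumerate(seq):
        j = alphabet.find(ch)
        if j >= 0:
            out[i, j] = 1
    return out


def overlaps_any_base(one_hot, start, end):
    """Return True if the window [start, end) covers at least one real bp."""
    return bool(one_hot[start:end].any())


# A 4 bp sequence padded out to length 16.
padded = one_hot_encode("ATGC" + "N" * 12)

# Hypothetical seqlet coordinates as (start, end) pairs.
candidates = [(0, 12), (2, 14), (4, 16)]
kept = [(s, e) for s, e in candidates if overlaps_any_base(padded, s, e)]
```

In this sketch the first two candidates are kept because they cover at least one real base, even though they overhang into the padding, while `(4, 16)` is dropped because it lies entirely in the padded region.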