frederikkemarin / BEND

Benchmarking DNA Language Models on Biologically Meaningful Tasks
BSD 3-Clause "New" or "Revised" License

add `remove_cls_token` flag. #12

Closed fteufel closed 1 year ago

fteufel commented 1 year ago

This adds a `remove_cls_token` argument to `DNABertEmbedder` and `NucleotideTransformerEmbedder`. The argument defaults to `True`.

DNABert previously kept the CLS token in its output by default, so this changes its default behaviour. Check that all downstream uses of DNABert embeddings still work before merging.
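
To illustrate what a flag like this could do, here is a minimal sketch. It is not the code from this PR: `strip_cls_token` is a hypothetical helper, and it assumes embeddings are a NumPy array of shape `(batch, tokens, hidden)` with the CLS token at position 0.

```python
import numpy as np


def strip_cls_token(embeddings: np.ndarray, remove_cls_token: bool = True) -> np.ndarray:
    """Optionally drop the first token position from a batch of embeddings.

    Assumption (not taken from the BEND source): `embeddings` has shape
    (batch, tokens, hidden) and the CLS token sits at index 0 on the token axis.
    """
    if remove_cls_token:
        # Slice off position 0 along the token axis; all other tokens are kept.
        return embeddings[:, 1:, :]
    return embeddings


# Toy batch: 2 sequences, 4 token positions (CLS + 3 tokens), hidden size 3.
toy = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
print(strip_cls_token(toy).shape)
print(strip_cls_token(toy, remove_cls_token=False).shape)
```

Downstream code that indexes embeddings by token position would see every index shift by one when the flag is on, which is the kind of breakage the check before merging is meant to catch.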

fteufel commented 1 year ago

Support upsampling of BERT embeddings to match original sequence length.
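
One way to read "upsampling" here is repeating each token embedding once per nucleotide that the token covers, so the output has one row per base. The sketch below shows that idea only. It is an assumption rather than the method used in this PR, and `upsample_embeddings` and `token_lengths` are hypothetical names.

```python
import numpy as np


def upsample_embeddings(token_embeddings, token_lengths):
    """Repeat each token embedding to cover the nucleotides it spans.

    Assumptions (illustrative, not taken from the BEND source):
    - `token_embeddings` has shape (tokens, hidden), with special tokens removed.
    - `token_lengths[i]` is the number of nucleotides covered by token i,
      and tokens do not overlap.
    """
    token_embeddings = np.asarray(token_embeddings)
    token_lengths = np.asarray(token_lengths, dtype=int)
    if token_embeddings.shape[0] != token_lengths.shape[0]:
        raise ValueError("need exactly one length per token embedding")
    # np.repeat with a per-row repeat count duplicates row i token_lengths[i] times.
    return np.repeat(token_embeddings, token_lengths, axis=0)


# A 15-nucleotide sequence split into tokens covering 6, 6, and 3 bases.
tokens = np.random.rand(3, 4)
per_base = upsample_embeddings(tokens, [6, 6, 3])
print(per_base.shape)
```

Tokenizers that produce overlapping k-mers would need a different scheme, since the per-token lengths would no longer sum to the sequence length.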