igemmcmaster / genome-transformer

Pretrained efficient transformers on genomes -- WIP

Make tokenizer for gene sequences #2

Open Lev1ty opened 3 years ago

Lev1ty commented 3 years ago

Create a tokenizer for base pairs of gene sequences.
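
For illustration, a minimal sketch of what a base-level tokenizer could look like. The thread does not specify a vocabulary, so the special tokens (`[PAD]`, `[UNK]`), the inclusion of `N` for an ambiguous base, and the function names `encode` and `decode` are all assumptions rather than the project's actual design.

```python
from typing import List

# Hypothetical vocabulary (an assumption, not from the issue):
# special tokens first, then the four DNA bases plus N for an ambiguous base.
VOCAB = ["[PAD]", "[UNK]", "A", "C", "G", "T", "N"]
TOKEN_TO_ID = {token: index for index, token in enumerate(VOCAB)}


def encode(sequence: str) -> List[int]:
    """Map each base in the sequence to an integer ID (one token per base)."""
    unk_id = TOKEN_TO_ID["[UNK]"]
    # Upper-case first so soft-masked (lower-case) bases map to the same IDs.
    return [TOKEN_TO_ID.get(base, unk_id) for base in sequence.upper()]


def decode(ids: List[int]) -> str:
    """Map integer IDs back to their token strings and join them."""
    return "".join(VOCAB[i] for i in ids)
```

Calling `encode("acgt")` yields one ID per base, and `decode` reverses the mapping for any in-vocabulary sequence. A k-mer or learned subword vocabulary would be an alternative if single-base tokens make sequences too long.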

Lev1ty commented 3 years ago

Gather wet lab expertise on position embeddings

Lev1ty commented 3 years ago

Deliverables

Encode base pairs

Encode position of base pairs
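
As a starting point for the position deliverable, here is a sketch of the fixed sinusoidal position encoding used in the original Transformer. The issue does not say which scheme is intended, and the wet lab input mentioned above may point to something different, so treat the scheme itself and the name `sinusoidal_positions` as assumptions.

```python
import math
from typing import List


def sinusoidal_positions(seq_len: int, dim: int) -> List[List[float]]:
    """Return a seq_len x dim table of sinusoidal position encodings.

    Even columns hold sin(pos / 10000^(i / dim)) and the following odd
    column holds the matching cosine, where i is the even column index.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even so sin/cos columns pair up")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))  # column i
            row.append(math.cos(angle))  # column i + 1
        table.append(row)
    return table
```

Row `pos` of the table would be added to the embedding of the base at index `pos`. Learned or relative position embeddings are equally plausible choices here.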