Closed amssljc closed 1 month ago
In most cases, if your data distribution is not very different from our training data (reference genomes from GenBank), using the BPE tokenizer should be fine.
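If you want a rough way to check how far your data is from a reference sample, one option is to compare k-mer frequency profiles. The sketch below is only an illustration, not something from the model authors: the function names `kmer_freqs` and `js_divergence`, the choice of `k=6`, and the use of Jensen-Shannon divergence are all assumptions, and no particular threshold is implied by the reply above.

```python
from collections import Counter
from math import log2


def kmer_freqs(seqs, k=6):
    """Return relative frequencies of k-mers over A/C/G/T (hypothetical helper)."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            # Skip windows containing N or other ambiguity codes.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {km: c / total for km, c in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency dicts, in [0, 1]."""
    jsd = 0.0
    for key in set(p) | set(q):
        pi, qi = p.get(key, 0.0), q.get(key, 0.0)
        m = 0.5 * (pi + qi)
        if pi > 0:
            jsd += 0.5 * pi * log2(pi / m)
        if qi > 0:
            jsd += 0.5 * qi * log2(qi / m)
    return jsd


# Usage: replace these toy lists with your sequences and a reference sample.
my_seqs = ["ACGTACGTACGTACGT"]
ref_seqs = ["ACGTACGTACGTTTGA"]
score = js_divergence(kmer_freqs(my_seqs, k=3), kmer_freqs(ref_seqs, k=3))
```

A score near 0 means the two k-mer profiles are close, and a score near 1 means they barely overlap. It is only a coarse proxy for "data distribution", so treat it as a sanity check rather than a guarantee about tokenizer quality.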