HazyResearch / hyena-dna

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
https://arxiv.org/abs/2306.15794
Apache License 2.0
590 stars 83 forks

Questions about pre-training with multiple sequences #69

Open HelloWorldLTY opened 5 months ago

HelloWorldLTY commented 5 months ago

Hi, I wonder whether it is possible to pretrain this model on several different sequences (different .fa files from different species). The README has short descriptions of this but no instructions. Can I directly replace the hg38.ml.fa file in the sequence folder with my own set of .fa files? Thanks.

HelloWorldLTY commented 5 months ago

Is my requirement the same as the multi-species training dataset? Thanks.
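
For readers with the same question, one possible preparation step is to merge the per-species .fa files into a single FASTA file, prefixing each record name with its species so that names such as chr1 do not collide. The snippet below is a minimal sketch of that merge only, using the Python standard library. The function names `read_fasta` and `merge_fastas`, the `species|record` naming scheme, and the file names in the usage note are all hypothetical and not part of the hyena-dna repository. Whether a merged file works as a drop-in replacement for hg38.ml.fa is not confirmed in this thread; any interval or split files that refer to record names would presumably need to be regenerated to match.

```python
from pathlib import Path
from typing import Dict, Iterator, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from a plain-text FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    yield name, "".join(chunks)
                parts = line[1:].split()
                # Keep only the first token of the header as the record name.
                name, chunks = (parts[0] if parts else ""), []
            else:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def merge_fastas(species_to_path: Dict[str, Path], out_path: Path, width: int = 60) -> int:
    """Write every record to out_path, renamed to '<species>|<record>'.

    Hypothetical helper (not part of hyena-dna). Returns the number of
    records written.
    """
    count = 0
    with open(out_path, "w") as out:
        for species, path in species_to_path.items():
            for name, seq in read_fasta(path):
                out.write(f">{species}|{name}\n")
                # Wrap the sequence at a fixed line width.
                for i in range(0, len(seq), width):
                    out.write(seq[i:i + width] + "\n")
                count += 1
    return count
```

A call might look like `merge_fastas({"human": Path("human.fa"), "mouse": Path("mouse.fa")}, Path("combined.fa"))`, where the file names are placeholders. Note that `read_fasta` holds one full record in memory at a time, which may be heavy for whole chromosomes.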