In the paper you state: "In order to facilitate further research on large-scale genome foundational models, we have collated and made available multi-species genome datasets for both pre-training of models (Sec. 4.1) and benchmarking (Sec. 4.2)."
However, I cannot find where these datasets are hosted. I have looked on both Hugging Face and your GitHub.
Have I overlooked them somewhere?