sangmichaelxie / doremi

PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
https://arxiv.org/abs/2305.10429
MIT License
286 stars 32 forks

List of pinned requirements / Dockerfile? #19

Closed filipg7777 closed 7 months ago

filipg7777 commented 8 months ago

I'm struggling to replicate DoReMi training: I get weird errors, probably due to incompatibilities between dependencies. Is there a list of pinned requirements (in particular, which version of flash_attn should be used?) or, preferably, a Dockerfile for easy reproduction?

filipg7777 commented 8 months ago

Sorry, I didn't notice that flash_attn should be v2.0.0. Anyway, the main question remains.

sangmichaelxie commented 8 months ago

The requirements should be here: https://raw.githubusercontent.com/sangmichaelxie/doremi/main/setup.py
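
For anyone debugging a similar mismatch, here is a minimal sketch of how the installed versions could be compared against the expected pins. The only pin taken from this thread is flash_attn 2.0.0. The distribution name `flash-attn`, the `EXPECTED` dict, and the `check_pins` helper are assumptions for illustration, not part of the repo. Any further entries would need to be copied by hand from setup.py.

```python
from importlib.metadata import PackageNotFoundError, version

# Only the flash_attn pin (2.0.0) comes from this thread. "flash-attn" is the
# assumed distribution name; add further pins from setup.py as needed.
EXPECTED = {"flash-attn": "2.0.0"}


def check_pins(expected):
    """Return {name: (wanted, installed)} for each package that does not match.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Exact string comparison: a build such as "2.0.0.post1" counts as a mismatch.
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_pins(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

Running this inside the training environment prints one line per package whose installed version differs from the pin, and prints nothing when everything matches.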