Lisennlp opened 8 months ago
Thank you for the interest! Our JAX training code is largely based on T5x (https://github.com/google-research/t5x), so there is not much we can change on that side. (I personally find it fairly easy to use; the T5x documentation may help.) If you are interested in using PyTorch instead, we have a demo here (https://colab.research.google.com/drive/1xIfIVafnlCP2XVICmRwkUFK3cwTJYjCY). You can try it by running something like: https://github.com/XueFuzhao/OpenMoE?tab=readme-ov-file#inference-with-pytorch
Hope this helps :)
Thanks for the reply. I followed the TPU section of the README:
```sh
git clone https://github.com/XueFuzhao/OpenMoE.git
bash OpenMoE/script/run_pretrain.sh
```
As I understand it, the configuration file is t5x/t5x/examples/t5/t5_1_1/examples/openmoe_large.gin.
Therefore, I modified the sentencepiece.model path inside it, but the path you set (gs://fuzhao/tokenizers/umt5.256000/sentencepiece.model) still shows up at runtime.
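One possible cause (an assumption on my part, not something confirmed for OpenMoE): gin applies command-line overrides after reading the gin file, so if run_pretrain.sh passes its own `--gin.…` flags, they can shadow an edit made in openmoe_large.gin. A minimal sketch of overriding the vocabulary path directly on the command line, following stock T5x conventions; the exact binding name should be copied from whatever openmoe_large.gin actually uses:

```sh
# Hypothetical T5x launch fragment -- not the author's exact script.
# Adjust the gin file path and binding name to match openmoe_large.gin.
python3 -m t5x.train \
  --gin_file="t5x/examples/t5/t5_1_1/examples/openmoe_large.gin" \
  --gin.seqio.SentencePieceVocabulary.sentencepiece_model_file="'/path/to/your/sentencepiece.model'" \
  ...
```

Note the nested quoting (`"'…'"`): gin string values passed on the command line need their own quotes inside the shell quotes.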
In addition, if I want to switch the training data, for example to C4, directly changing MIXTURE_OR_TASK_NAME='c4' does not work.
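As I understand T5x, MIXTURE_OR_TASK_NAME must name a task or mixture that is already registered with seqio, so a bare string like 'c4' fails unless a task by that name exists in the registry. A sketch under that assumption, using the standard C4 span-corruption task name from the t5 library (I have not confirmed this is what OpenMoE's gin expects):

```sh
# Hypothetical override -- 'c4_v220_span_corruption' is the stock C4
# pretraining task registered by the t5 library; that task module must
# be importable at launch time for the name to resolve.
python3 -m t5x.train \
  --gin_file="t5x/examples/t5/t5_1_1/examples/openmoe_large.gin" \
  --gin.MIXTURE_OR_TASK_NAME="'c4_v220_span_corruption'" \
  ...
```

For a fully custom dataset, the task would first have to be registered via seqio.TaskRegistry.add in a module that the training script imports.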
For people who are not familiar with this codebase, a more detailed README would be very helpful. I hope the authors can consider it. Thank you.
I am very interested in this work, but I found the code structure very deeply nested: many configs are buried several layers down and hard-coded, which makes it difficult to run.