-
I admire your work and would like to build on it. Will you make the pre-training code and training dataset public?
-
Can you provide the pre-training parameter file of the model?
-
- [x] Make an account on GitHub and write a comment on this ticket.
- [x] Set up a Linux environment with a GPU (Ubuntu 22.04 is recommended)
- [x] Install the git command in the Linux environment and clone …
-
Why is it that, after downloading the pre-trained model and starting training on the CSL-Daily dataset, the message “!!!! pred_b: can not be converted, got head.projection.bi…
-
Thank you very much for your open-source code. May I ask whether the weight parameters obtained for the decoder and projector sections during pre-training are available?
-
Hi, I found a literature review that categorised your model as "Self-supervised generative", but as I understood your article, you are using labels during pre-training (so basically the s…
-
Fine-tuning the `ai-forever/mGPT-1.3B-azerbaijan` model on your extensive Azerbaijani corpus is an excellent way to enhance the model’s capability in Azerbaijani. Since you’re looking to do this in an…
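A common first step when fine-tuning a causal LM such as mGPT on a large raw-text corpus is tokenizing the documents and packing them into fixed-length training blocks. A minimal, library-free sketch of that packing step, assuming a toy whitespace "tokenizer" as a stand-in for the real mGPT subword tokenizer:

```python
# Sketch: pack a raw-text corpus into fixed-length blocks for causal-LM
# fine-tuning. The whitespace "tokenizer" below is a placeholder for a
# real subword tokenizer (e.g. mGPT's) -- an illustrative assumption.

BLOCK_SIZE = 8  # real setups typically use 512-2048 tokens

def tokenize(text):
    # Placeholder: map each whitespace-separated word to a fake token id.
    return [hash(w) % 50000 for w in text.split()]

def pack_corpus(documents, block_size=BLOCK_SIZE):
    """Concatenate all token ids and split into equal-sized blocks,
    dropping the ragged remainder (the usual group-texts trick)."""
    ids = []
    for doc in documents:
        ids.extend(tokenize(doc))
    n_blocks = len(ids) // block_size
    return [ids[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

corpus = [
    "first document in the training corpus",
    "second document with a few more words than the first one",
]
blocks = pack_corpus(corpus)
print(len(blocks), all(len(b) == BLOCK_SIZE for b in blocks))  # -> 2 True
```

Dropping the remainder keeps every batch the same length without padding; the trade-off is losing a few tokens at the end of the corpus, which is negligible for an extensive corpus.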
-
Hello, I'd like to conduct pre-training on the st-mem model using the same dataset. I have all the datasets ready. My question pertains to the contents of st_mem/configs/pretraining.yaml. Could you pl…
-
## Acceptance Criteria
- [x] Provide more representative file list to [David Reiss](https://dsva.slack.com/archives/C04KW0B46N5/p1718746300362559?thread_ts=1718652379.182899&cid=C04KW0B46N5)
- [ ] C…
-
Hi Team,
It is an amazing handbook. In the continued pre-training script (`run_cpt.py`), I saw that it is not using the "mlm" (Masked Language Model) parameter in the training process. I thought that the …
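For context, the distinction the question points at can be sketched without any libraries: continued pre-training of a GPT-style model uses a causal-LM objective (labels are a copy of the inputs, shifted inside the model), whereas `mlm=True` would randomly mask tokens and compute the loss only on the masked positions. A minimal illustration, with the mask-token id and masking rule as assumptions:

```python
import random

MASK_ID = 0  # placeholder mask-token id (assumption for illustration)

def causal_lm_labels(input_ids):
    """Causal LM (mlm=False): labels are a copy of the inputs; the model
    shifts them internally so position t predicts token t+1."""
    return list(input_ids)

def mlm_labels(input_ids, mask_prob=0.15, seed=0):
    """MLM (mlm=True): randomly replace ~mask_prob of the inputs with
    MASK_ID and predict only those positions (-100 = ignored by loss)."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in input_ids:
        if rng.random() < mask_prob:
            masked.append(MASK_ID)
            labels.append(tok)    # loss is computed at this position
        else:
            masked.append(tok)
            labels.append(-100)   # loss is ignored at this position
    return masked, labels

ids = [11, 22, 33, 44, 55]
print(causal_lm_labels(ids))      # identical ids -> next-token objective
masked, labels = mlm_labels(ids)
print(masked, labels)
```

Since a GPT-style decoder is trained left-to-right, the causal objective is the natural choice for continued pre-training, which would explain why `run_cpt.py` has no use for an `mlm` flag.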