-
Hello,
I read the code & paper and found that the PLM is trainable in the pretraining stage. Did you try freezing the PLM and training only the text projection layer in the pretraining stage? If you have tried, h…
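For concreteness, this is the kind of setup the question is about; a minimal PyTorch sketch, where `plm` and `text_proj` are placeholder module names rather than the repo's actual attributes:

```python
import torch
import torch.nn as nn

# Toy stand-ins; substitute the repo's actual PLM and projection modules.
class TwoTowerModel(nn.Module):
    def __init__(self, hidden=768, proj_dim=256):
        super().__init__()
        self.plm = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.text_proj = nn.Linear(hidden, proj_dim)

    def forward(self, x):
        return self.text_proj(self.plm(x))

model = TwoTowerModel()

# Freeze the PLM: no gradients flow into its weights.
for p in model.plm.parameters():
    p.requires_grad = False
model.plm.eval()  # keep dropout / norm behavior fixed for the frozen part

# Optimize only the text projection layer.
optimizer = torch.optim.AdamW(model.text_proj.parameters(), lr=1e-4)

x = torch.randn(4, 16, 768)    # (batch, seq, hidden) dummy text features
loss = model(x).pow(2).mean()  # dummy loss just to show the backward pass
loss.backward()
optimizer.step()
```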
-
Thanks for the great paper and code! I have a query about the MoCo v3 encoder: the paper mentions that the latent representations are regularized on a hypersphere. I am fairly new to MoCo v3, can yo…
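For anyone else wondering: my understanding (please correct me) is that this just means the embeddings are L2-normalized to unit length before the contrastive loss, so they all lie on the unit hypersphere. A minimal sketch, not the authors' code:

```python
import torch
import torch.nn.functional as F

# Toy query/key embeddings standing in for the two encoder outputs.
q = torch.randn(8, 256)
k = torch.randn(8, 256)

# L2 normalization projects every row onto the unit hypersphere (||q_i|| = 1),
# so the dot products below are cosine similarities.
q = F.normalize(q, dim=1)
k = F.normalize(k, dim=1)

tau = 0.2                          # temperature hyperparameter (assumed value)
logits = q @ k.t() / tau           # pairwise cosine similarities, scaled
labels = torch.arange(q.size(0))   # positives sit on the diagonal
loss = F.cross_entropy(logits, labels)  # InfoNCE-style contrastive loss
```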
-
### 📚 The doc issue
I want to continue pretraining Llama 2 with my own domain data, which is about 1 billion tokens. To avoid catastrophic forgetting, I should mix in some of the original pretraining data. But Meta …
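Since the original corpus is not public, one common workaround is replay with a public general-domain corpus. A sketch with Hugging Face `datasets`, where the C4 stand-in, the file name, and the 5% replay ratio are all illustrative assumptions, not an official recipe:

```python
from datasets import load_dataset, interleave_datasets

# Domain corpus (illustrative file name) plus a streamed public corpus
# standing in for the unavailable original pretraining data.
domain = load_dataset("json", data_files="my_domain_corpus.jsonl", split="train")
general = load_dataset("allenai/c4", "en", split="train", streaming=True)

# Sample ~95% domain / ~5% replay; the ratio is a tunable assumption.
mixed = interleave_datasets(
    [domain.to_iterable_dataset(), general],
    probabilities=[0.95, 0.05],
    seed=42,
)
```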
-
**Is your feature request related to a problem? Please describe.**
`_gen_json_object` dictates the key order of the generated JSON object.
**Describe the solution you'd like**
Instead, allow the LLM to dictate it, reducing perpl…
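A conceptual sketch of what that could look like; every name here (`lm`, `key_logprob`, `gen_value`) is a hypothetical helper for illustration, not the library's actual API:

```python
def gen_json_object_model_order(lm, prompt, keys, key_logprob, gen_value):
    """Hypothetical sketch: the model, not the template, picks which
    remaining key to emit next, so generation follows its preferred order."""
    remaining, parts = set(keys), []
    prefix = prompt + "{"
    while remaining:
        # Let the LLM dictate the order: score each remaining key under the
        # current prefix and pick the most likely (lowest-perplexity) one.
        key = max(remaining, key=lambda k: key_logprob(lm, prefix, f'"{k}":'))
        value = gen_value(lm, prefix, key)   # value generation is unchanged
        parts.append(f'"{key}": {value}')
        prefix = prompt + "{" + ", ".join(parts) + ", "
        remaining.remove(key)
    return "{" + ", ".join(parts) + "}"
```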
-
Hello,
Do you support finetuning with LoRA?
Or can I directly use your training code for finetuning? (I am afraid the model would forget too much of the knowledge from the pretraining phase.)
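For reference, this is the kind of setup I'm asking about; a generic sketch with Hugging Face PEFT, where the model id and the `target_modules` names are placeholders that depend on the actual architecture:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("base-model-name")  # placeholder id

config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # attention projections (placeholder)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # base weights stay frozen; only adapters train
```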
B…
-
Pose a question about one of the following articles:
“[Online images amplify gender bias](https://www.nature.com/articles/s41586-024-07068-x),” 2024. Guilbeault, Douglas, Solène Delecourt, Tasker …
-
● Recent works that leverage large-scale image-text pair pretraining, such as CLIP, show promising performance in classification, segmentation, and depth estimation (see the zero-shot sketch after this list).
● How to transfer the pretrai…
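To illustrate the transfer the first point refers to, a minimal zero-shot classification sketch with a public CLIP checkpoint via Hugging Face `transformers` (the image path and label prompts are placeholders):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")                    # placeholder image path
labels = ["a photo of a cat", "a photo of a dog"]    # prompt-style class names

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image        # image-text similarities
probs = logits.softmax(dim=-1)                       # zero-shot class probabilities
```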
-
I'm trying to run pretraining with ResNet-50 on my data and am running into out-of-memory issues.
Initially, I was using two V100s (32 GB each), and the maximum batch size I could reach was 256…
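In case it's useful context for an answer, two standard memory workarounds, sketched below with an assumed micro-batch of 64 (mixed precision plus gradient accumulation; not this repo's trainer code):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50().cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()  # fp16 roughly halves activation memory
accum_steps = 4                       # 64 x 4 = effective batch size of 256

# Dummy stand-in for the real DataLoader.
loader = [(torch.randn(64, 3, 224, 224), torch.randint(0, 1000, (64,)))
          for _ in range(8)]

optimizer.zero_grad()
for step, (images, targets) in enumerate(loader):
    with torch.cuda.amp.autocast():
        loss = criterion(model(images.cuda()), targets.cuda()) / accum_steps
    scaler.scale(loss).backward()     # gradients accumulate across micro-batches
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)        # one update per effective batch of 256
        scaler.update()
        optimizer.zero_grad()
```

One caveat: if the pretraining objective is contrastive, gradient accumulation does not recover the larger in-batch negative set, so it is not an exact substitute for a true batch of 256.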
-
Here are some ideas and potential areas of research for Tensort:
- Model analysis and interpretability: Develop new techniques for analyzing and understanding what large language models have learned …
-
## Jiphyeonjeon Latest Papers Study Group
- Sunday, May 29, 2022, 10:00
- Presented by 김상원, 남창현, and 김영석
- Paper link: https://arxiv.org/abs/2203.15827
> ### Abstract
> Language model (LM) pretraining can learn various knowledge from text corpor…