-
I'm having trouble training bge-m3.
Accordingly, I would like to ask you a few questions.
1. m3 training code
I am training the bge-m3 model on a server with 8 × H100 (80GB) GPUs.
Below is the learning s…
-
### 🔎 Search before asking
- [X] I have searched the PaddleOCR [Docs](https://paddlepaddle.github.io/PaddleOCR/) and found no similar bug report.
- [X] I have searched the PaddleOCR [Issues](https…
-
### Describe the question you meet
I use the CWD method. When ResNet-50 is used to distill ResNet-18, the teacher network's training accuracy is 80%, but the network accuracy after distillation…
-
https://proceedings.mlsys.org/paper_files/paper/2023/file/523f87e9d08e6071a3bbd150e6da40fb-Paper-mlsys2023.pdf
-
- [ ] [I'm the author of the GPT-2 work. This is a nice post, thanks for making it more... | Hacker News](https://news.ycombinator.com/item?id=39436215)
I'm the author of the GPT-2 work. Thi…
-
- [ ] [LoRA Land: Fine-Tuned Open-Source LLMs that Outperform GPT-4 - Predibase - Predibase](https://predibase.com/blog/lora-land-fine-tuned-open-source-llms-that-outperform-gpt-4)
# LoRA Land: Fine…
-
- [ ] [sentence-transformers/README.md at master · liuyukid/sentence-transformers](https://github.com/liuyukid/sentence-transformers/blob/master/README.md?plain=1)
# sentence-transformers/README.md a…
-
# ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
2023 Workshop on Computational Approaches to Subjectivity, Sentiment
“oxymoron” Despite being fun to interact …
-
Hi @nreimers, Hi Sentence-transformers community,
First of all, I want to thank you for your continued support throughout the years. I have been following this repository for three years now and I'…
-
Hi everyone,
This is a small question related to how models are fine-tuned during the first step of training. I see that the default loss function is `losses.CosineSimilarityLoss`. But when generat…