-
track
-
-
# sh training/finetune_Pythia-Chat-Base-7B.sh
Namespace(use_cuda=True, cuda_id=0, cuda_num=1, debug_mem=True, dist_backend='cupy_nccl', dp_backend='nccl', dist_url='tcp://127.0.0.1:7033', world_size=…
-
Context:
@snat-s has done great work with the analysis of various data that may be relevant to Neko.
We should now, with the input of the team, finalize our proposed V0 dataset, justify its
Output: doc…
-
### News
- Conference news
- [CHI 2023](https://chi2023.acm.org/): Hamburg, Germany, April 23-28
- [ICLR 2023](https://iclr.cc/): Kigali, Rwanda (ahh), May 1-5
- Google DeepMind!!!
- Google Brain and DeepMind as one team…
-
- [ ] [blog/mteb.md at main · huggingface/blog](https://github.com/huggingface/blog/blob/main/mteb.md?plain=1)
# Title: blog/mteb.md at main · huggingface/blog
**Description:**
"---
title: "MTEB: …
-
- [ ] [cohere-ai/quick-start-connectors: This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and businesses to perform seamle…
-
Hello, I am using the sample test set and trying to run through the README. However, I found that training hangs and then times out:
[batch=23/3200]:
Train time/batch: 22
Train time/sample: 198
Train time/batch_in_epoch: 6
Train time/sample_in_e…
-
Hi, in the paper you said “we use 4096 samples from RedPajama with a context length of 2048”. Is that enough
for QAT?
-
100%|███████████████████████████████████████| 933M/933M [01:59 [18](https://file+.vscode-resource.vscode-cdn.net/home/mraway/Desktop/src/open_flamingo/~/.cache/huggingface/modules/transformers_modules…