-
Hello,
According to our discussion [here](https://github.com/Lightning-AI/lit-llama/issues/330#issuecomment-1567376696), I think `devices` should be changed in the [pretraining code](https://githu…
-
Note, 4.1.2023: During this research effort I've been browsing, reviewing, visiting, revisiting, and studying a huge number of articles and concepts, linked by association while browsing, for feedin…
-
Hey, thank you for making this data set available to the community.
I'm wondering how you estimated the token counts in the table in the README and the blog post. In particular, do you have the corres…
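For context on what such an estimate usually involves: token counts for a large corpus are often extrapolated from a sample rather than computed exactly. A minimal sketch of that approach — the sample documents, corpus size, and stand-in tokenizer below are purely illustrative, not the dataset's actual numbers or tokenizer:

```python
# Rough token-count estimate: tokenize a sample, derive a bytes-per-token
# ratio, and extrapolate to the full corpus byte count.

def estimate_tokens(sample_docs, tokenize, total_bytes):
    sample_bytes = sum(len(d.encode("utf-8")) for d in sample_docs)
    sample_tokens = sum(len(tokenize(d)) for d in sample_docs)
    bytes_per_token = sample_bytes / sample_tokens
    return int(total_bytes / bytes_per_token)

# Stand-in tokenizer: whitespace split. A real estimate would use the
# dataset's actual tokenizer (e.g. the LLaMA tokenizer).
sample = ["the quick brown fox", "jumps over the lazy dog"]
print(estimate_tokens(sample, str.split, total_bytes=1_000_000))
```

The accuracy of the extrapolation depends entirely on how representative the sample is of the whole corpus.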
-
I'm training tinyllama with 8 A40s.
Everything goes very smoothly until I try to increase the micro batch size for a better computation-to-communication ratio.
I follow the official tutorial of lit …
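For readers unfamiliar with the setup: in lit-gpt-style pretraining scripts the global batch is typically assembled from micro batches via gradient accumulation, so raising the micro batch size reduces the accumulation steps (fewer, larger forward/backward passes per optimizer step) at the cost of per-device memory. A rough sketch of the arithmetic, with made-up numbers rather than the tutorial's actual config:

```python
# Relation between global batch size, micro batch size, and gradient
# accumulation steps in a typical multi-GPU setup (illustrative numbers).

def grad_accum_steps(global_batch_size, micro_batch_size, devices):
    per_step = micro_batch_size * devices
    if global_batch_size % per_step != 0:
        raise ValueError("global batch must divide evenly into micro batches")
    return global_batch_size // per_step

# e.g. 8 GPUs: doubling the micro batch halves the accumulation steps,
# improving the compute-to-communication ratio but raising peak memory.
print(grad_accum_steps(global_batch_size=512, micro_batch_size=4, devices=8))
print(grad_accum_steps(global_batch_size=512, micro_batch_size=8, devices=8))
```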
-
### Describe the feature
I found that both examples truncate text longer than max_length, so we have to segment long text into shorter pieces. For examples/language/llama2, the code is:
```
…
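# (Sketch only, not the repo's code.) Since both examples truncate at
# max_length, long documents can be pre-segmented into overlapping windows
# of token ids before being handed to the dataset. Names here (`segment`,
# the stride choice) are hypothetical, for illustration.

def segment(token_ids, max_length, stride):
    # Slide a window of max_length tokens with the given stride so adjacent
    # chunks share context instead of losing everything past max_length.
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + max_length])
        if start + max_length >= len(token_ids):
            break
    return chunks

print(segment(list(range(10)), max_length=4, stride=3))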
-
## ❓ General Questions
Thank you so much for this amazing project -- it's a complete game-changer when it comes to running LLMs locally. I'd like to make this available to my non-technical colleague…
-
Hi everyone, I'm trying to run cc_net with this command: `python -m cc_net --dump 2023-06 --task_parallelism 20 --num_shards 5000 -l en --mine_num_processes 20 --hash_in_mem 1`. But the invalid argumen…
-
`paged_adamw_32bit` works perfectly, though. I have tried this on multiple models and multiple datasets, but all of them lead to the same observation.
Below are results with the redpajama_3B_base model and on the same…
-
It's too tough to get started.
May I ask whether there is any tutorial or example for this project?
For example, how can I get the PyTorch ET from a cluster, and how can I convert it to a Chakra ET?
How to visuali…
-