micro-batches Search Results

1000+ results for micro-batches

meta-llama/llama-models #78

A doubt about training infrastructure: Why more micro-batche…

Thank you for the detailed report for Llama3.1, which is very inspirational. I read the report and have a doubt about training infrastructure. In chapter 3.3.2 titled _Parallelism for Model Scaling_.…

eileenzhujuan updated 1 month ago
2 comments

allenai/OLMo #699

slurm script for: configs/official/OLMo-7B.yaml

❓ The question: Do you know the Slurm script for configs/official/OLMo-7B.yaml? Looking for a multi-node Slurm script.

andymvp2018 updated 1 month ago
3 comments

kakaobrain/torchgpipe #35

The same batch size, different micro batches, the algorithm …

🐞 Bug: With the same batch size but different numbers of micro-batches, the algorithm's results are inconsistent. I have fixed the random seed and set chunks to 2 or 4. Code that reproduces (Python): imp…

Kurama622 updated 4 months ago
3 comments

tensorflow/lingvo #256

How to run microbatches on different gpus?

Hi, I've been experimenting with GPipe and was wondering if it is possible to run different micro-batches on different GPUs? For example, if there are 16 micro-batches, is it possible to run 8 micro…

adis98 updated 3 years ago
3 comments

microsoft/DeepSpeed #1051

Dynamic/variable batch size support

For the model I am training, I am relying on a custom [Sampler](https://pytorch.org/docs/stable/data.html#torch.utils.data.Sampler), that returns variable batch sizes. My task at hand is translation, …

ecly updated 2 weeks ago
17 comments

apache/datafusion-comet #808

Improve performance of broadcast hash join

What is the problem the feature request solves? Query (SQL): select ss_sold_date_sk, ss_sold_time_sk, ss_quantity, d_year, d_moy, d_dom from date_dim join store_sales on d_date_sk = ss_so…

andygrove updated 3 weeks ago
3 comments

hlorus/CAD_Sketcher #483

[BUG] - Add a Sketch Impossible to Select the workplane

Contact Details: _No response_. Description: When I select "Add a Sketch," the planes are displayed, but it is impossible to select one. There is no hover effect, and clickin…

IGLOU-EU updated 1 day ago
26 comments

microsoft/DeepSpeedExamples #924

AttributeError: 'DeepSpeedEngine' object has no attribute 'm…

https://github.com/microsoft/DeepSpeedExamples/blob/957ae3141946daf9a6bc5731e261032a13a82f05/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/main.py#L367 After training just one epoch, I got…

lovychen updated 1 week ago
1 comment

NVIDIA/NeMo-Aligner #131

SFT is broken with container 24.01.01

**Describe the bug** A user reported a crash with 24.01.01 and SFT (while things work fine with 24.01): File "/opt/NeMo-Aligner/examples/nlp/gpt/train_gpt_sft.py", line 215, in main in…

odelalleau updated 5 months ago
1 comment

josStorer/RWKV-Runner #273

What causes the train.py error that appears during LoRA fine-tuning?

The output is as follows: --load_model models/RWKV-5-1B5-one-state-slim-novel-tuned.pth --data_file ./finetune/json2binidx_tool/data/training staff_text_document --ctx_len 1024 --epoch_steps 800 --epoch_count 20 --epoch_…

Sakuranoame updated 7 months ago
3 comments
