micro-batches Search Results

1000+ results
for micro-batches

Best match

microsoft/DeepSpeed #673

A request for clarity around 3D Parallelism in DeepSpeed

Let's start by saying that, based on my reading of various papers, Model Parallelism (MP) is a very inconsistent term. One can slice vertically or horizontally. One can implement a naive slow version …

stas00 updated 3 years ago
5

dmlc/gluon-nlp #1016

Bug in Bert finetuning on glue

## Description In finetunine_classifier.py, the metric is updated each batch and the result is read after each epoch, in both train() and evaluate(). However, some settings are wrong: 1. We …

zburning updated 4 years ago
2

SaxyPandaBear/sodium-intake #1

Redesign the application

# The Problem There has to be a better way than having a Python script running 24/7 on an EC2 instance. Will need to look into different ways to get the data. The crux of the issue is how the da…

SaxyPandaBear updated 5 years ago
4

kbr-net/sdrive-max #53

Most newer SD cards does not initialize

I've found out the hard way that most newer micro SD cards report "SD card init failed" on the Sdrive, even cards of the same manufacturer/model/size that were working great a few months ago. Up to now P…

StmLord updated 10 months ago
15

Blealtan/RWKV-LM-LoRA #10

size mismatch for emb.weight: copying a param with shape tor…

When I run RWKV-LM-LoRA, I hit this error. The model I use: RWKV-4-Raven-1B5-v9-Eng99%-Other1%-20230411-ctx4096.pth. My command, run in WSL2: python3 train.py --load_model /home/wubo/chatRWKV…

miandui-WuBo updated 1 year ago
1

huggingface/nanotron #161

out of memory for continuing pretraining llama3-8B

I am trying to use the framework to continue pretraining llama3-8B. I have converted the HF checkpoint into nanotron format and the generated tokens seem reasonable. I use the following setting to…

ckzbullbullet updated 4 months ago
5

Lightning-AI/lightning-thunder #1248

[NeVa] [rank0]: TypeError: matmul(): argument 'input' (posit…

While preparing the benchmark for eager and dynamo using the code from the fork https://github.com/tfogal/NeMo, I get errors in the dynamo case. ## 🐛 Bug After fixing [1187](https://github.com/Ligh…

wprazuch updated 3 weeks ago
2

karpathy/nanoGPT #285

Why does batch size affect convergence?

It is my understanding from the videos that batch size should have no influence on convergence. But I have cases where increasing the batch size will lead to underfitting, or where decreasing the batc…

0dB updated 1 month ago
10

slingdata-io/sling-cli #146

StarRocks: Use FILES to bulk load

https://docs.starrocks.io/docs/sql-reference/sql-functions/table-functions/files/

flarco updated 8 months ago
2

qubole/streaminglens #5

StreamingLens Insights always showing "Streaming Query Stat…

Hi all, I am using StreamingLens in my Spark Structured Streaming application, but it always shows the same logs. BatchId is getting updated, but **Streaming Query State: NONEWBATCHES** remains the same.…

rpatid10 updated 3 years ago
11
