-
## User Interface
- [ ] Sail CLI (#245)
- [ ] Sail configuration
## Core Functionalities
- [x] Distributed processing setup (#244)
- [x] Distributed job stages and shuffle (#265)
- [ ] Worker …
-
Because, according to the code:
```python
if distributed:
    sampler = DistributedSampler(dataset)  # shuffling does not seem to be enabled here; shuffle presumably defaults to False
else:
    sampler = RandomSampler(dataset)
```
Also, thank you for your excellent work!
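For reference, in `torch.utils.data.distributed.DistributedSampler` the `shuffle` argument actually defaults to `True`; the piece that is easy to miss is calling `set_epoch` each epoch so the shuffle order changes. A minimal sketch, assuming `torch.distributed` is already initialized and using a hypothetical stand-in dataset:
```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Hypothetical stand-in for the project's real dataset.
dataset = TensorDataset(torch.arange(100).float())

# shuffle=True is already the default; shown explicitly for clarity.
sampler = DistributedSampler(dataset, shuffle=True)
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(3):
    # Without set_epoch, every epoch replays the same shuffled order.
    sampler.set_epoch(epoch)
    for batch in loader:
        pass  # training step goes here
```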
-
Hello,
It seems that the dataloader is not adapted to the distributed setting (line 881 of train.py).
The same data entries will be loaded and trained on repeatedly by different processes.
Maybe a sampler sho…
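A minimal sketch of the kind of fix being suggested here (hypothetical helper name; assumes the process group has already been initialized, e.g. via `init_process_group`):
```python
import torch.distributed as dist
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def build_loader(dataset, batch_size):
    # Shard the dataset across ranks so each process sees a
    # disjoint subset instead of the full dataset.
    sampler = DistributedSampler(
        dataset,
        num_replicas=dist.get_world_size(),
        rank=dist.get_rank(),
    )
    # sampler and shuffle are mutually exclusive in DataLoader,
    # so shuffling is delegated to the sampler.
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```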
-
**Describe the issue**:
The `p2p` shuffle algorithm fails when working with frames that contain columns of mixed numeric/string dtypes; this error occurs during computation when Arrow attempts to…
-
**The error message is as follows**
```
Traceback (most recent call last):
  File "/data1/bert4rec/bert4rec-main/scripts/bole/loaddata_run_product.py", line 5, in <module>
    config, model, dataset, train_data, valid_data, test_…
-
Subsequent Blockwise layers are currently fused into a single layer. This reduces the number of tasks and the scheduling overhead, and is generally a good thing to do. Currently, the fused output does not gener…
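To make the mechanism concrete, a small sketch of how blockwise fusion shrinks the task graph (illustrative only; exact task counts depend on the dask version and optimizer settings):
```python
import dask
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))  # 100 chunks
y = (x + 1) * 2  # two elementwise (Blockwise) layers on top of ones()

# Before optimization: roughly one task per chunk per layer.
print(len(dict(y.__dask_graph__())))

# After optimization the consecutive Blockwise layers are fused into
# a single layer, so the count drops to roughly one task per chunk.
(opt,) = dask.optimize(y)
print(len(dict(opt.__dask_graph__())))
```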
-
Occurred on https://github.com/dask/distributed/pull/6400, but I've seen it in other PRs as well:
https://github.com/dask/distributed/runs/6524459060?check_suite_focus=true
```
______________________…
-
### Context:
We are following the [FSDP example](https://github.com/aws-samples/awsome-distributed-training/tree/main/3.test_cases/10.FSDP) and trying to understand the mechanism behind how differe…
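For context, a minimal sketch of the FSDP wrapping that the example is built around (the model here is a hypothetical stand-in; environment variables are assumed to come from a launcher such as `torchrun`):
```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# torchrun supplies RANK/WORLD_SIZE/LOCAL_RANK environment variables.
dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

# Hypothetical stand-in model.
model = torch.nn.Linear(1024, 1024).cuda()

# FSDP shards parameters, gradients, and optimizer state across ranks,
# all-gathering each parameter shard just in time for forward/backward
# and freeing it again afterwards.
model = FSDP(model)
```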
-
### What is the problem the feature request solves?
I noticed that we execute each query stage with two separate native plans.
For example, here is the first query stage for TPC-H q1:
```
+-…
-
### Describe the bug
The sharding of IterableDatasets with respect to distributed and dataloader worker processes appears problematic, with significant performance traps and inconsistencies w.r.t. d…
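For reference, the conventional pattern for sharding an `IterableDataset` across both ranks and dataloader workers looks roughly like this (a sketch of the general technique, not the library's actual implementation):
```python
import torch.distributed as dist
from torch.utils.data import IterableDataset, get_worker_info

class ShardedIterable(IterableDataset):
    """Hypothetical sketch: shard a stream across ranks and workers."""

    def __init__(self, items):
        self.items = items

    def __iter__(self):
        world = dist.get_world_size() if dist.is_initialized() else 1
        rank = dist.get_rank() if dist.is_initialized() else 0
        info = get_worker_info()
        workers = info.num_workers if info else 1
        worker = info.id if info else 0
        # Global shard index over (rank, worker) pairs, so every
        # example is consumed by exactly one process/worker pair.
        shard, nshards = rank * workers + worker, world * workers
        for i, item in enumerate(self.items):
            if i % nshards == shard:
                yield item
```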