-
### Is there an existing issue for the same bug?
- [X] I have checked the existing issues.
### Branch Name
2.0-dev
### Commit ID
96a0524287e30a4892c6f5365541a2d221ed4c37
### Other Environment In…
-
## User Interface
- [x] Sail CLI (#245)
- [x] Sail configuration (#279)
## Core Functionalities
- [x] Distributed processing setup (#244)
- [x] Distributed job stages and shuffle (#265)
- [x] …
-
Input MSAs were truncated to a single entry (duplicates of the input sequences), because leaving `msa:` blank causes errors for some reason.
```
>101
MGDWSALGRLLDKVQAYSTAGGKVWLSVLFIFRILLLGTAVESA…
```
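For reference, a minimal sketch of that workaround (the file names and the plain single-sequence MSA format are assumptions), producing a one-entry MSA by duplicating the query sequence:
```
# Hypothetical workaround sketch: build a single-entry MSA that simply
# duplicates the query sequence, since leaving `msa:` blank errors out.
def write_single_entry_msa(input_fasta: str, output_msa: str) -> None:
    with open(input_fasta) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    header, sequence = lines[0], "".join(lines[1:])
    with open(output_msa, "w") as handle:
        handle.write(f"{header}\n{sequence}\n")

# Example (paths are illustrative):
# write_single_entry_msa("101.fasta", "101_msa.fasta")
```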
-
Consecutive Blockwise layers are currently fused into a single layer. This reduces the number of tasks and the associated overhead, and is generally a good thing to do. Currently, the fused output does not gener…
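For context, a minimal sketch (assuming the Dask notion of Blockwise layers; the computation itself is illustrative) showing how fusing consecutive Blockwise layers shrinks the task graph:
```
import dask
import dask.array as da

# Two chained elementwise operations each produce a Blockwise layer.
x = da.ones((1000, 1000), chunks=(100, 100))
y = (x + 1) * 2

# Graph optimization fuses the consecutive Blockwise layers, so the
# optimized graph contains fewer tasks than the raw graph.
(y_opt,) = dask.optimize(y)
print(len(y.dask), len(y_opt.dask))
```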
-
Because, according to the code:
```
if distributed:
    sampler = DistributedSampler(dataset)  # shuffle does not seem to be set here; it looks like shuffle defaults to False
else:
    sampler = RandomSampler(dataset)
```
Also, thank you for your excellent work!
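A minimal sketch (the surrounding variables are illustrative, not from the repository) of passing `shuffle` explicitly so the distributed branch does not depend on the sampler's default:
```
import torch
import torch.distributed as dist
from torch.utils.data import DistributedSampler, RandomSampler, TensorDataset

dataset = TensorDataset(torch.arange(100))
distributed = dist.is_available() and dist.is_initialized()

# Make the shuffling behaviour explicit instead of relying on the default.
if distributed:
    sampler = DistributedSampler(dataset, shuffle=True)
else:
    sampler = RandomSampler(dataset)
```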
-
Hello,
It seems that the dataloader is not adapted to the distributed setting (line 881 of train.py).
The same data entries will be repeatedly loaded and trained on by different processes.
Maybe a sampler sho…
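For illustration, a minimal sketch (assuming a PyTorch setup where the process group is already initialized, e.g. via torchrun; names are illustrative) of sharding the data per rank with `DistributedSampler` and reshuffling it each epoch:
```
import torch
import torch.distributed as dist
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

# Assumes torch.distributed is already initialized (e.g. via torchrun).
dataset = TensorDataset(torch.arange(1000))
sampler = DistributedSampler(
    dataset,
    num_replicas=dist.get_world_size(),
    rank=dist.get_rank(),
    shuffle=True,
)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)

for epoch in range(3):
    # Reseed so each epoch uses a different shuffle order while every
    # rank still sees a disjoint shard of the dataset.
    sampler.set_epoch(epoch)
    for (batch,) in loader:
        pass
```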
-
This is the same issue as https://github.com/rapidsai/dask-cuda/issues/1408. Cross-posting here as it is more related to cuDF than to `dask-cuda`.
The following snippet works with `DASK_DATAFRAME_…
-
**Describe the bug**
When training a model that consumes more memory, I noticed that my training would stop after a constant number of epochs. Upon further investigation, I found that during training / v…
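For debugging this kind of behaviour, a minimal sketch (assuming a PyTorch-style training loop; the helper is hypothetical) of logging peak GPU memory per epoch to see whether usage keeps growing until training stops:
```
import torch

def log_epoch_memory(epoch: int) -> None:
    # Report the peak GPU memory seen during this epoch, then reset the
    # counter so growth across epochs becomes visible.
    if torch.cuda.is_available():
        peak_mib = torch.cuda.max_memory_allocated() / 2**20
        print(f"epoch {epoch}: peak GPU memory {peak_mib:.1f} MiB")
        torch.cuda.reset_peak_memory_stats()
```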
-
**The error message is as follows:**
```
Traceback (most recent call last):
  File "/data1/bert4rec/bert4rec-main/scripts/bole/loaddata_run_product.py", line 5, in <module>
    config, model, dataset, train_data, valid_data, test_…
```
-
**Describe the bug**
If the training data lives on node-specific storage rather than on NFS, the current logic in https://github.com/NVIDIA/Megatron-LM/blob/0bc3547702464501feefeb5523b7a17e591b21fa/m…