-
### Description
Make `ray.data.from_arrow` efficiently support iterators of Arrow tables. Currently, `from_arrow` loads all of the data into memory.
### Use case
If your data can't fit in memor…
-
[AWS Glue](https://aws.amazon.com/glue/features/) seems really useful, especially its fuzzy FindMatches feature ([although LLM-based cosine similarity embeddings should provide similar features](http…
-
Hello, while reproducing the longhua2 data, I ran into an error when running bash/3_2_cross_view_process.sh.
In the get_rgb_index_mask_depth_dji_instance_crossview function in gp_nerf/datasets/dataset_utils.py, keep_mask = metadata.load_mask() returns None, and the subsequent code uses torch.…
-
**Describe the bug**
The tokenizer map in `hf_decoder_model` raises `TypeError: cannot pickle 'torch._C._distributed_c10d.ProcessGroup' object` when run with multiple `preprocessing_num_workers`.
**To Reprodu…
-
Hi! I'm Quentin from Hugging Face :)
Congrats on this project. It has the potential to help the community so much, especially with large-scale and multimodal datasets.
I was wondering if you…
-
http://spark.apachecn.org/paper/zh/spark-rdd.html
Spark Chinese Documentation - Chinese Edition of the Official Spark Documentation - ApacheCN
-
### System Info
```shell
optimum-habana 1.14.0.dev0
HL-SMI Version: hl-1.18.0-fw-53.1.1.1
Driver Version: 1.18.0-ee698fb
```
### Information
- [X] The off…
-
In a distributed setting, `len(dataloader)` will return:
- `len(dataset) // (batch_size * num_GPUs)` if `dataset` is a map-style dataset
- `len(dataset) // batch_size` if `dataset` is a datapipe
…
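
The two formulas quoted above can be sketched as a small helper. This is only an illustration of the arithmetic: `expected_num_batches` and its parameters are hypothetical names, not part of any library, and it assumes the floor division shown in the issue text.

```python
def expected_num_batches(dataset_len: int, batch_size: int,
                         num_gpus: int, map_style: bool) -> int:
    """Value of len(dataloader) according to the formulas quoted in the issue.

    Hypothetical helper for illustration only; it mirrors the quoted
    arithmetic and does not call any real dataloader.
    """
    if map_style:
        # Map-style dataset: the length is divided across all GPUs.
        return dataset_len // (batch_size * num_gpus)
    # Datapipe: the quoted formula does not divide by the number of GPUs.
    return dataset_len // batch_size


# 1000 samples, batch size 10, 4 GPUs
map_style_len = expected_num_batches(1000, 10, 4, map_style=True)
datapipe_len = expected_num_batches(1000, 10, 4, map_style=False)
```

With these inputs the two cases differ by a factor of `num_gpus`, which is the discrepancy the issue describes.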
-
**Star RTDETR**
Please click **star** on the RTDETR homepage first to support this project.
Star RTDETR to help more people discover this project.
---
**Describe the bug**
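When training on the drone_detection dataset with multiple GPUs, GPU utilization gets stuck at 100% partway through the first epoch and the run then fails with a timeout error, while single-GPU training works normally; using co…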
-
**Describe the bug**
Followed this [doc](https://github.com/Vision-CAIR/MiniGPT-4/blob/main/dataset/README_2_STAGE.md) to prepare the fine-tuning data.
[cc_sbu_align.zip](https://github.com/Vision-CAIR/Mini…