huggingface/chug
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
Apache License 2.0
138 stars, 9 forks
issues
#7 [Bug]: requires pyarrow >=14.0.0 (NMontanaBrown, opened 3 months ago, 2 comments)
#6 Update README.md (rwightman, closed 3 months ago, 0 comments)
#5 Improving seed handling for sample shuffle so can be deterministic. (rwightman, closed 3 months ago, 0 comments)
#4 Create LICENSE (rwightman, closed 3 months ago, 0 comments)
#3 Bam! (rwightman, closed 3 months ago, 0 comments)
#2 [feature] Add dataloader support for non-webdataset dataset (molbap, closed 3 months ago, 1 comment)
#1 multi-page images are returned as list instead of torch.Tensor (molbap, closed 3 months ago, 1 comment)