issues
Lightning-AI/litdata
Streamline data pipelines for AI. Process datasets across 1000s of machines, and optimize data for blazing fast model training.
Apache License 2.0
249 stars
23 forks
Index Error when calling StreamingDataLoader.state_dict() when using custom collate_fn with multiple workers
#196
esivonxay-cognitiv
opened
5 hours ago
1
Add support for Mosaic Streaming WDS data format
#195
tchaton
opened
22 hours ago
0
Release LitData 0.2.14
#194
tchaton
closed
1 day ago
0
When providing a local path to the optimize method, make it work in a distributed settings for Jobs
#193
tchaton
opened
1 day ago
0
Fix: unexpected behaviours (bugs) in train_test_split fixed
#192
deependujha
closed
1 day ago
3
Add support for parquet files for storing the chunks
#191
tchaton
opened
1 day ago
0
Add utility to merge datasets together
#190
tchaton
closed
1 day ago
0
Bump version 0.2.13
#189
tchaton
closed
1 day ago
0
Fix: Resolve the default weights of the combined dataset
#188
tchaton
closed
1 day ago
0
Fix: error while splitting dataset with `splits=[0.1, 0.2, 0.7]` and support split of 0.0
#187
deependujha
closed
2 days ago
1
train_test_split fails when asked for `splits=[0.1, 0.2, 0.7]`
#186
deependujha
closed
2 days ago
1
Dataset weightage bug
#185
yhl48
closed
1 day ago
2
Feat: Append data to pre-optimize dataset
#184
deependujha
closed
1 day ago
1
LitData doesn't support s3 bucket connection outside server
#183
sanyalsunny111
opened
3 days ago
10
train_test_split doesn't support split of 0.0
#182
robmarkcole
closed
2 days ago
4
Using fsspec to download files
#181
samsja
opened
6 days ago
3
Feat: Append data to pre-optimize dataset
#180
deependujha
closed
3 days ago
2
Batch size beginning to vary half way through epoch
#179
MarcoForte
opened
1 week ago
6
Magic FileSerializer is causing issues
#178
tchaton
opened
1 week ago
1
Bump pypa/gh-action-pypi-publish from 1.8.14 to 1.9.0
#177
dependabot[bot]
closed
2 days ago
0
Release version 0.2.12
#176
tchaton
closed
1 week ago
0
AttributeError: `np.sctypes` was removed in the NumPy 2.0 release.
#175
bhimrazy
opened
1 week ago
8
Update README with config for MinIO
#174
bhimrazy
closed
1 week ago
1
Resolve num_workers when the user provides 0
#173
tchaton
closed
2 weeks ago
0
Warning Message When Using StreamingDataset with DDP
#172
taemincho
opened
2 weeks ago
2
Remove crt
#171
tchaton
closed
2 weeks ago
0
Add Example to support with MinIO Configuration
#170
bhimrazy
closed
1 week ago
1
fix typo : aws cli url in readme
#169
bhimrazy
closed
2 weeks ago
1
Typo: AWS CLI url points to s3 url
#168
bhimrazy
closed
2 weeks ago
1
Bump version 0.2.10
#167
tchaton
closed
2 weeks ago
0
Fix litdata on colab
#166
tchaton
closed
2 weeks ago
0
Bump version 0.2.9
#165
tchaton
closed
2 weeks ago
0
(fix) CombinedDataset with more than 2 streaming datasets
#164
tchaton
closed
2 weeks ago
0
Add support for custom collate with the StreamingDataLoader
#163
tchaton
closed
2 weeks ago
0
Remove DataLoader example in README
#162
tchaton
closed
2 weeks ago
0
Add feature to slice, subsample and split dataset
#161
deependujha
closed
1 week ago
2
Add first draft for multi modal model training text & image
#160
rakro101
closed
2 weeks ago
3
LitData leaves a `status.json` in current working dir
#159
awaelchli
opened
3 weeks ago
2
ValueError: The provided None isn't supported.
#158
awaelchli
opened
3 weeks ago
1
Error: All weights must be positive
#157
awaelchli
opened
3 weeks ago
2
add support for slicing, subsampling and splitting StreamingDataset
#156
deependujha
closed
2 weeks ago
5
Flexibility to set device number < total device count
#155
yhl48
closed
3 weeks ago
3
compatibility with downloading data from gcp
#154
dangthatsright
closed
3 weeks ago
2
LitData release version bump 0.2.8
#153
tchaton
closed
3 weeks ago
0
Set number of devices
#152
yhl48
closed
3 weeks ago
3
Change cache place litdata
#151
tchaton
closed
3 weeks ago
0
Bump lightning-cloud from 0.5.68 to 0.5.69
#150
dependabot[bot]
closed
2 weeks ago
0
Bump coverage from 7.5.0 to 7.5.3
#149
dependabot[bot]
closed
2 weeks ago
1
Bump pytest from 8.2.0 to 8.2.1
#148
dependabot[bot]
closed
3 weeks ago
0
Fix: Resolve drop_last not passed down from the StreamingDataLoader to the datasets
#147
tchaton
closed
3 weeks ago
0