Lightning-AI / litdata

Streamline data pipelines for AI. Process datasets across 1000s of machines, and optimize data for blazing fast model training.
Apache License 2.0

Fix: error while splitting dataset with `splits=[0.1, 0.2, 0.7]` and support split of 0.0 #187

Closed. deependujha closed this 4 days ago

deependujha commented 4 days ago
Before submitting

- [ ] Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
- [ ] Did you read the [contributor guideline](https://github.com/Lightning-AI/lit-data/blob/main/.github/CONTRIBUTING.md), Pull Request section?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

What does this PR do?

Under active development. 🚧

Fixes #182 & Fixes #186

Fixes an error raised when splitting a dataset with `splits=[0.1, 0.2, 0.7]`, along with other small fixes. Feature: adds support for a split of 0.0.
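To make the intended behaviour concrete, here is a minimal sketch of how fractional splits, including a split of 0.0, could be validated and turned into per-split item counts. It is not litdata's implementation: `split_counts` and its parameters are hypothetical names used only for illustration, and the tolerance-based sum check is an assumption about one reasonable way to avoid rejecting valid float inputs.

```python
import math
from typing import List


def split_counts(n_items: int, splits: List[float]) -> List[int]:
    """Hypothetical helper (not part of litdata): items assigned to each split."""
    # A split of 0.0 is allowed and simply yields an empty split.
    if any(s < 0.0 or s > 1.0 for s in splits):
        raise ValueError("each split must be between 0.0 and 1.0 (inclusive)")

    # Compare the total with a tolerance so that float rounding in the sum
    # cannot reject fractions that are meant to add up to exactly 1.0.
    total = math.fsum(splits)
    if total > 1.0 and not math.isclose(total, 1.0):
        raise ValueError("splits must sum to at most 1.0")

    # Flooring means a few leftover items may remain unassigned.
    return [int(n_items * s) for s in splits]
```

For example, `split_counts(100, [0.0, 1.0])` assigns nothing to the first split and all 100 items to the second, while `split_counts(100, [0.6, 0.6])` raises a `ValueError`.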

PR review

Anyone in the community is free to review the PR once the tests have passed. If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

codecov[bot] commented 4 days ago

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@ce79db9). Learn more about missing BASE report.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main    #187   +/-   ##
=====================================
  Coverage        ?     78%
=====================================
  Files           ?      33
  Lines           ?    4327
  Branches        ?       0
=====================================
  Hits            ?    3366
  Misses          ?     961
  Partials        ?       0
```