Lightning-Universe / lightning-flash

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains
https://lightning-flash.readthedocs.io
Apache License 2.0
1.74k stars, 212 forks

build(deps): update datasets requirement from <=2.12.0,>=2.0.0 to >=2.0.0,<=2.13.0 in /requirements #1609

Closed: dependabot[bot] closed this 1 year ago

dependabot[bot] commented 1 year ago

Updates the requirements on datasets to permit the latest version.
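To illustrate what the widened range in the PR title changes, here is a minimal sketch that checks a version string against the old and new bounds. The helpers `parse` and `in_range` are hypothetical names introduced for illustration only. They handle plain numeric release segments and are not a substitute for the PEP 440 matching that pip actually performs.

```python
def parse(version):
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre/post-release handling)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version <= upper, comparing numeric release segments."""
    return parse(lower) <= parse(version) <= parse(upper)


# Bounds taken from the PR title: >=2.0.0,<=2.12.0 before, >=2.0.0,<=2.13.0 after.
OLD_BOUNDS = ("2.0.0", "2.12.0")
NEW_BOUNDS = ("2.0.0", "2.13.0")

for candidate in ("2.12.0", "2.13.0"):
    print(candidate, in_range(candidate, *OLD_BOUNDS), in_range(candidate, *NEW_BOUNDS))
```

Under these assumptions, 2.13.0 falls outside the old range and inside the new one, which is the only effect of this change.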

Release notes

Sourced from datasets's releases.

2.13.0

Dataset Features

  • Add IterableDataset.from_spark by @​maddiedawson in huggingface/datasets#5770

    • Stream the data from your Spark DataFrame directly to your training pipeline
    from datasets import IterableDataset
    from torch.utils.data import DataLoader

    ids = IterableDataset.from_spark(df)
    ids = ids.map(...).filter(...).with_format("torch")
    for batch in DataLoader(ids, batch_size=16, num_workers=4):
        ...

  • IterableDataset formatting for PyTorch, TensorFlow, Jax, NumPy and Arrow:

    from datasets import load_dataset

    ids = load_dataset("c4", "en", split="train", streaming=True)
    ids = ids.map(...).with_format("torch")  # to get PyTorch tensors - also works with tf, np, jax etc.

  • Add IterableDataset.from_file to load local dataset as iterable by @​mariusz-jachimowicz-83 in huggingface/datasets#5893

    from datasets import IterableDataset

    ids = IterableDataset.from_file("path/to/data.arrow")

  • Arrow dataset builder to be able to load and stream Arrow datasets by @​mariusz-jachimowicz-83 in huggingface/datasets#5944

    from datasets import load_dataset

    ds = load_dataset("arrow", data_files={"train": "train.arrow", "test": "test.arrow"})

Experimental

General improvements and bug fixes

... (truncated)

Commits


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

codecov[bot] commented 1 year ago

Codecov Report

Merging #1609 (4d45ad7) into master (61ba676) will increase coverage by 22%. The diff coverage is n/a.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           master   #1609     +/-   ##
========================================
+ Coverage      62%     84%    +22%
========================================
  Files         291     291
  Lines       12876   12876
========================================
+ Hits         7972   10792   +2820
+ Misses       4904    2084   -2820
```
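The percentages in the report follow from the hit and line counts. The sketch below recomputes them, assuming coverage is simply hits divided by total lines and rounded to a whole percent. `coverage_percent` is a hypothetical helper for illustration, not part of Codecov.

```python
def coverage_percent(hits, lines):
    """Coverage as a percentage of tracked lines that were hit."""
    return 100.0 * hits / lines


# Figures copied from the Codecov report above.
LINES = 12876
HITS_MASTER, MISSES_MASTER = 7972, 4904
HITS_PR, MISSES_PR = 10792, 2084

before = coverage_percent(HITS_MASTER, LINES)
after = coverage_percent(HITS_PR, LINES)

print(f"master: {before:.2f}%  #1609: {after:.2f}%  delta: {after - before:+.2f} points")
print(f"rounded: {round(before)}% -> {round(after)}% ({round(after - before):+d}%)")
```

Since the PR only edits a requirements file and the line count is unchanged, the jump in hits probably reflects a difference between the two coverage uploads rather than any new tests.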