NVIDIA / aistore

AIStore: scalable storage for AI applications
https://aistore.nvidia.com
MIT License

docs: update etl init usage instructions #164

Closed: soumyendra98 closed this 6 months ago

soumyendra98 commented 6 months ago

Fix for ais etl init: add the missing --comm-type argument to the documented examples.

Fixes in https://github.com/NVIDIA/aistore/blob/main/docs/etl.md

Inline ETL example, Step 5

ais etl init spec --from-file md5_spec.yaml --name etl-md5 → ais etl init spec --from-file md5_spec.yaml --name etl-md5 --comm-type hpull
ETL[etl-md5]: job "etl-tKiJJphmR"

Offline ETL example

cat code.py → cat > code.py

cat deps.txt → cat > deps.txt

ais etl init code --name etl-torchvision --from-file code.py --deps-file deps.txt --runtime python3.11v2 → ais etl init code --name etl-torchvision --from-file code.py --deps-file deps.txt --runtime python3.11v2 --comm-type hpull
ETL[etl-torchvision]: job "etl-SnLbvA7gz"

ETL name specifications

ais etl init code --name=etl-md5 --from-file=code.py --runtime=python3 --deps-file=deps.txt → ais etl init code --name=etl-md5 --from-file=code.py --runtime=python3 --deps-file=deps.txt --comm-type hpull

ais etl init spec --name=etl-md5 --from-file=spec.yaml → ais etl init spec --name=etl-md5 --from-file=spec.yaml --comm-type hpull

Fixes in https://github.com/NVIDIA/aistore/blob/main/docs/cli/etl.md

Init ETL with Code Example

ais etl init code --from-file=code.py --runtime=python3.11v2 --name=transformer-md5 → ais etl init code --from-file=code.py --runtime=python3.11v2 --name=transformer-md5 --comm-type hpull

ais etl init code --name=etl-md5 --from-file=code.py --runtime=python3.11v2 --chunk-size=32768 --before=before --after=after → ais etl init code --name=etl-md5 --from-file=code.py --runtime=python3.11v2 --chunk-size=32768 --before=before --after=after --comm-type hpull

Fixes in https://github.com/NVIDIA/aistore/blob/main/docs/tutorials/etl/compute_md5.md

Simplified flow

ais etl init code --from-file=code.py --runtime=python3.11v2 --name=transformer-md5 → ais etl init code --from-file=code.py --runtime=python3.11v2 --name=transformer-md5 --comm-type hpull

Fixes in https://github.com/NVIDIA/aistore/blob/main/docs/tutorials/etl/etl_imagenet_pytorch.md

Transform Dataset

ais etl init code --from-file=code.py --deps-file=deps.txt --runtime=python3.11v2 --name="pytorch-transformer" → ais etl init code --from-file=code.py --deps-file=deps.txt --runtime=python3.11v2 --name="pytorch-transformer" --comm-type hpull
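The same check can be applied to other doc pages. The following is a minimal sketch and not part of this PR: it flags ais etl init commands that lack --comm-type and appends the flag the way the edits above do. The function names missing_comm_type and add_comm_type are hypothetical, the hpull default simply mirrors the value used in this PR, and the sketch assumes each command sits on a single line.

```python
import re
import shlex

# Matches a line that invokes `ais etl init`, with an optional "$ " prompt.
ETL_INIT = re.compile(r"^\s*(?:\$\s*)?ais\s+etl\s+init\b")


def _tokens(line):
    """Split a command line into arguments; fall back to whitespace splitting
    if the line has unbalanced quotes."""
    try:
        return shlex.split(line)
    except ValueError:
        return line.split()


def missing_comm_type(lines):
    """Return (line_number, text) for `ais etl init` lines without --comm-type."""
    hits = []
    for n, line in enumerate(lines, start=1):
        if not ETL_INIT.match(line):
            continue
        has_flag = any(
            arg == "--comm-type" or arg.startswith("--comm-type=")
            for arg in _tokens(line)
        )
        if not has_flag:
            hits.append((n, line.strip()))
    return hits


def add_comm_type(command, comm_type="hpull"):
    """Append --comm-type to a command that lacks it (assumed default: hpull).
    Commands that already carry the flag, or are not `ais etl init`, are
    returned unchanged."""
    if missing_comm_type([command]):
        return f"{command.rstrip()} --comm-type {comm_type}"
    return command


if __name__ == "__main__":
    sample = [
        "ais etl init spec --from-file md5_spec.yaml --name etl-md5",
        "ais etl init spec --name=etl-md5 --from-file=spec.yaml --comm-type hpull",
    ]
    for n, text in missing_comm_type(sample):
        print(f"line {n}: {text}")
```

Running missing_comm_type over the lines of a doc file would list the commands that still need the flag; whether hpull is the right value for a given example is a judgment this sketch does not make.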

gaikwadabhishek commented 6 months ago

merged this into main