facebookresearch / habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.
https://aihabitat.org/
MIT License

Dataset Downloader #1797

Open Lr-2002 opened 7 months ago

Lr-2002 commented 7 months ago

Habitat-Lab and Habitat-Sim versions

Habitat-Lab: v0.3.0

Habitat-Sim: origin/main

Habitat is under active development, and we advise users to restrict themselves to stable releases of Habitat-Lab and Habitat-Sim. The bug you are about to report may already be fixed in the latest version.

Master branch contains 'bleeding edge' code, but we do appreciate bug reports for it!

🐛 Bug

Location: mainland China, using the AutoDL accelerator proxy.

Running "python examples/example.py" produces the following output:

(habitat) root@autodl-container-3fb9439642-317b707a:~/habitat-lab# python examples/example.py 
2024-02-08 11:56:34,752 Initializing dataset RearrangeDataset-v0
2024-02-08 11:56:34,752 Rearrange task assets are not downloaded locally, downloading and extracting now...
No data-path provided, defaults to: ./data. Use '--data-path' to specify another location.
Found the existing repo for (replica_cad_dataset): /root/habitat-lab/data/versioned_data/replica_cad_dataset
 checking out v1.6 and pulling changes from repo.
=======================================================
Not replacing data, generating symlink (/root/habitat-lab/data/replica_cad).
Generating symlink (/root/habitat-lab/data/replica_cad).
=======================================================
git clone --depth 1 --branch v2.0 https://huggingface.co/datasets/ai-habitat/hab_fetch.git /root/habitat-lab/data/versioned_data/hab_fetch
Cloning into '/root/habitat-lab/data/versioned_data/hab_fetch'...
remote: Enumerating objects: 78, done.
remote: Counting objects: 100% (78/78), done.
remote: Compressing objects: 100% (70/70), done.
remote: Total 78 (delta 8), reused 77 (delta 8), pack-reused 0
Unpacking objects: 100% (78/78), 1.00 MiB | 1019.00 KiB/s, done.
Note: switching to 'd3f7c41602e6eea1a46f8f9dfc4d8fb2778c6773'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

Filtering content: 100% (27/27), 30.46 MiB | 935.00 KiB/s, done.
=======================================================
Dataset (hab_fetch) successfully downloaded.
Source: '/root/habitat-lab/data/versioned_data/hab_fetch'
Symlink: '/root/habitat-lab/data/robots/hab_fetch'
=======================================================
git clone --depth 1 --branch v1.2 https://huggingface.co/datasets/ai-habitat/ycb.git /root/habitat-lab/data/versioned_data/ycb
Cloning into '/root/habitat-lab/data/versioned_data/ycb'...
remote: Enumerating objects: 480, done.
remote: Counting objects: 100% (480/480), done.
remote: Compressing objects: 100% (326/326), done.
remote: Total 480 (delta 75), reused 479 (delta 75), pack-reused 0
Receiving objects: 100% (480/480), 50.08 KiB | 8.35 MiB/s, done.
Resolving deltas: 100% (75/75), done.
Note: switching to '5cff4a9f080ee49e43b881147159bc4dc1483eb7'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

Downloading meshes/065-j_cups/google_16k/textured.glb (627 KB)
Error downloading object: meshes/065-j_cups/google_16k/textured.glb (cd034ea): Smudge error: Error downloading meshes/065-j_cups/google_16k/textured.glb (cd034eaf30e18b86b541195c36eb5fe8b8285a16c769b300c2cfeb77b590abd6): batch response: Post "https://huggingface.co/datasets/ai-habitat/ycb.git/info/lfs/objects/batch": read tcp 172.17.0.2:49306->172.20.0.113:12798: read: connection reset by peer

Errors logged to '/root/habitat-lab/data/versioned_data/ycb/.git/lfs/logs/20240208T115925.887835557.log'.
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: meshes/065-j_cups/google_16k/textured.glb: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

Traceback (most recent call last):
  File "/root/habitat-lab/examples/example.py", line 31, in <module>
    example()
  File "/root/habitat-lab/examples/example.py", line 15, in example
    with gym.make("HabitatRenderPick-v0") as env:
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/gym/envs/registration.py", line 676, in make
    return registry.make(id, **kwargs)
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/gym/envs/registration.py", line 520, in make
    return spec.make(**kwargs)
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/gym/envs/registration.py", line 140, in make
    env = cls(**_kwargs)
  File "/root/habitat-lab/habitat-lab/habitat/gym/gym_definitions.py", line 91, in _make_habitat_gym_env
    env = make_gym_from_config(config)
  File "/root/habitat-lab/habitat-lab/habitat/gym/gym_definitions.py", line 60, in make_gym_from_config
    return make_env_fn(env_class=env_class, config=config, dataset=dataset)
  File "/root/habitat-lab/habitat-lab/habitat/utils/env_utils.py", line 36, in make_env_fn
    dataset = make_dataset(config.dataset.type, config=config.dataset)
  File "/root/habitat-lab/habitat-lab/habitat/datasets/registration.py", line 22, in make_dataset
    return _dataset(**kwargs)  # type: ignore
  File "/root/habitat-lab/habitat-lab/habitat/datasets/rearrange/rearrange_dataset.py", line 62, in __init__
    data_downloader.main(
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/habitat_sim-0.3.0-py3.9-linux-x86_64.egg/habitat_sim/utils/datasets_download.py", line 927, in main
    download_and_place(
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/habitat_sim-0.3.0-py3.9-linux-x86_64.egg/habitat_sim/utils/datasets_download.py", line 762, in download_and_place
    clone_repo_source(uid, version_dir, requires_auth, prune_lfs)
  File "/root/miniconda3/envs/habitat/lib/python3.9/site-packages/habitat_sim-0.3.0-py3.9-linux-x86_64.egg/habitat_sim/utils/datasets_download.py", line 585, in clone_repo_source
    subprocess.check_call(split_command)
  File "/root/miniconda3/envs/habitat/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'clone', '--depth', '1', '--branch', 'v1.2', 'https://huggingface.co/datasets/ai-habitat/ycb.git', '/root/habitat-lab/data/versioned_data/ycb']' returned non-zero exit status 128.

I'm not sure of the cause, but here is some additional information:

  1. It is not very convenient to use git to check out the tag.

Steps to Reproduce

Steps to reproduce the behavior:

After downloading everything and installing the version with Bullet, run:

  1. python examples/example.py

Please note that without a minimal working example to reproduce the bug, we may not be able to help you.

the original code

Expected behavior

Could you fix the download bug? Also, is it possible to download the files some other way?

  1. Use another mirror to download (for example, a Hugging Face mirror in China)?
  2. Or transfer the files with scp? That might make it possible to use Habitat on a cluster that has no access to websites outside China.
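
A possible workaround for an unreliable connection, sketched in Python under stated assumptions. The log above shows that the clone of the ycb repository itself succeeded and that only the Git LFS smudge step failed with "connection reset by peer". Setting the GIT_LFS_SKIP_SMUDGE=1 environment variable makes the checkout write small pointer files instead of downloading the large files, and "git lfs pull" can then be retried on its own. The function names, retry count, and delay below are hypothetical and not part of Habitat. Whether the Habitat downloader then reuses a manually cloned directory is also an assumption, suggested only by the "Found the existing repo" line in the log.

```python
import os
import subprocess
import time


def run_with_retries(cmd, cwd=None, env=None, attempts=5, delay_s=5.0,
                     runner=subprocess.run):
    """Run cmd until it exits with status 0; return the attempt number that succeeded.

    `runner` is injectable so the retry logic can be exercised without git.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd, cwd=cwd, env=env)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay_s * attempt)  # simple linear backoff between tries
    raise RuntimeError(f"{cmd!r} still failing after {attempts} attempts")


def clone_then_pull_lfs(url, tag, dest, **retry_kwargs):
    """Hypothetical helper: clone without LFS downloads, then retry the LFS fetch."""
    # With GIT_LFS_SKIP_SMUDGE=1 the checkout writes LFS pointer files only,
    # so a dropped connection cannot abort the checkout itself.
    clone_env = dict(os.environ, GIT_LFS_SKIP_SMUDGE="1")
    if not os.path.isdir(os.path.join(dest, ".git")):
        run_with_retries(
            ["git", "clone", "--depth", "1", "--branch", tag, url, dest],
            env=clone_env, **retry_kwargs,
        )
    # Fetch the large files separately; this is the step worth retrying.
    run_with_retries(["git", "lfs", "pull"], cwd=dest, **retry_kwargs)
```

With the URL, tag, and path taken from the log, a call might look like clone_then_pull_lfs("https://huggingface.co/datasets/ai-habitat/ycb.git", "v1.2", "data/versioned_data/ycb"). For the scp option, the same directory could in principle be populated on a machine with access and copied over, although that route is untested here.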

Additional context

Penguin963 commented 3 weeks ago

I have the same problem. Is there a solution yet?

Inoriros commented 1 week ago

The error probably indicates that the Git Large File Storage (LFS) extension is not installed on your system. Habitat-Lab and its associated datasets use Git LFS to manage large files, which is why it's crucial for your setup.

Following these steps resolved the issue on my system:

  1. Install Git LFS: Open a terminal and execute the following commands to install Git LFS:

    sudo apt update
    sudo apt install git-lfs
    git lfs install

  2. Retry the operation: After installing Git LFS, navigate back to your Habitat-Lab directory and try running the script again. Git LFS should now be able to fetch the necessary large files.

    cd ~/path/to/habitat-lab
    python examples/interactive_play.py --never-end

  3. Check for dataset integrity: If you had attempted to download datasets before installing Git LFS, it's possible that only placeholder files were downloaded. You might need to clear these and retry the dataset setup:

    # Remove potentially corrupted datasets
    rm -rf ~/path/to/habitat-lab/data/versioned_data/*

    # Run your script again to fetch the data correctly
    python examples/interactive_play.py --never-end
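
Before deleting everything under versioned_data, it may help to check whether Git LFS is actually on the PATH and which files are still placeholders. The sketch below relies on the fact that a Git LFS pointer file is a small text file beginning with "version https://git-lfs.github.com/spec/v1". The function names are hypothetical and not part of Habitat or Git LFS.

```python
import shutil
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def git_lfs_available():
    """Return True if a git-lfs executable is found on the PATH."""
    return shutil.which("git-lfs") is not None


def is_lfs_pointer(path):
    """Return True if the file at path looks like an LFS pointer, not real content."""
    try:
        with open(path, "rb") as f:
            return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
    except OSError:
        return False


def find_lfs_pointers(root):
    """List files under root (skipping .git directories) that are still pointers."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )
```

For example, find_lfs_pointers("data/versioned_data/ycb") returning an empty list while git_lfs_available() is True would suggest the files were fetched. If Git LFS is installed but pointers remain, the cause is more likely the network, as in the "connection reset by peer" message in the original report.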