Closed yairchn closed 9 months ago
Try running --help. The args are more flexible now: https://github.com/NVIDIA/earth2mip/blob/a17fd31ae15b83a052c57c88eb30a153d2995415/earth2mip/_cli_utils.py#L25
From: Yair Cohen @.>
Date: Wednesday, December 6, 2023 at 2:29 PM
To: NVIDIA/earth2mip @.>
Cc: Noah Brenowitz @.>, Mention @.>
Subject: [NVIDIA/earth2mip] [BUG]: unrecognized input to lagged ensembles (Issue #143)
Version
source - main
On which installation method(s) does this occur?
Pip
Describe the issue
following the instructions in lagged ensembles main:
torchrun --nproc_per_node 2 --nnodes 1 -m earth2mip.lagged_ensembles --model sfno_73ch --inits 10 --leads 5 --lags 4
produces the following error:
usage: Run a lagged ensemble scoring
Can be run against either a fcn model (--model), a forecast directory as
output by earth2mip.time_collection (--forecast_dir), persistence forecast
(--persistence), or deterministic IFS (--ifs).
Saves data as csv files (1 per rank).
Examples:
torchrun --nproc_per_node 2 --nnodes 1 -m earth2mip.lagged_ensembles --model sfno_73ch --inits 10 --leads 5 --lags 4
__main__.py: error: unrecognized arguments: --inits 10
usage: Run a lagged ensemble scoring
Can be run against either a fcn model (--model), a forecast directory as
output by earth2mip.time_collection (--forecast_dir), persistence forecast
(--persistence), or deterministic IFS (--ifs).
Saves data as csv files (1 per rank).
Examples:
torchrun --nproc_per_node 2 --nnodes 1 -m earth2mip.lagged_ensembles --model sfno_73ch --inits 10 --leads 5 --lags 4
__main__.py: error: unrecognized arguments: --inits 10
[2023-12-06 14:26:24,496] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 2) local_rank: 0 (pid: 922229) of binary: /usr/bin/python
Traceback (most recent call last):
  File "/usr/local/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 806, in main
    run(args)
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 797, in run
    elastic_launch(
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 134, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
It seems that inits is no longer an argument in parse_args. I think @nbren12 (https://github.com/nbren12) might have decided to make it a fixed number rather than a user-supplied input.
Environment details
running on Selene interactive session with gitlab-master.nvidia.com/earth-2/fcn-mip:latest
Closing since the --inits flag is replaced by --start-time and --end-time.
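For anyone hitting the same error, here is a minimal sketch of why argparse rejects the old invocation. It uses a hypothetical stand-in parser, not the real earth2mip one: only the flag names --model, --leads, --lags, --start-time and --end-time come from this thread. The prog name, the defaults, the ISO date format, and the example dates are all assumptions made for illustration, so check --help for the actual options.

```python
import argparse
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the lagged ensembles CLI, NOT the real
    # earth2mip parser. Defaults and value formats are assumptions.
    parser = argparse.ArgumentParser(prog="lagged_ensembles_sketch")
    parser.add_argument("--model", default=None)
    parser.add_argument("--leads", type=int, default=5)
    parser.add_argument("--lags", type=int, default=4)
    # Assumption: times are passed as ISO 8601 strings.
    parser.add_argument("--start-time", type=datetime.fromisoformat, default=None)
    parser.add_argument("--end-time", type=datetime.fromisoformat, default=None)
    return parser


def split_args(argv):
    """Parse argv and return (namespace, unrecognized) instead of exiting."""
    return build_parser().parse_known_args(argv)


# Old-style invocation from the issue: --inits is not defined any more,
# so argparse reports it (and its value) as unrecognized.
old_style = ["--model", "sfno_73ch", "--inits", "10", "--leads", "5", "--lags", "4"]
_, unknown = split_args(old_style)
print("unrecognized:", unknown)

# New-style invocation: a time window instead of a count of initial times.
# The dates are placeholders chosen for this example.
new_style = [
    "--model", "sfno_73ch",
    "--start-time", "2018-01-01",
    "--end-time", "2018-01-03",
    "--leads", "5",
    "--lags", "4",
]
args, unknown_new = split_args(new_style)
print("window:", args.start_time, "to", args.end_time)
```

With plain parse_args (which the error message above suggests the CLI uses), the first call would exit with code 2 and print "unrecognized arguments: --inits 10", which is the failure torchrun then reports as ChildFailedError.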
Version
source - main
On which installation method(s) does this occur?
Pip
Describe the issue
following the instructions in lagged ensembles main:
produces the following error:
It seems that inits is no longer an argument in parse_args. I think @nbren12 might have decided to make it a fixed number rather than a user-supplied input.
Environment details