Closed FFFiend closed 1 year ago
Can you give me your omegaconf and hydra versions, and if they are older than the ones below, update them to at least the versions I have?
Python 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import omegaconf
>>> omegaconf.__version__
'2.3.0'
>>> import hydra
>>> hydra.__version__
'1.2.0'
Yep I have those exact versions.
Hm, I'm unable to reproduce this error. I created a conda environment and cloned the repository from scratch and things worked fine. Specifically:
$ conda create --name gist-test python=3.10
...
$ conda activate gist-test
$ git clone https://github.com/jayelm/gisting
...
$ cd gisting
$ mkdir exp .cache
$ pip install -r requirements.txt
...
Successfully installed GitPython-3.1.31 MarkupSafe-2.1.3 absl-py-1.4.0 accelerate-0.18.0 aiohttp-3.8.4 aiosignal-1.3.1 antlr4-python3-runtime-4.9.3 async-timeout-4.0.2 attrs-23.1.0 certifi-2023.5.7 charset-normalizer-3.1.0 click-8.1.3 cmake-3.26.3 datasets-2.10.0 deepspeed-0.8.3 dill-0.3.6 docker-pycreds-0.4.0 evaluate-0.3.0 filelock-3.12.0 fire-0.5.0 frozenlist-1.3.3 fsspec-2023.5.0 gitdb-4.0.10 hjson-3.1.0 huggingface-hub-0.15.1 hydra-core-1.2.0 idna-3.4 jinja2-3.1.2 joblib-1.2.0 lit-16.0.5.post0 mpmath-1.3.0 multidict-6.0.4 multiprocess-0.70.14 networkx-3.1 ninja-1.11.1 nltk-3.6.2 numpy-1.21.2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 omegaconf-2.3.0 openai-0.27.2 packaging-23.1 pandas-2.0.2 pathtools-0.1.2 promise-2.3 protobuf-4.23.2 psutil-5.9.5 py-cpuinfo-9.0.0 pyarrow-12.0.0 pydantic-1.10.8 python-dateutil-2.8.2 pytz-2023.3 pyyaml-6.0 regex-2023.6.3 requests-2.31.0 responses-0.18.0 rouge_score-0.1.2 sentencepiece-0.1.98 sentry-sdk-1.25.0 setproctitle-1.3.2 shortuuid-1.0.11 six-1.16.0 smmap-5.0.0 sympy-1.12 termcolor-2.3.0 tokenizers-0.13.3 torch-2.0.0 tqdm-4.65.0 transformers-4.28.0.dev0 triton-2.0.0 typing-extensions-4.6.3 tzdata-2023.3 urllib3-2.0.2 wandb-0.13.4 xxhash-3.2.0 yarl-1.9.2
and then
python -m src.train training.gist.num_gist_tokens=2 training.gist.condition=gist wandb.tag=yourtaghere
works fine and starts training.
You might double check the versions listed above ^ and whether there are any mismatches. The error seems to be an omegaconf error so I'm still a bit suspicious there's a version mismatch somewhere.
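A quick way to run that check is sketched below. It is only an illustration: the three pins are copied from the pip output quoted above, and the helper names (`installed_version`, `find_mismatches`) are made up for this example rather than taken from the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the "Successfully installed ..." output quoted in this thread.
EXPECTED = {
    "omegaconf": "2.3.0",
    "hydra-core": "1.2.0",
    "accelerate": "0.18.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (wanted, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

Run it with the same interpreter that launches training, so it inspects the environment that actually fails.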
Yep I have those exact versions.
I'm hitting the same error as you. Have you solved it yet?
I have the same problem. The packages you mentioned are at exactly versions 2.3.0 and 1.2.0 in my environment, but I still can't run the code.
By this do you mean you created a new conda environment using Python 3.10 and the steps outlined in the quoted post (conda create --name gist-test python=3.10), installed the requirements from requirements.txt, and still ran into this error?
As a workaround you may be able to set
generation_config: Optional[str] = None
here: https://github.com/jayelm/gisting/blob/main/src/arguments.py#L137-L138
to change the type of generation_config and see if that satisfies the omegaconf typechecker.
Yes, exactly!
Sorry, I'm not sure how to change it. Could you please give me more specific instructions? Thanks a lot!
Replace the lines linked above with:
class GistSeq2SeqTrainingArguments(GistTrainingArguments, Seq2SeqTrainingArguments):
    generation_config: Optional[str] = None
and let me know if that works.
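For anyone unsure what that replacement does, here is a minimal standalone sketch of the mechanism. It assumes the config error comes from a union-typed annotation inherited from the parent class, which this thread does not actually show. `UpstreamArguments` and `PatchedArguments` are hypothetical stand-ins, not the real transformers or gisting classes.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Optional, Union, get_type_hints


@dataclass
class UpstreamArguments:
    # Hypothetical stand-in for the parent arguments class.
    output_dir: str = "exp"
    generation_config: Optional[Union[str, Path]] = None


@dataclass
class PatchedArguments(UpstreamArguments):
    # Redeclaring the field narrows its annotation for this subclass only;
    # the parent class is left untouched.
    generation_config: Optional[str] = None


if __name__ == "__main__":
    print(get_type_hints(UpstreamArguments)["generation_config"])
    print(get_type_hints(PatchedArguments)["generation_config"])
    # Field order is inherited from the parent, so positional use is unchanged.
    print([f.name for f in fields(PatchedArguments)])
```

The subclass keeps the same field name and default, so existing code that reads `generation_config` should behave the same; only the declared type that a schema-based checker sees is different.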
Do you mean adding "generation_config: Optional[str] = None" to the class you mentioned (above the pass line, or replacing it)?
OK, I replaced the line, but it doesn't work; another error occurred.
Can you be more specific?
The error is raised at line 96 of /site-packages/hydra/core/override_parser/overrides_parser.py.
Here's the info:
Exception has occurred: OverrideParseException (note: full exception trace is shown but execution is paused at: _run_module_as_main)
mismatched input '=' expecting
During handling of the above exception, another exception occurred:
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 82, in parse_overrides
parsed = self.parse_rule(override, "override")
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 66, in parse_rule
tree = rule()
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/grammar/gen/OverrideParser.py", line 279, in override
self._errHandler.reportError(self, re)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorStrategy.py", line 128, in reportError
self.reportInputMismatch(recognizer, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorStrategy.py", line 275, in reportInputMismatch
recognizer.notifyErrorListeners(msg, e.offendingToken, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/Parser.py", line 322, in notifyErrorListeners
listener.syntaxError(self, offendingToken, line, column, msg, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/antlr4/error/ErrorListener.py", line 60, in syntaxError
delegate.syntaxError(recognizer, offendingSymbol, line, column, msg, e)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_visitor.py", line 372, in syntaxError
raise HydraException(msg) from e
hydra.errors.HydraException: mismatched input '=' expecting
During handling of the above exception, another exception occurred:
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/core/override_parser/overrides_parser.py", line 96, in parse_overrides
raise OverrideParseException(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 233, in _load_configuration_impl
parsed_overrides = parser.parse_overrides(overrides=overrides)
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 141, in load_configuration
return self._load_configuration_impl(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 594, in compose_config
cfg = self.config_loader.load_configuration(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 105, in run
cfg = self.compose_config(
File "/data/wupf/anaconda3/envs/gist/lib/python3.9/site-packages/hydra/_internal/utils.py", line 453, in
Can you give the full command you ran to produce this error? This seems to be a syntax issue with how arguments were specified in the CLI.
Thanks for your reply. The syntax error I mentioned above happens when I use debug mode in VS Code. The configuration in launch.json is written as:
{
    "name": "gist_official",
    "type": "python",
    "python": "/data/wupf/anaconda3/envs/gist/bin/python",
    "request": "launch",
    "module": "src.train",
    "console": "integratedTerminal",
    "justMyCode": false,
    "args": ["training.gist.num_gist_tokens=2 training.gist.condition=gist"],
    "env": {"CUDA_VISIBLE_DEVICES": "3"},
    "cwd": "/data/wupf/gisting"
},
Does splitting up the two args into separate strings:
"args": ["training.gist.num_gist_tokens=2", "training.gist.condition=gist"],
help? Or even just removing the args entirely for now to see if it runs.
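To illustrate why the single string fails: each override has to arrive as its own argv element, which is how a shell splits the command line. The sketch below uses only the standard library. `split_overrides` and `find_unsplit_overrides` are hypothetical helper names, and the whitespace check is a rough heuristic (a deliberately quoted value containing spaces would be flagged too).

```python
import shlex


def split_overrides(command_tail):
    """Split a command-line tail the way a POSIX shell would."""
    return shlex.split(command_tail)


def find_unsplit_overrides(args):
    """Return the args that look like several overrides fused into one string."""
    return [arg for arg in args if len(arg.split()) > 1]


if __name__ == "__main__":
    # The "args" value from the launch.json in this thread.
    launch_args = ["training.gist.num_gist_tokens=2 training.gist.condition=gist"]
    print(find_unsplit_overrides(launch_args))  # the fused entry is reported
    print(split_overrides(launch_args[0]))      # one string per override
```

The second printed list is what the "args" array should contain, one string per override.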
Then I ran the command directly in the shell:
python -m src.train training.gist.num_gist_tokens=2 training.gist.condition=gist
and hit another error that confused me:
Error executing job with overrides: ['training.gist.num_gist_tokens=2', 'training.gist.condition=gist']
Traceback (most recent call last):
File "/data/wupf/gisting/src/train.py", line 61, in main
args: Arguments = global_setup(args)
File "/data/wupf/gisting/src/arguments.py", line 335, in global_setup
args = OmegaConf.to_object(args)
ImportError: Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
I'm very confused, because the newest version of accelerate appears to be 0.19.0: when I run pip install accelerate -U, the package installed is exactly 0.19.0.
I did this, and it reports the error: "Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U"
I still suspect there is some sort of version mismatch between the env you're using and the one specified in requirements.txt, as requirements.txt specifies accelerate version 0.18 and I'm able to run that env with no issues (and this is also likely the cause of the config error). Maybe double check that the transformers version you're installing is exactly the commit pinned in requirements.txt?
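One possible way to check which commit of transformers is installed is sketched below. It assumes the package was installed by pip from a git URL, in which case a recent pip records the resolved commit in a direct_url.json metadata file (PEP 610). `commit_from_direct_url` and `installed_commit` are hypothetical helper names, and the result is None when that file is absent, for example for a plain PyPI install.

```python
import json
from importlib.metadata import PackageNotFoundError, distribution


def commit_from_direct_url(text):
    """Extract the VCS commit id from a direct_url.json payload, if present."""
    if not text:
        return None
    info = json.loads(text)
    return info.get("vcs_info", {}).get("commit_id")


def installed_commit(name):
    """Return the recorded git commit for an installed package, or None."""
    try:
        dist = distribution(name)
    except PackageNotFoundError:
        return None
    # read_text returns None when the metadata file does not exist.
    return commit_from_direct_url(dist.read_text("direct_url.json"))


if __name__ == "__main__":
    # Compare this value by hand with the commit pinned in requirements.txt.
    print(installed_commit("transformers"))
```

If the printed commit differs from the pinned one, or is None, the environment was probably not built from requirements.txt as written.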
Luckily, I reinstalled everything from scratch, and that seems to have fixed everything! Thanks a lot for your support. The line "generation_config: Optional[str] = None" is definitely necessary!
Wonderful work! The intuition of "change the attention to allow the model to learn how to compress the prompt sentence" is simple and useless!
Sorry, I meant useful, not useless!
Glad to hear it's working, and thanks!!
I think this workaround (generation_config: Optional[str] = None) should fix OP's issue as well, so I'll close this issue.
The error:
How to reproduce: run the train command in README