errors while evaluating it on split-CIFAR10

hiteshvaidya commented 11 months ago

I am trying to evaluate your code on CIFAR-10. To do so, I created the required .yaml files in the /benchmark and /experiment folders, using the values from the corresponding CIFAR-100 files. I set n_experiences: 2 in the split_cifar10.yaml that I created in the /benchmark folder.

I am getting the following error:

Error executing job with overrides: ['strategy=er', 'experiment=split_cifar10', 'strategy.train_epochs=1']
Traceback (most recent call last):
  File "/data/hvaidya/ocl_survey/experiments/main.py", line 127, in <module>
    main()
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/data/hvaidya/ocl_survey/experiments/main.py", line 82, in main
    strategy = method_factory.create_strategy(
  File "/data/hvaidya/ocl_survey/src/factories/method_factory.py", line 319, in create_strategy
    cl_strategy = [strategy](**strategy_dict, plugins=plugins)
TypeError: 'list' object is not callable

After checking the values of model and optimizer, I found that they are initialized according to the config files. The error is thrown at line 295, cl_strategy = globals()[strategy](**strategy_dict, plugins=plugins), in method_factory.py in the src/factories folder.

Please let me know if you need the .yaml files that I created for split-cifar10.

Thank you!

AlbinSou commented 11 months ago

Given the error you shared, I think you may have accidentally deleted one word in the method factory, which would cause this error. Your code calls [strategy](), which cannot work because a list object is not callable. The original code calls globals()[strategy](), which looks up the strategy class by name in the module's global namespace; that class is callable.
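The difference between the two calls can be reproduced in isolation. The following is only a minimal sketch: DummyStrategy, its arguments, and the values in strategy_dict are hypothetical stand-ins, not the actual ocl_survey classes or config.

```python
# DummyStrategy is a hypothetical stand-in for a real strategy class.
class DummyStrategy:
    def __init__(self, plugins=None, **kwargs):
        self.plugins = plugins
        self.kwargs = kwargs


strategy = "DummyStrategy"           # strategy name, as it would come from the config
strategy_dict = {"train_epochs": 1}  # illustrative keyword arguments

# Broken variant: [strategy] is a one-element list, and a list cannot be called.
not_a_class = [strategy]
try:
    not_a_class(**strategy_dict, plugins=[])
except TypeError as err:
    broken_message = str(err)
    print("broken call failed:", broken_message)

# Working variant: globals() returns the module namespace as a dict, so
# indexing it with the name yields the class object, which is callable.
cl_strategy = globals()[strategy](**strategy_dict, plugins=[])
print("created:", type(cl_strategy).__name__)
```

The lookup only succeeds if the name stored in strategy matches a class that is defined in, or imported into, the module where globals() is called.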

hiteshvaidya commented 11 months ago

I do have globals() in the code, and I still get the following error:

Traceback (most recent call last):
  File "/data/hvaidya/ocl_survey/experiments/main.py", line 127, in <module>
    main()
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
    _run_hydra(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
    _run_app(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
    run_and_report(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
    raise ex
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/data/hvaidya/ocl_survey/experiments/main.py", line 82, in main
    strategy = method_factory.create_strategy(
  File "/data/hvaidya/ocl_survey/src/factories/method_factory.py", line 318, in create_strategy
    cl_strategy = globals()[strategy](**strategy_dict, plugins=plugins)
  File "/data/hvaidya/ocl_survey/src/strategies/icarl.py", line 191, in __init__
    super().__init__(
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/avalanche/training/templates/common_templates.py", line 122, in __init__
    super().__init__()  # type: ignore
  File "/home/h/hvaidya/.conda/envs/ocl_survey/lib/python3.10/site-packages/avalanche/training/templates/observation_type/batch_observation.py", line 16, in __init__
    super().__init__()
TypeError: BaseSGDTemplate.__init__() missing 2 required positional arguments: 'model' and 'optimizer'

AlbinSou commented 11 months ago

Can you confirm that the code fails only with the cifar10 config and still works with the cifar100 config? This is the same error as in the first issue you posted, and I think it is more of an environment error (the typing-extensions package was the cause last time). Have you installed any new packages since then that could have changed the typing-extensions version?

hiteshvaidya commented 11 months ago

I recloned the repo and tested on split-CIFAR100, and I still get the error from #5.

AlbinSou commented 11 months ago

In that case, I think something has changed in your environment. Did you check that the typing-extensions version is 4.4.0? I added that to the requirements after #5, so it should be fixed. I would be interested to know whether this issue can be caused by other factors.
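One way to check the installed version from inside the active environment is sketched below. It uses only the standard library; the helper name installed_version is made up for this example, and 4.4.0 is simply the version mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "4.4.0"  # version mentioned in this thread


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("typing-extensions")
if found is None:
    print("typing-extensions is not installed in this environment")
elif found != EXPECTED:
    print(f"typing-extensions {found} found, expected {EXPECTED}")
else:
    print(f"typing-extensions {EXPECTED} is installed")
```

The check is only meaningful when run with the same interpreter that launches the experiments, for example the Python of the activated conda environment.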

hiteshvaidya commented 11 months ago

Yes, I noticed the version update in the requirements file. However, it needs == instead of =.
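As an illustration of the point about = versus ==, the sketch below flags requirement lines that pin a version with a single =, which pip does not accept as a version specifier. The regular expression and the function name find_bad_pins are assumptions made for this example; this is a simplified check, not a full requirements parser.

```python
import re

# Matches "name=version" written with a single "=".
# Valid operators such as "==", ">=", "<=", "~=" and "!=" are not matched.
SINGLE_EQUALS = re.compile(r"^\s*[A-Za-z0-9._-]+\s*=(?!=)")


def find_bad_pins(lines):
    """Return the requirement lines that use a single '=' to pin a version."""
    return [line.strip() for line in lines if SINGLE_EQUALS.match(line)]


# Illustrative lines, not the contents of the actual requirements file.
example_lines = [
    "typing-extensions=4.4.0",   # invalid: single "="
    "typing-extensions==4.4.0",  # valid exact pin
    "numpy>=1.20",               # valid lower bound
]

for bad in find_bad_pins(example_lines):
    print("invalid pin:", bad)
```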

My error got resolved. I was trying to run the experiments on a compute cluster that uses Slurm, and there were issues with the shell script that led to the errors. Thank you very much for your prompt responses!

AlbinSou / ocl_survey

errors while evaluating it on split-CIFAR10 #7