ctlab / GADMA

Genetic Algorithm for Demographic Model Analysis

Error when using BO #83

Closed btmartin721 closed 1 year ago

btmartin721 commented 1 year ago

Hello Ekaterina,

I am getting the following error when I try to run GADMA using the moments engine with 5 populations using a custom model.

  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/core/core.py", line 51, in main
    settings_storage, args = arg_parser.get_settings()
  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/cli/arg_parser.py", line 127, in get_settings
    settings_storage = SettingsStorage.from_file(args.params, args.extra)
  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/cli/settings_storage.py", line 1043, in from_file
    return obj.update_from_file(param_file, extra_param_file)
  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/cli/settings_storage.py", line 1031, in update_from_file
    self.__setattr__(attr_name, loaded_dict[key])
  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/cli/settings_storage.py", line 546, in __setattr__
    get_global_optimizer(value)
  File "/scrfs/storage/btm002/home/nmt/4.gadma/.gadma2/lib/python3.9/site-packages/gadma/optimizers/global_optimizer.py", line 313, in get_global_optimizer
    raise ValueError(f"Optimizer '{id}' is not registered")
ValueError: Optimizer 'SMAC_BO_combination' is not registered

I am using the following GADMA version:

gadma --version
GADMA version 2.0.0rc26 by Ekaterina Noskova

Do you know what the issue could be? I have attached my parameters file here.

Thank you for your time.

-Bradley

nmt_gadma_params_K5.txt

btmartin721 commented 1 year ago

Doh! Never mind. I just realized I had not installed the dependencies listed in requirements/bayes_opt.txt, which are needed for Bayesian Optimization. My mistake.

noscode commented 1 year ago

Dear Bradley,

Yes, you are totally right: first you need to install the dependencies from requirements/bayes_opt.txt. I hope everything works now!

Best regards, Ekaterina
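
A quick way to check whether the optional Bayesian optimization dependencies are importable before starting a run is sketched below. The module names in `assumed` are an assumption about what requirements/bayes_opt.txt installs, not something confirmed in this thread; the requirements file itself is the authoritative list.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; check requirements/bayes_opt.txt for the real list.
    assumed = ["smac", "ConfigSpace", "GPy", "GPyOpt"]
    missing = missing_modules(assumed)
    if missing:
        print("Not importable:", ", ".join(missing))
        print("Try: pip install -r requirements/bayes_opt.txt")
    else:
        print("All assumed Bayesian optimization modules are importable.")
```

If any module is reported as not importable, an "Optimizer ... is not registered" error like the one above is a plausible symptom.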

btmartin721 commented 1 year ago

Yes, it's working now. Thank you for your prompt reply!

I have one additional question about BO. I have the params file set to output models every 100 generations, but after running for 4 days, there still haven't been any models output. Does the Bayesian Optimization algorithm output intermediate models like the genetic algorithm does?

noscode commented 1 year ago

Dear Bradley,

Yes, I would say it should output models the same way as the genetic algorithm does. I am sorry if that is not happening; maybe there is a bug. Just to be sure, can you please tell me where you are checking for those models? Are there files named like "best_logLL_model_XXX_code.py" (where XXX can be e.g. dadi, moments or momi) in the output directory? Are there such files in the directory output_dir/1/?

Ekaterina

btmartin721 commented 1 year ago

Ok thank you. I think it's running, but I suspect it hasn't yet reached 100 iterations, which is the frequency I specified to output models. I am using 5 populations with moments, so it's possible it is just taking a really long time. I might need to consider using momi2 instead of moments.

noscode commented 1 year ago

Five populations is an extreme case indeed. You can check the number of iterations in the log files located at output_dir/N/GADMA_GA.log, where N is the number of the run repeat (1, 2, 3, ...). The output of the Bayesian optimization is written there.

Yes, sure, you can use the momi2 engine. However, it does not support continuous migration, and you will need to specify your model using the momi2 interface, which is quite different from that of moments.

Using either momi2 or moments, you can first infer the demographic history without migration (option No migrations: True), which will be faster than with migration. After that, if you are still interested in migration, there is a way to get it. You can take the resulting model without migration, manually add migration, and fix the other parameters to the values already found. Then infer the migration parameters only. At the end, I would also recommend running a local search over all parameters to be confident in the result.

btmartin721 commented 1 year ago

Hmm. Maybe something is indeed wrong. The output_dir/N/GADMA_GA.log files all just say --Start global optimization SMAC_BO_combination-- and nothing else. It's been running for 96 hours, so I would think some other output would have appeared by now. Is that expected?

noscode commented 1 year ago

How many samples do you have in your SFS? If the SFS is too big, then the optimization will take a lot of time. For our paper about BO, we tested 5x5x20x20x20 and it was rather tough.

All likelihood evaluations are stored in output_dir/N/eval_file. Could you please check whether it contains any lines? If it is empty, then I guess the issue is that the SFS is too big.

Ekaterina
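
The check described above can be sketched as a small script that counts lines in each run's eval_file. This is a minimal sketch, not part of GADMA: the helper name `count_evaluations` is hypothetical, and it assumes each non-empty line corresponds to one likelihood evaluation (a header line, if present, would be counted too).

```python
from pathlib import Path


def count_evaluations(output_dir):
    """Map each numbered run directory to the non-empty line count of its eval_file."""
    counts = {}
    for run_dir in sorted(Path(output_dir).iterdir()):
        eval_file = run_dir / "eval_file"
        # Only look at run repeats named 1, 2, 3, ... that already have an eval_file.
        if run_dir.is_dir() and run_dir.name.isdigit() and eval_file.is_file():
            with eval_file.open() as handle:
                counts[run_dir.name] = sum(1 for line in handle if line.strip())
    return counts
```

Calling `count_evaluations("output_dir")` returns a dictionary such as one entry per run repeat; an empty dictionary or all-zero counts would suggest that no evaluation has finished yet.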

btmartin721 commented 1 year ago

I have considerably more samples than that. It's around 50-60 per population.

The eval_file exists and there are lines in it. However, it appears that each line (=iteration?) is taking around 17 hours, which is a lot of time. In any case, it appears to be working correctly; there are just too many samples for 5 populations.
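
To put these numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes the quoted sizes are the per-population sample sizes of the SFS axes (so each axis has n + 1 entries), takes 50 as the lower end of "around 50-60", and treats one line of eval_file as one of the 100 iterations; none of this is confirmed in the thread. The cell count is only a crude proxy for the cost of one likelihood evaluation.

```python
from math import prod


def sfs_entries(sample_sizes):
    """Number of cells in a joint SFS: each axis holds n + 1 entries (0..n)."""
    return prod(n + 1 for n in sample_sizes)


def projected_days(hours_per_evaluation, evaluations):
    """Wall-clock days needed for a given number of evaluations."""
    return hours_per_evaluation * evaluations / 24


tested = sfs_entries([5, 5, 20, 20, 20])  # 333396 cells
large = sfs_entries([50] * 5)             # 345025251 cells
print(tested, large, round(large / tested))  # ratio is roughly 1035
print(round(projected_days(17, 100), 1))     # 70.8 days to reach 100 evaluations
```

Under these assumptions the SFS here is about a thousand times larger than the one tested in the BO paper, and reaching the first 100-iteration output at 17 hours per evaluation would take more than two months.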

btmartin721 commented 1 year ago

I almost have a momi version of my model working, so I think I will just use momi and also reduce the number of samples.

I appreciate your help!

noscode commented 1 year ago

Yes, sure, it is a lot of samples. Good luck!