optimas-org / optimas

Optimization at scale, powered by libEnsemble
https://optimas.readthedocs.io

optimas_Error #166

Closed mehdiabedi1234 closed 4 months ago

mehdiabedi1234 commented 7 months ago

Dear Sir,

Greetings. I recently ran an example Optimas file, run_example.py, from the website and got the error below. How can I fix it?

(base) C:\Mehdi\D\optimas>python run_example.py
sethostname: Use the Network Control Panel Applet to set hostname.
hostname -s is not supported.
[INFO 01-22 21:56:32] optimas.generators.base: Generated trial 0 with parameters {'laser_scale': 0.766370104253292, 'z_foc': 3.1182957272976637, 'mult': 1.1702657341957092, 'plasma_scale': 0.7395065188407899}
[INFO 01-22 21:56:32] optimas.generators.base: Generated trial 1 with parameters {'laser_scale': 0.9062855280935764, 'z_foc': 3.4193605296313763, 'mult': 1.1150514364242554, 'plasma_scale': 0.7221612745895982}
[INFO 01-22 21:56:32] optimas.generators.base: Generated trial 2 with parameters {'laser_scale': 0.8916185235604643, 'z_foc': 7.492509440984577, 'mult': 0.9415011694654821, 'plasma_scale': 0.7891588853672147}
[INFO 01-22 21:56:32] optimas.generators.base: Generated trial 3 with parameters {'laser_scale': 0.9056016370654106, 'z_foc': 3.220312344375998, 'mult': 0.9464270217344164, 'plasma_scale': 0.6241758342832326}
[INFO 01-22 21:56:32] optimas.generators.base: Generated trial 4 with parameters {'laser_scale': 0.9636959314811975, 'z_foc': 7.169589727185667, 'mult': 1.1150858242064714, 'plasma_scale': 0.6309504834935069}
[INFO 01-22 21:56:33] optimas.generators.base: Generated trial 5 with parameters {'laser_scale': 0.7418236290104687, 'z_foc': 6.4047843888401985, 'mult': 0.6026117544621229, 'plasma_scale': 0.6573561508208513}
[0] 2024-01-22 21:56:33,021 libensemble.manager (ERROR): Traceback (most recent call last):
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 408, in _convert_to_rsets
    num_rsets_req = num_units // units_per_rset + (num_units % units_per_rset > 0)
ZeroDivisionError: integer division or modulo by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\manager.py", line 649, in run
    Work, persis_info, flag = self._alloc_work(self.hist.trim_H(), persis_info)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\manager.py", line 614, in _alloc_work
    output = alloc_f(
             ^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\alloc_funcs\start_only_persistent.py", line 100, in only_persistent_gens
    Work[wid] = support.sim_work(wid, H, sim_specs["in"], sim_ids_to_send, persis_info.get(wid))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 238, in sim_work
    self._update_rset_team(libE_info, wid, H=H, H_rows=H_rows)
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 212, in _update_rset_team
    num_rsets_req, use_gpus = self._req_resources_sim(libE_info, user_params, H, H_rows)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 168, in _req_resources_sim
    num_rsets_req_for_gpus = AllocSupport._convert_rows_to_rsets(
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 399, in _convert_rows_to_rsets
    num_rsets_req = AllocSupport._convert_to_rsets(libE_info, user_params, units_per_rset, max_num_units, units_str)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\mabed\miniconda3\Lib\site-packages\libensemble\tools\alloc_support.py", line 410, in _convert_to_rsets
    raise InsufficientResourcesError(
libensemble.resources.scheduler.InsufficientResourcesError: There are zero num_gpus per resource set (worker). Use fewer workers or more resources

AngelFP commented 7 months ago

Hi @mehdiabedi1234, it looks like your example script is requesting more GPUs than are available on your computer. What is the value of n_gpus in the TemplateEvaluator, and what is the number of sim_workers?

mehdiabedi1234 commented 7 months ago

Thank you for your quick response. I did not change the Python code; I just ran the run_example.py file. It had n_gpus=1 and sim_workers=4. How long does this example take to run on a computer with 32 GB of RAM and a Core i7? Sincerely yours,

AngelFP commented 7 months ago

So, n_gpus=1 means that each simulation will request one GPU. Since you have 4 simulation workers, optimas will run 4 simulations in parallel. This means that you need a computer with 4 GPUs to be able to run this example. You can of course change that. How many GPUs does your system have?

Also, I'm not sure which example this is, since we have several that use GPUs. Can you point me to it?
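
The resource arithmetic described in this comment can be sketched in plain Python. This is an illustrative model, not libEnsemble's actual scheduler: the function names `gpus_required` and `resource_sets_for_sim` are hypothetical, and the even split of GPUs across workers is an assumption. Only the ceiling-division expression is taken from the traceback above.

```python
def gpus_required(n_gpus: int, sim_workers: int) -> int:
    """Total GPUs needed when each worker runs one simulation at a time."""
    return n_gpus * sim_workers


def resource_sets_for_sim(n_gpus: int, available_gpus: int, sim_workers: int) -> int:
    """Number of per-worker resource sets one simulation needs to obtain n_gpus GPUs.

    Assumption of this sketch: available GPUs are split evenly across workers.
    """
    if n_gpus == 0:
        # A CPU-only simulation places no GPU-driven demand on the resources.
        return 0
    gpus_per_rset = available_gpus // sim_workers
    if gpus_per_rset == 0:
        # Mirrors the failure in the traceback: dividing by zero GPUs per worker.
        raise RuntimeError(
            "Zero GPUs per resource set (worker). Use fewer workers or more resources."
        )
    # Ceiling division, written as in the traceback's _convert_to_rsets line.
    return n_gpus // gpus_per_rset + (n_gpus % gpus_per_rset > 0)
```

With n_gpus=1 and sim_workers=4, `gpus_required` returns 4, which matches the statement that the unmodified example needs a machine with 4 GPUs. On a machine with no usable GPU, `resource_sets_for_sim(1, 0, 4)` raises instead of returning a count.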

mehdiabedi1234 commented 7 months ago

I used the Optimas example "Optimization with FBPIC", which consists of the following files:

├── run_example.py
├── template_simulation_script.py
└── analysis_script.py

I realized that my laptop cannot use a GPU for this, because my graphics card is Intel® Iris® Xe Graphics and, from what I found, CUDA does not support it. Do you have a solution for this issue? Sincerely yours, Mehdi

AngelFP commented 7 months ago

Ok, thanks. As you say, this example runs FBPIC simulations, which typically use a GPU (NVIDIA only). Since you don't have one, you should set n_gpus=0, which should force FBPIC to run on the CPU. However, this will be much slower than on a GPU.

This example is better suited to run on an HPC cluster, because FBPIC simulations are typically too expensive to run on a normal PC.
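
One way to avoid hard-coding the GPU setting is to pick it from what the machine reports. The following is a minimal sketch using only the standard library, assuming the `nvidia-smi` tool is on the PATH whenever an NVIDIA GPU is usable. The helper names `count_nvidia_gpus` and `choose_n_gpus` are hypothetical and are not part of optimas or FBPIC.

```python
import shutil
import subprocess
from typing import Optional


def count_nvidia_gpus() -> int:
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return 0
    # Each device is printed on its own line starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def choose_n_gpus(requested: int, sim_workers: int,
                  available: Optional[int] = None) -> int:
    """Return `requested` if all workers can get that many GPUs at once, else 0 (CPU)."""
    if available is None:
        available = count_nvidia_gpus()
    return requested if requested * sim_workers <= available else 0
```

The returned value would then be used as the n_gpus setting of the TemplateEvaluator in run_example.py. On a laptop with only integrated Intel graphics, `choose_n_gpus(1, 4)` falls back to 0.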

mehdiabedi1234 commented 7 months ago

Thank you for your guidance. Best regards, Mehdi

mehdiabedi1234 commented 6 months ago

Dear Sir or Madam,

Greetings. I used the Optimas example "Multitask optimization with FBPIC and Wake-T" from the link below: https://optimas.readthedocs.io/en/latest/examples/bo_multitask_fbpic_waket.html#multitask-optimization-with-fbpic-and-wake-t

I got these errors when I ran run_opt.py as follows,

Traceback (most recent call last):
  File "C:\Mi\E\wake-T\Bayesian\only_wake_t\run_opt.py", line 79, in <module>
    exp.run()
  File "C:\Users\mabed\anaconda3\envs\mi_env\Lib\site-packages\optimas\explorations\base.py", line 189, in run
    history, persis_info, flag = libE(
                                 ^^^^^
  File "C:\Users\mabed\anaconda3\envs\mi_env\Lib\site-packages\libensemble\libE.py", line 221, in libE
    ensemble = _EnsembleSpecs(
               ^^^^^^^^^^^^^^^
  File "pydantic\main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for _EnsembleSpecs
libE_specs -> sim_dir_copy_files
  'C:\Mi\E\wake-T\Bayesian\only_wake_t\custom_ptcl_diags.py' in Value does not refer to an existing path. (type=assertion_error)

How can I solve these errors? Thank you for your help. Sincerely yours,

AngelFP commented 6 months ago

It looks like you are missing some files from the example (in particular the custom_ptcl_diags.py file). Make sure you download the 6 files listed here: https://optimas.readthedocs.io/en/latest/examples/bo_multitask_fbpic_waket.html#scripts
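
A quick check before launching the run can catch a missing script earlier than the validation error does. The sketch below is only illustrative: `find_missing_files` is a hypothetical helper, and the caller supplies the file names. Only run_opt.py and custom_ptcl_diags.py are named in this thread; the remaining names should be taken from the documentation page linked above.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(directory: str, required_names: Iterable[str]) -> List[str]:
    """Return the names in `required_names` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in required_names if not (base / name).is_file()]
```

For example, `find_missing_files(".", ["run_opt.py", "custom_ptcl_diags.py"])` returns an empty list when both files sit next to the script, and otherwise lists the ones that still need to be downloaded.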

mehdiabedi1234 commented 6 months ago

Thank you sir for your useful comments. Best regards,