lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0
317 stars, 105 forks

Lithops service not installed on VM instance #1041

Closed AaronSoria closed 1 year ago

AaronSoria commented 1 year ago

I'm trying to run the command lithops test -b aws_ec2 -s aws_s3 and I get the following error:

2023-02-07 08:18:23,305 [INFO] config.py:132 -- Lithops v2.7.1 C:\Users\Allen.lithops\config
2023-02-07 08:18:24,308 [INFO] aws_s3.py:59 -- S3 client created - Region: sa-east-1
2023-02-07 08:18:24,393 [INFO] aws_ec2.py:76 -- AWS EC2 client created - Region: us-east-1
2023-02-07 08:18:26,167 [INFO] invokers.py:108 -- ExecutorID 3824ff-0 | JobID A000 - Selected Runtime: python.exe
2023-02-07 08:18:26,867 [INFO] invokers.py:116 -- Runtime python.exe is not yet deployed
2023-02-07 08:18:29,525 [INFO] standalone.py:425 -- Installing Lithops in VM instance lithops-master-cf8a (3.91.187.123)
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\Scripts\lithops.exe\__main__.py", line 7, in <module>
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\scripts\cli.py", line 173, in test_function
    fexec.call_async(hello, username)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\executors.py", line 209, in call_async
    runtime_meta = self.invoker.select_runtime(job_id, runtime_memory)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\invokers.py", line 117, in select_runtime
    runtime_meta = self.compute_handler.deploy_runtime(self.runtime_name, runtime_memory, runtime_timeout)
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\standalone\standalone.py", line 340, in deploy_runtime
    self._wait_master_service_ready()
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\standalone\standalone.py", line 136, in _wait_master_service_ready
    self._validate_master_service_setup()
  File "Z:\Personal_Projects\pruebas_master\pre_proc_open_cv\env\lib\site-packages\lithops\standalone\standalone.py", line 107, in _validate_master_service_setup
    raise LithopsValidationError(
lithops.standalone.standalone.LithopsValidationError: Lithops service not installed on VM instance lithops-master-cf8a (3.91.187.123), consider using 'lithops clean' to delete runtime metadata or 'lithops clean --all' to delete master instance as well

JosepSampe commented 1 year ago

@AaronSoria I guess you are using an Ubuntu image, right? Could you try using python as a runtime name instead of python.exe ?
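For context, the log above suggests the runtime name was taken from the local interpreter's file name (python.exe on the Windows client), which does not exist on an Ubuntu VM. A minimal sketch of stripping the suffix before creating the executor follows. normalize_runtime_name and make_executor are hypothetical helpers, not part of Lithops, and the sketch assumes that passing runtime= to FunctionExecutor has the same effect as setting the runtime name in the config file.

```python
import ntpath


def normalize_runtime_name(executable: str) -> str:
    """Return a runtime name usable on a Linux VM from a local interpreter path.

    Hypothetical helper (not part of Lithops): keeps only the file name and
    drops a trailing '.exe', so '...\\Python310\\python.exe' becomes 'python'.
    """
    # ntpath splits on both '\\' and '/', so Windows and POSIX paths both work.
    name = ntpath.basename(executable)
    if name.lower().endswith(".exe"):
        name = name[: -len(".exe")]
    return name


def make_executor(executable: str):
    """Create an EC2 executor that uses the normalized runtime name.

    Assumption: FunctionExecutor accepts the backend, storage and runtime
    keyword arguments. Lithops is imported here so that the helper above
    can be used without Lithops installed.
    """
    import lithops

    return lithops.FunctionExecutor(
        backend="aws_ec2",
        storage="aws_s3",
        runtime=normalize_runtime_name(executable),
    )
```

Calling make_executor(sys.executable) on the client would then request the runtime python rather than python.exe; whether that interpreter name exists on the VM image still has to be checked.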

AaronSoria commented 1 year ago

Yes! Now I have another exception:

2023-02-07 13:50:30,845 [INFO] config.py:134 -- Lithops v2.8.0
2023-02-07 13:50:31,085 [INFO] aws_s3.py:60 -- S3 client created - Region: sa-east-1
2023-02-07 13:50:31,151 [INFO] aws_ec2.py:77 -- AWS EC2 client created - Region: us-east-1
2023-02-07 13:50:32,833 [INFO] invokers.py:108 -- ExecutorID e89452-0 | JobID A000 - Selected Runtime: python
2023-02-07 13:50:33,502 [INFO] invokers.py:116 -- Runtime python is not yet deployed
2023-02-07 13:50:35,680 [INFO] standalone.py:425 -- Installing Lithops in VM instance lithops-master-cf8a (54.160.103.214)
2023-02-07 13:50:43,805 [INFO] standalone.py:138 -- Waiting Lithops service to become ready on VM instance lithops-master-cf8a (54.160.103.214)
2023-02-07 13:50:46,232 [INFO] invokers.py:172 -- ExecutorID e89452-0 | JobID A000 - Starting function invocation: hello() - Total: 1 activations
Traceback (most recent call last):
  File "/usr/local/bin/lithops", line 8, in <module>
    sys.exit(lithops_cli())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lithops/scripts/cli.py", line 176, in test_function
    fexec.call_async(hello, username)
  File "/usr/local/lib/python3.10/site-packages/lithops/executors.py", line 224, in call_async
    futures = self.invoker.run_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 266, in run_job
    futures = self._run_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 205, in _run_job
    raise e
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 202, in _run_job
    self._invoke_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 249, in _invoke_job
    activation_id = self.compute_handler.invoke(payload)
  File "/usr/local/lib/python3.10/site-packages/lithops/standalone/standalone.py", line 288, in invoke
    raise Exception('It was not possible to create any worker')
Exception: It was not possible to create any worker

JosepSampe commented 1 year ago

Can you access the VM with lithops attach -b aws_ec2 --start and send the log from /tmp/lithops/service.log?

Also run the command with the debug flag: lithops test -b aws_ec2 -s aws_s3 -d
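If it is easier to reproduce from a script, the debug run can be approximated in Python. This is only a sketch: hello here is a stand-in for the function that lithops test invokes (the traceback shows it being passed to call_async), run_debug_test is a hypothetical name, and the log_level keyword is assumed to give output comparable to the -d flag.

```python
def hello(name: str) -> str:
    """Stand-in for the function that `lithops test` runs remotely."""
    return f"Hello {name}!"


def run_debug_test(name: str = "World") -> str:
    """Run one invocation on aws_ec2 with debug logging and return its result.

    Assumptions: Lithops is installed and configured for aws_ec2 and aws_s3,
    and FunctionExecutor accepts the backend, storage and log_level keywords.
    The import is local so that hello() can be tried without Lithops.
    """
    import lithops

    fexec = lithops.FunctionExecutor(
        backend="aws_ec2",
        storage="aws_s3",
        log_level="DEBUG",
    )
    # Mirrors the call shown in the traceback: fexec.call_async(hello, username)
    fexec.call_async(hello, name)
    return fexec.get_result()
```

Running run_debug_test() needs valid AWS credentials and will start VM instances, so it is meant for the same environment where the CLI command is failing.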

AaronSoria commented 1 year ago

There is no log in /tmp/lithops/service.log, but I got this from lithops test -b aws_ec2 -s aws_s3 -d:

2023-02-07 14:17:50,129 [INFO] config.py:134 -- Lithops v2.8.0
2023-02-07 14:17:50,129 [DEBUG] config.py:93 -- Loading configuration from /home/parallel/.lithops_config
2023-02-07 14:17:50,151 [DEBUG] config.py:211 -- Loading Standalone backend module: aws_ec2
2023-02-07 14:17:50,298 [DEBUG] config.py:251 -- Loading Storage backend module: aws_s3
2023-02-07 14:17:50,312 [DEBUG] aws_s3.py:33 -- Creating S3 client
2023-02-07 14:17:50,479 [INFO] aws_s3.py:60 -- S3 client created - Region: sa-east-1
2023-02-07 14:17:50,479 [DEBUG] aws_ec2.py:51 -- Creating AWS EC2 client
2023-02-07 14:17:50,599 [INFO] aws_ec2.py:77 -- AWS EC2 client created - Region: us-east-1
2023-02-07 14:17:50,599 [DEBUG] standalone.py:58 -- Standalone handler created successfully
2023-02-07 14:17:50,600 [DEBUG] invokers.py:94 -- ExecutorID bd83bb-0 - Invoker initialized. Max workers: 100
2023-02-07 14:17:50,601 [DEBUG] aws_ec2.py:89 -- Initializing AWS EC2 backend (reuse mode)
2023-02-07 14:17:51,943 [DEBUG] aws_ec2.py:149 -- Requesting current spot price for worker VMs of type t2.medium
2023-02-07 14:17:52,242 [DEBUG] aws_ec2.py:158 -- Current spot instance price for t2.medium is $0.015800
2023-02-07 14:17:52,242 [DEBUG] executors.py:164 -- Function executor for aws_ec2 created with ID: bd83bb-0
2023-02-07 14:17:52,242 [INFO] invokers.py:108 -- ExecutorID bd83bb-0 | JobID A000 - Selected Runtime: python
2023-02-07 14:17:52,242 [DEBUG] storage.py:415 -- Runtime metadata found in local disk cache
2023-02-07 14:17:52,243 [DEBUG] job.py:234 -- ExecutorID bd83bb-0 | JobID A000 - Serializing function and data
2023-02-07 14:17:52,244 [DEBUG] module_dependency.py:66 -- Queuing module 'lithops.scripts.cli'
2023-02-07 14:17:52,244 [DEBUG] module_dependency.py:110 -- Module 'lithops' is to be ignored, skipping
2023-02-07 14:17:52,245 [DEBUG] serialize.py:81 -- Referenced modules: /usr/local/lib/python3.10/site-packages/lithops/scripts/cli.py
2023-02-07 14:17:52,245 [DEBUG] serialize.py:98 -- Modules to transmit: None
2023-02-07 14:17:52,245 [DEBUG] job.py:268 -- ExecutorID bd83bb-0 | JobID A000 - Uploading function and modules to the storage backend
2023-02-07 14:17:52,977 [DEBUG] aws_s3.py:81 -- PUT Object lithops.jobs/bd83bb-0/377c69cfe3dd94f5f04be3a9551f7a35.func.pickle - Size: 693.0B - OK
2023-02-07 14:17:52,977 [DEBUG] job.py:294 -- ExecutorID bd83bb-0 | JobID A000 - Uploading data to the storage backend
2023-02-07 14:17:53,124 [DEBUG] aws_s3.py:81 -- PUT Object lithops.jobs/bd83bb-0-A000/aggdata.pickle - Size: 29.0B - OK
2023-02-07 14:17:53,125 [INFO] invokers.py:172 -- ExecutorID bd83bb-0 | JobID A000 - Starting function invocation: hello() - Total: 1 activations
2023-02-07 14:17:53,125 [DEBUG] invokers.py:177 -- ExecutorID bd83bb-0 | JobID A000 - Worker processes: 2 - Chunksize: 2
2023-02-07 14:17:54,348 [DEBUG] ssh_client.py:40 -- 52.86.26.148 ssh client created
2023-02-07 14:17:55,363 [DEBUG] standalone.py:278 -- Found 0 free workers connected to master VM instance lithops-master-cf8a (52.86.26.148)
2023-02-07 14:17:55,363 [DEBUG] standalone.py:283 -- Going to create 1 new workers
2023-02-07 14:17:55,982 [INFO] aws_ec2.py:559 -- Starting VM instance lithops-master-cf8a
2023-02-07 14:17:55,993 [DEBUG] aws_ec2.py:410 -- Creating new VM instance lithops-worker-bd83bb-0-a000-0000 (Spot)
2023-02-07 14:17:56,509 [DEBUG] aws_ec2.py:570 -- VM instance lithops-master-cf8a started successfully
2023-02-07 14:17:57,577 [DEBUG] standalone.py:261 -- Total worker VM instances created: 0/1
Traceback (most recent call last):
  File "/usr/local/bin/lithops", line 8, in <module>
    sys.exit(lithops_cli())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/lithops/scripts/cli.py", line 176, in test_function
    fexec.call_async(hello, username)
  File "/usr/local/lib/python3.10/site-packages/lithops/executors.py", line 224, in call_async
    futures = self.invoker.run_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 266, in run_job
    futures = self._run_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 205, in _run_job
    raise e
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 202, in _run_job
    self._invoke_job(job)
  File "/usr/local/lib/python3.10/site-packages/lithops/invokers.py", line 249, in _invoke_job
    activation_id = self.compute_handler.invoke(payload)
  File "/usr/local/lib/python3.10/site-packages/lithops/standalone/standalone.py", line 288, in invoke
    raise Exception('It was not possible to create any worker')
Exception: It was not possible to create any worker

JosepSampe commented 1 year ago

Run lithops clean -b aws_ec2 -s aws_s3, then lithops test -b aws_ec2 -s aws_s3 -d, and then check the logs in the VM (/tmp/lithops/service.log) before it stops.

AaronSoria commented 1 year ago

This is the log: service.log

JosepSampe commented 1 year ago

Everything is installed successfully in the master VM. The problem is the worker.

I think AWS currently has no Spot instance capacity available (Lithops requests Spot instances by default). Can you set this in your config file:

aws_ec2:
    request_spot_instances: false
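
The same setting can presumably be applied from Python by passing a config dictionary instead of editing the file. A minimal sketch, assuming FunctionExecutor accepts a config dict with the same sections as the YAML file; disable_spot_instances and make_on_demand_executor are hypothetical helpers, and the other keys aws_ec2 requires are left out here:

```python
import copy


def disable_spot_instances(config: dict) -> dict:
    """Return a copy of a Lithops config dict with Spot requests turned off.

    Hypothetical helper: the input dict is not modified.
    """
    updated = copy.deepcopy(config)
    # An empty YAML section loads as None, so fall back to a new dict.
    section = updated.get("aws_ec2") or {}
    section["request_spot_instances"] = False
    updated["aws_ec2"] = section
    return updated


# Illustrative base config; credentials and other required aws_ec2 keys
# are omitted and must come from your own configuration.
base_config = {
    "lithops": {"backend": "aws_ec2", "storage": "aws_s3"},
    "aws_ec2": {},
}

on_demand_config = disable_spot_instances(base_config)


def make_on_demand_executor(config: dict = on_demand_config):
    """Create an executor from the dict (assumes the config= keyword)."""
    import lithops  # imported here so the helper works without Lithops

    return lithops.FunctionExecutor(config=config)
```

With this, worker VMs should be requested as on-demand instances, which avoids failures caused by missing Spot capacity at the cost of a higher hourly price.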

AaronSoria commented 1 year ago

Great!! this is the solution