lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0

Error when running grid search with RandomForestClassifier #1172

Closed: foster999 closed this issue 11 months ago

foster999 commented 1 year ago

When using the lithops joblib backend, a grid search with RandomForestClassifier causes an error from joblib. The error complains that the system doesn't support forking, but macOS does. A very similar example with either a different classifier or the default "loky" joblib backend runs without issue.

Please see the sklearn issue for full details; I was referred here as this is the likely source of the problem.

Reproducible example:

import joblib
from lithops.util.joblib import register_lithops
from lithops.utils import setup_lithops_logger
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

digits = load_digits()
param_grid = {
    "n_estimators": [100, 50, 25],
}
model = RandomForestClassifier()
search = GridSearchCV(model, param_grid, cv=2, refit=True)

register_lithops()
setup_lithops_logger("INFO")

with joblib.parallel_backend("lithops"):
    search.fit(
        digits.data,
        digits.target,
    )
print("Best score: %0.3f" % search.best_score_)
print("Best parameters set:")
best_parameters = search.best_estimator_.get_params()
print(best_parameters)

Raises this error:


2023-10-16 13:01:43,896 [INFO] config.py:141 -- Lithops v3.0.1 - Python3.11
2023-10-16 13:01:43,896 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-16 13:01:43,896 [INFO] localhost.py:58 -- Localhost compute client created
Traceback (most recent call last):
  File "/engineering/test_lithops.py", line 20, in <module>
    search.fit(
  File "/venv/lib/python3.11/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 898, in fit
    self._run_search(evaluate_candidates)
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 1422, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_search.py", line 875, in evaluate_candidates
    _warn_or_raise_about_fit_failures(out, self.error_score)
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 414, in _warn_or_raise_about_fit_failures
    raise ValueError(all_fits_failed_message)
ValueError: 
All the 6 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
6 fits failed with the following error:
Traceback (most recent call last):
  File "/venv/lib/python3.11/site-packages/sklearn/model_selection/_validation.py", line 729, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/venv/lib/python3.11/site-packages/sklearn/base.py", line 1152, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/ensemble/_forest.py", line 456, in fit
    trees = Parallel(
            ^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/sklearn/utils/parallel.py", line 65, in __call__
    return super().__call__(iterable_with_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/joblib/parallel.py", line 1854, in __call__
    n_jobs = self._initialize_backend()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/joblib/parallel.py", line 1332, in _initialize_backend
    n_jobs = self._backend.configure(n_jobs=self.n_jobs, parallel=self,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.11/site-packages/lithops/util/joblib/lithops_backend.py", line 72, in configure
    raise ImportError(
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
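
The ImportError is raised from configure() in lithops/util/joblib/lithops_backend.py. The message itself points at joblib's usual requirement to protect the script's entry point with an if __name__ == '__main__' guard; a minimal sketch of the reproducer with that guard applied is shown below, purely as an illustration of what the message asks for (the root cause here turned out to be an incompatibility in the lithops backend, as discussed in the following comments):

import joblib
from lithops.util.joblib import register_lithops
from lithops.utils import setup_lithops_logger
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def main():
    digits = load_digits()
    param_grid = {"n_estimators": [100, 50, 25]}
    search = GridSearchCV(RandomForestClassifier(), param_grid, cv=2, refit=True)

    register_lithops()
    setup_lithops_logger("INFO")

    # Run the grid search through the lithops joblib backend
    with joblib.parallel_backend("lithops"):
        search.fit(digits.data, digits.target)

    print("Best score: %0.3f" % search.best_score_)


if __name__ == "__main__":  # the guard the joblib error message asks for
    main()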
JosepSampe commented 1 year ago

Hi @foster999. I found that the lithops joblib backend was no longer compatible with the latest versions of joblib. I submitted a newer version of the backend. Can you test it with the master branch?

foster999 commented 1 year ago

Thanks @JosepSampe, the example above and more complex examples are working well with the master version.

foster999 commented 1 year ago

Logging now seems much quieter and doesn't report on the separate processes. With the code above I get a single line: 2023-10-18 14:45:26,200 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11

Edit: I can see the other two lines in the output above have been moved to DEBUG. But no information is shown about the separate processes when the logger is set to DEBUG either, whereas previously lines were logged for each process to show their state changes.

JosepSampe commented 1 year ago

Great! Thanks for testing it with more complex scenarios.

> Logging now seems much quieter and doesn't report on the separate processes. With the code above I get a single line: 2023-10-18 14:45:26,200 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
>
> Edit: I can see the other two lines in the output above have been moved to DEBUG. But no information is shown about the separate processes when the logger is set to DEBUG either, whereas previously lines were logged for each process to show their state changes.

Can you provide an example of what you expect? Please note that the actual function execution takes place in the cloud, so locally, you can only see Lithops logs. If you wish to view function logs, you should run 'lithops logs poll' in another terminal.
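
For reference, the log verbosity being compared in this exchange is controlled by the same setup_lithops_logger() call used in the reproducer; a minimal sketch of switching it from INFO to DEBUG (the two levels discussed above):

from lithops.utils import setup_lithops_logger

# DEBUG surfaces per-job lines (config loading, invoker, serializer, job manager)
# in addition to the high-level INFO lines shown earlier in this thread.
setup_lithops_logger("DEBUG")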

foster999 commented 1 year ago

I'm currently executing and storing locally, rather than in the cloud. I think I was just surprised that fewer logs were produced. That said, it doesn't look like any of the missing info actually relates to the parallel processes. Before the change, the following logs were produced:

2023-10-18 16:30:06,009 [INFO] config.py:141 -- Lithops v3.0.1 - Python3.11
2023-10-18 16:30:06,009 [DEBUG] config.py:103 -- Loading configuration from .lithops_config
2023-10-18 16:30:06,018 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-18 16:30:06,018 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
2023-10-18 16:30:06,018 [DEBUG] localhost.py:36 -- Creating Localhost storage client
2023-10-18 16:30:06,018 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-18 16:30:06,018 [DEBUG] localhost.py:49 -- Creating Localhost compute client
2023-10-18 16:30:06,018 [INFO] localhost.py:58 -- Localhost compute client created
2023-10-18 16:30:06,018 [DEBUG] invokers.py:93 -- ExecutorID f7157d-0 - Invoker initialized. Max workers: 1
2023-10-18 16:30:06,018 [DEBUG] executors.py:145 -- Function executor for localhost created with ID: f7157d-0

Now, on the master branch version:

2023-10-18 16:31:44,256 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-18 16:31:44,256 [DEBUG] config.py:103 -- Loading configuration from .lithops_config
2023-10-18 16:31:44,264 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-18 16:31:44,264 [DEBUG] config.py:235 -- Loading Storage backend module: localhost

I don't think any further change is required, so I'm happy for you to close this. Thanks for the tip on lithops logs poll.

JosepSampe commented 1 year ago

I found that cpu_count was not properly calculated for the localhost backend. I fixed it in #1176. Can you test again with the master branch?

foster999 commented 1 year ago

The logs haven't changed from the master version above, so I can't see any details about the workers:

2023-10-19 09:07:26,850 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 09:07:26,850 [DEBUG] config.py:103 -- Loading configuration from .lithops_config
2023-10-19 09:07:26,861 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 09:07:26,861 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
JosepSampe commented 1 year ago

Can you try running the command lithops clean -b localhost, and then running the sklearn script again?

foster999 commented 1 year ago

No change in the sklearn script log output after cleaning, I'm afraid. The clean shows:

2023-10-19 10:22:02,171 [INFO] cli.py:85 -- Cleaning all Lithops information
2023-10-19 10:22:02,171 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 10:22:02,184 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-19 10:22:02,184 [INFO] localhost.py:58 -- Localhost compute client created
2023-10-19 10:22:02,184 [INFO] utils.py:83 -- Deleting objects from bucket 'storage' and prefix 'lithops.runtimes/localhost'
2023-10-19 10:22:03,189 [INFO] utils.py:93 -- Finished deleting objects, total found: 2
2023-10-19 10:22:03,189 [INFO] utils.py:83 -- Deleting objects from bucket 'storage' and prefix 'lithops.jobs'
2023-10-19 10:22:04,198 [INFO] utils.py:93 -- Finished deleting objects, total found: 4
2023-10-19 10:22:04,701 [INFO] cli.py:119 -- All Lithops data cleaned
JosepSampe commented 1 year ago

The clean command only deletes unnecessary lithops temporary data. Sometimes it can be good to run it when you see inconsistencies.

Can you place this script in the same directory as the sklearn script, run it, and copy/paste all the output here?

from lithops.multiprocessing import Pool, cpu_count
from lithops.utils import setup_lithops_logger

setup_lithops_logger("DEBUG")

def hello(name):
    return 'Hello {}!'.format(name)

if __name__ == '__main__':
    print(cpu_count())
    with Pool() as pool:
        res = pool.apply(hello, ('World', ))
        print(res)
foster999 commented 1 year ago

I get:

2023-10-19 11:16:43,672 [DEBUG] serialize.py:101 -- Modules to transmit: None
2023-10-19 11:16:43,672 [DEBUG] job.py:268 -- ExecutorID 5585cb-0 | JobID A000 - Uploading function and modules to the storage backend
2023-10-19 11:16:43,672 [DEBUG] job.py:294 -- ExecutorID 5585cb-0 | JobID A000 - Uploading data to the storage backend
2023-10-19 11:16:43,672 [INFO] invokers.py:171 -- ExecutorID 5585cb-0 | JobID A000 - Starting function invocation: cloud_process_wrapper() - Total: 1 activations
2023-10-19 11:16:43,672 [DEBUG] invokers.py:176 -- ExecutorID 5585cb-0 | JobID A000 - Worker processes: 10 - Chunksize: 10
2023-10-19 11:16:43,672 [DEBUG] localhost.py:136 -- ExecutorID 5585cb-0 | JobID A000 - Putting job into localhost queue
2023-10-19 11:16:43,672 [DEBUG] localhost.py:72 -- Staring localhost job manager
2023-10-19 11:16:43,673 [DEBUG] invokers.py:252 -- ExecutorID 5585cb-0 | JobID A000 - Job invoked (0.000s) - Activation ID: 5585cb-0-A000
2023-10-19 11:16:43,673 [DEBUG] localhost.py:386 -- ExecutorID 5585cb-0 | JobID A000 - Running 1 activations in the localhost worker
2023-10-19 11:16:43,673 [INFO] invokers.py:207 -- ExecutorID 5585cb-0 | JobID A000 - View execution logs at /private/var/folders/t4/hdpq5bd51tjfws2mq4kj7gqh0000gn/T/lithops-davidfoster/logs/5585cb-0-A000.log
2023-10-19 11:16:43,673 [DEBUG] monitor.py:381 -- ExecutorID 5585cb-0 - Starting Storage job monitor
2023-10-19 11:16:43,673 [INFO] wait.py:98 -- ExecutorID 5585cb-0 - Getting results from 1 function activations
2023-10-19 11:16:43,912 [DEBUG] localhost.py:85 -- ExecutorID 5585cb-0 | JobID A000 - Execution finished
2023-10-19 11:16:43,912 [DEBUG] localhost.py:90 -- Localhost job manager stopped
2023-10-19 11:16:45,682 [DEBUG] monitor.py:127 -- ExecutorID 5585cb-0 - Pending: 0 - Running: 0 - Done: 1
2023-10-19 11:16:45,682 [DEBUG] monitor.py:409 -- ExecutorID 5585cb-0 - Storage job monitor finished
2023-10-19 11:16:45,684 [DEBUG] future.py:270 -- ExecutorID 5585cb-0 | JobID A000 - Got status from call 00000 - Activation ID: ec9320ee8f74 - Time: 0.03 seconds
2023-10-19 11:16:45,684 [DEBUG] future.py:290 -- ExecutorID 5585cb-0 | JobID A000 - Got output from call 00000 - Activation ID: ec9320ee8f74
2023-10-19 11:16:45,684 [INFO] executors.py:593 -- ExecutorID 5585cb-0 - Cleaning temporary data
2023-10-19 11:16:45,690 [DEBUG] executors.py:503 -- ExecutorID 5585cb-0 - Finished getting results
Hello World!
2023-10-19 11:16:45,691 [DEBUG] pool.py:202 -- terminating pool
JosepSampe commented 1 year ago

I think there are missing logs; you should see all of this:

2023-10-19 12:09:22,862 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.10
2023-10-19 12:09:22,863 [DEBUG] config.py:103 -- Loading configuration from /home/josep/.lithops/config
2023-10-19 12:09:22,929 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 12:09:22,929 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
8
2023-10-19 12:09:22,934 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.10
2023-10-19 12:09:22,934 [DEBUG] config.py:103 -- Loading configuration from /home/josep/.lithops/config
2023-10-19 12:09:22,944 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 12:09:22,944 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
2023-10-19 12:09:22,944 [DEBUG] localhost.py:36 -- Creating Localhost storage client
2023-10-19 12:09:22,944 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-19 12:09:22,944 [DEBUG] localhost.py:49 -- Creating Localhost compute client
2023-10-19 12:09:22,944 [INFO] localhost.py:58 -- Localhost compute client created
2023-10-19 12:09:22,944 [DEBUG] invokers.py:93 -- ExecutorID 003544-0 - Invoker initialized. Max workers: 1
2023-10-19 12:09:22,944 [DEBUG] executors.py:145 -- Function executor for localhost created with ID: 003544-0
2023-10-19 12:09:22,944 [INFO] invokers.py:107 -- ExecutorID 003544-0 | JobID A000 - Selected Runtime: python3.10
2023-10-19 12:09:22,947 [DEBUG] storage.py:431 -- Runtime metadata found in local disk cache
2023-10-19 12:09:22,948 [DEBUG] job.py:234 -- ExecutorID 003544-0 | JobID A000 - Serializing function and data
2023-10-19 12:09:22,949 [DEBUG] module_dependency.py:66 -- Queuing module 'os'
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:110 -- Module 'os' is to be ignored, skipping
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:66 -- Queuing module 'lithops.multiprocessing.util'
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:110 -- Module 'lithops' is to be ignored, skipping
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:66 -- Queuing module 'traceback'
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:110 -- Module 'traceback' is to be ignored, skipping
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:66 -- Queuing module 'lithops.multiprocessing.process'
2023-10-19 12:09:22,950 [DEBUG] module_dependency.py:110 -- Module 'lithops' is to be ignored, skipping
2023-10-19 12:09:22,950 [DEBUG] serialize.py:84 -- Referenced modules: /home/josep/dev-worksapce/personal/lithops/lithops/multiprocessing/util.py, /home/josep/dev-worksapce/personal/lithops/lithops/multiprocessing/process.py, /usr/lib/python3.10/os.py, /usr/lib/python3.10/traceback.py
2023-10-19 12:09:22,950 [DEBUG] serialize.py:101 -- Modules to transmit: None
2023-10-19 12:09:22,951 [DEBUG] job.py:268 -- ExecutorID 003544-0 | JobID A000 - Uploading function and modules to the storage backend
2023-10-19 12:09:22,952 [DEBUG] job.py:294 -- ExecutorID 003544-0 | JobID A000 - Uploading data to the storage backend
2023-10-19 12:09:22,952 [INFO] invokers.py:171 -- ExecutorID 003544-0 | JobID A000 - Starting function invocation: cloud_process_wrapper() - Total: 1 activations
2023-10-19 12:09:22,952 [DEBUG] invokers.py:176 -- ExecutorID 003544-0 | JobID A000 - Worker processes: 8 - Chunksize: 8
2023-10-19 12:09:22,952 [DEBUG] localhost.py:136 -- ExecutorID 003544-0 | JobID A000 - Putting job into localhost queue
2023-10-19 12:09:22,952 [DEBUG] localhost.py:72 -- Staring localhost job manager
2023-10-19 12:09:22,953 [DEBUG] localhost.py:357 -- Starting Default Environment for /usr/bin/python3.10
2023-10-19 12:09:22,953 [DEBUG] localhost.py:361 -- Setting up Default environment
2023-10-19 12:09:23,298 [DEBUG] invokers.py:252 -- ExecutorID 003544-0 | JobID A000 - Job invoked (0.346s) - Activation ID: 003544-0-A000
2023-10-19 12:09:23,298 [INFO] invokers.py:207 -- ExecutorID 003544-0 | JobID A000 - View execution logs at /tmp/lithops-josep/logs/003544-0-A000.log
2023-10-19 12:09:23,299 [DEBUG] localhost.py:383 -- ExecutorID 003544-0 | JobID A000 - Running 1 activations in the localhost worker
2023-10-19 12:09:23,299 [DEBUG] monitor.py:381 -- ExecutorID 003544-0 - Starting Storage job monitor
2023-10-19 12:09:23,301 [INFO] wait.py:98 -- ExecutorID 003544-0 - Getting results from 1 function activations
2023-10-19 12:09:24,169 [DEBUG] localhost.py:85 -- ExecutorID 003544-0 | JobID A000 - Execution finished
2023-10-19 12:09:24,169 [DEBUG] localhost.py:90 -- Localhost job manager stopped
2023-10-19 12:09:25,352 [DEBUG] monitor.py:127 -- ExecutorID 003544-0 - Pending: 0 - Running: 0 - Done: 1
2023-10-19 12:09:25,352 [DEBUG] monitor.py:409 -- ExecutorID 003544-0 - Storage job monitor finished
2023-10-19 12:09:26,312 [DEBUG] future.py:270 -- ExecutorID 003544-0 | JobID A000 - Got status from call 00000 - Activation ID: c648eeeec4ef - Time: 0.09 seconds
2023-10-19 12:09:26,312 [DEBUG] future.py:290 -- ExecutorID 003544-0 | JobID A000 - Got output from call 00000 - Activation ID: c648eeeec4ef
2023-10-19 12:09:26,313 [INFO] executors.py:593 -- ExecutorID 003544-0 - Cleaning temporary data
2023-10-19 12:09:26,315 [DEBUG] executors.py:503 -- ExecutorID 003544-0 - Finished getting results
Hello World!
2023-10-19 12:09:26,315 [DEBUG] pool.py:202 -- terminating pool
foster999 commented 1 year ago

Sorry, you're right, I missed the first part:

2023-10-19 11:24:49,301 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 11:24:49,301 [DEBUG] config.py:103 -- Loading configuration from /Users/davidfoster/repos/data_engineering_model/.lithops_config
2023-10-19 11:24:49,311 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 11:24:49,311 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
1
2023-10-19 11:24:49,312 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 11:24:49,312 [DEBUG] config.py:103 -- Loading configuration from .lithops_config
2023-10-19 11:24:49,313 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 11:24:49,313 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
2023-10-19 11:24:49,313 [DEBUG] localhost.py:36 -- Creating Localhost storage client
2023-10-19 11:24:49,313 [INFO] localhost.py:39 -- Localhost storage client created
2023-10-19 11:24:49,313 [DEBUG] localhost.py:49 -- Creating Localhost compute client
2023-10-19 11:24:49,313 [INFO] localhost.py:58 -- Localhost compute client created
2023-10-19 11:24:49,313 [DEBUG] invokers.py:93 -- ExecutorID dc65f6-0 - Invoker initialized. Max workers: 1
2023-10-19 11:24:49,313 [DEBUG] executors.py:145 -- Function executor for localhost created with ID: dc65f6-0
2023-10-19 11:24:49,313 [INFO] invokers.py:107 -- ExecutorID dc65f6-0 | JobID A000 - Selected Runtime: python 
2023-10-19 11:24:49,313 [DEBUG] storage.py:431 -- Runtime metadata found in local disk cache
2023-10-19 11:24:49,314 [DEBUG] job.py:234 -- ExecutorID dc65f6-0 | JobID A000 - Serializing function and data
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:66 -- Queuing module 'os'
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:110 -- Module 'os' is to be ignored, skipping
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:66 -- Queuing module 'traceback'
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:110 -- Module 'traceback' is to be ignored, skipping
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:66 -- Queuing module 'lithops.multiprocessing.process'
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:110 -- Module 'lithops' is to be ignored, skipping
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:66 -- Queuing module 'lithops.multiprocessing.util'
2023-10-19 11:24:49,315 [DEBUG] module_dependency.py:110 -- Module 'lithops' is to be ignored, skipping
2023-10-19 11:24:49,315 [DEBUG] serialize.py:84 -- Referenced modules: venv/lib/python3.11/site-packages/lithops/multiprocessing/process.py, venv/lib/python3.11/site-packages/lithops/multiprocessing/util.py, .pyenv/versions/3.11.4/lib/python3.11/traceback.py, frozen
2023-10-19 11:24:49,315 [DEBUG] serialize.py:101 -- Modules to transmit: None
2023-10-19 11:24:49,315 [DEBUG] job.py:268 -- ExecutorID dc65f6-0 | JobID A000 - Uploading function and modules to the storage backend
2023-10-19 11:24:49,315 [DEBUG] job.py:294 -- ExecutorID dc65f6-0 | JobID A000 - Uploading data to the storage backend
2023-10-19 11:24:49,315 [INFO] invokers.py:171 -- ExecutorID dc65f6-0 | JobID A000 - Starting function invocation: cloud_process_wrapper() - Total: 1 activations
2023-10-19 11:24:49,315 [DEBUG] invokers.py:176 -- ExecutorID dc65f6-0 | JobID A000 - Worker processes: 10 - Chunksize: 10
2023-10-19 11:24:49,315 [DEBUG] localhost.py:136 -- ExecutorID dc65f6-0 | JobID A000 - Putting job into localhost queue
2023-10-19 11:24:49,315 [DEBUG] localhost.py:72 -- Staring localhost job manager
2023-10-19 11:24:49,315 [DEBUG] localhost.py:360 -- Starting Default Environment for venv/bin/python
2023-10-19 11:24:49,315 [DEBUG] localhost.py:364 -- Setting up Default environment
2023-10-19 11:24:49,411 [DEBUG] invokers.py:252 -- ExecutorID dc65f6-0 | JobID A000 - Job invoked (0.096s) - Activation ID: dc65f6-0-A000
2023-10-19 11:24:49,411 [DEBUG] localhost.py:386 -- ExecutorID dc65f6-0 | JobID A000 - Running 1 activations in the localhost worker
2023-10-19 11:24:49,412 [INFO] invokers.py:207 -- ExecutorID dc65f6-0 | JobID A000 - View execution logs at /private/var/folders/t4/hdpq5bd51tjfws2mq4kj7gqh0000gn/T/lithops-davidfoster/logs/dc65f6-0-A000.log
2023-10-19 11:24:49,412 [DEBUG] monitor.py:381 -- ExecutorID dc65f6-0 - Starting Storage job monitor
2023-10-19 11:24:49,412 [INFO] wait.py:98 -- ExecutorID dc65f6-0 - Getting results from 1 function activations
2023-10-19 11:24:49,689 [DEBUG] localhost.py:85 -- ExecutorID dc65f6-0 | JobID A000 - Execution finished
2023-10-19 11:24:49,690 [DEBUG] localhost.py:90 -- Localhost job manager stopped
2023-10-19 11:24:51,419 [DEBUG] monitor.py:127 -- ExecutorID dc65f6-0 - Pending: 0 - Running: 0 - Done: 1
2023-10-19 11:24:51,419 [DEBUG] monitor.py:409 -- ExecutorID dc65f6-0 - Storage job monitor finished
2023-10-19 11:24:51,421 [DEBUG] future.py:270 -- ExecutorID dc65f6-0 | JobID A000 - Got status from call 00000 - Activation ID: b4e41eb70766 - Time: 0.03 seconds
2023-10-19 11:24:51,421 [DEBUG] future.py:290 -- ExecutorID dc65f6-0 | JobID A000 - Got output from call 00000 - Activation ID: b4e41eb70766
2023-10-19 11:24:51,422 [INFO] executors.py:593 -- ExecutorID dc65f6-0 - Cleaning temporary data
2023-10-19 11:24:51,427 [DEBUG] executors.py:503 -- ExecutorID dc65f6-0 - Finished getting results
Hello World!
2023-10-19 11:24:51,428 [DEBUG] pool.py:202 -- terminating pool
JosepSampe commented 1 year ago

Thanks. As I can see in the logs:

You have 10 worker processes (CPUs):

2023-10-19 11:24:49,315 [DEBUG] invokers.py:176 -- ExecutorID dc65f6-0 | JobID A000 - Worker processes: 10 - Chunksize: 10

while the output of cpu_count() is 1:

2023-10-19 11:24:49,301 [INFO] config.py:141 -- Lithops v3.0.2.dev0 - Python3.11
2023-10-19 11:24:49,301 [DEBUG] config.py:103 -- Loading configuration from /Users/davidfoster/repos/data_engineering_model/.lithops_config
2023-10-19 11:24:49,311 [DEBUG] config.py:176 -- Loading compute backend module: localhost
2023-10-19 11:24:49,311 [DEBUG] config.py:235 -- Loading Storage backend module: localhost
1   <--- cpu_count() output

cpu_count() should be 10, so my feeling is that your master branch installation is not updated to the latest changes.
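
A quick way to double-check which Lithops build the environment is actually picking up (a minimal sketch, not part of the original exchange; it assumes lithops exposes __version__ in the usual way and reuses cpu_count from the test script above):

import lithops
from lithops.multiprocessing import cpu_count

# After a successful install from the master branch this should print a
# 3.0.2.dev version string and the machine's real core count (10 here).
print(lithops.__version__)
print(cpu_count())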

foster999 commented 1 year ago

Thanks for this, I've now uninstalled and reinstalled with pip install git+https://github.com/lithops-cloud/lithops.git@master

cpu_count() is showing 10 now, and running the original example shows much more detailed logs. Apologies, the install from git must not have upgraded last time, as the version number was the same.