A multi-cloud framework for big data analytics and embarrassingly parallel jobs, providing a universal API for building parallel applications in the cloud ☁️🚀
I have found it very useful to use Fil for finding the peak memory use of functions running on Lithops.
The way I got it to work was to change the local version of Lithops to add the fil-profile script:
```diff
diff --git a/lithops/localhost/localhost.py b/lithops/localhost/localhost.py
index 7ea6d97f..2b310358 100644
--- a/lithops/localhost/localhost.py
+++ b/lithops/localhost/localhost.py
@@ -378,7 +378,7 @@ class DefaultEnv(BaseEnv):
         if not os.path.isfile(RUNNER):
             self.setup()
-        cmd = [self.runtime, RUNNER, 'run_job', job_filename]
+        cmd = ["/Users/tom/opt/miniconda3/envs/barry/bin/fil-profile", "python", RUNNER, 'run', job_filename]
         log = open(RN_LOG_FILE, 'a')
         process = sp.Popen(cmd, stdout=log, stderr=log, start_new_session=True)
         self.jobs[job_key] = process
```
Then in my function I call Fil's profile API. Each invocation writes its Fil profile graphs to a local directory, which I keep unique by including a UUID in the path.
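For reference, the per-invocation pattern looks roughly like this (a sketch: `process`, `my_function`, and the output root are placeholders, and `filprofiler.api.profile` only records data when the process was started under `fil-profile`, as in the patched command above):

```python
import os
import uuid

def unique_output_dir(root="/tmp/fil-results"):
    # One UUID-named directory per invocation, so concurrent Lithops
    # workers never overwrite each other's profile graphs.
    return os.path.join(root, str(uuid.uuid4()))

def process(data):
    # Placeholder for the real work being profiled.
    return sum(data)

def my_function(data):
    path = unique_output_dir()
    try:
        # Fil's programmatic API; it needs the process to have been
        # launched via fil-profile to actually produce a report.
        from filprofiler.api import profile
        return profile(lambda: process(data), path=path)
    except ImportError:
        # Fil not installed: fall back to running unprofiled.
        return process(data)
```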
This is obviously a bit of a hack, so I wonder if there is a way to set the runtime executable - or how hard it would be to make that configurable?
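To sketch what that configurability might look like, the command prefix could be read from the config with a fallback to the current behaviour. Note that `runner_cmd` is an invented key here, not an existing Lithops option:

```python
def build_cmd(config, runtime, runner, job_filename):
    # Hypothetical 'runner_cmd' config key (not an existing Lithops
    # option): a list of argv entries that replaces the bare runtime
    # executable, e.g. ['fil-profile', 'python'].
    prefix = config.get('localhost', {}).get('runner_cmd') or [runtime]
    return prefix + [runner, 'run_job', job_filename]

# Default: the same command DefaultEnv builds today.
print(build_cmd({}, 'python3', 'runner.py', 'job.json'))
# Overridden: wraps the runner in fil-profile.
print(build_cmd({'localhost': {'runner_cmd': ['fil-profile', 'python']}},
                'python3', 'runner.py', 'job.json'))
```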