lithops-cloud / lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
http://lithops.cloud
Apache License 2.0

Enable Fil for memory profiling #985

Closed by tomwhite 6 months ago

tomwhite commented 2 years ago

I have found it very useful to use Fil for finding the peak memory use of functions running on Lithops.

The way I got it to work was to patch my local copy of Lithops so that the localhost runner is launched through the fil-profile script instead of the plain runtime:

diff --git a/lithops/localhost/localhost.py b/lithops/localhost/localhost.py
index 7ea6d97f..2b310358 100644
--- a/lithops/localhost/localhost.py
+++ b/lithops/localhost/localhost.py
@@ -378,7 +378,7 @@ class DefaultEnv(BaseEnv):
         if not os.path.isfile(RUNNER):
             self.setup()

-        cmd = [self.runtime, RUNNER, 'run', job_filename]
+        cmd = ["/Users/tom/opt/miniconda3/envs/barry/bin/fil-profile", "python", RUNNER, 'run', job_filename]
         log = open(RN_LOG_FILE, 'a')
         process = sp.Popen(cmd, stdout=log, stderr=log, start_new_session=True)
         self.jobs[job_key] = process

Then in my function I call Fil's profile API. Each function invocation writes the Fil profile graphs to a local directory (I make sure they are unique by using a UUID in my function).
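A minimal sketch of what such a function wrapper might look like. It assumes Fil's `filprofiler.api.profile(callable, output_path)` entry point and that the worker process was started with `fil-profile python ...`, as in the diff above. The helper names `unique_profile_dir` and `run_profiled` and the `fil-results` directory are hypothetical, chosen only for illustration.

```python
import os
import uuid


def unique_profile_dir(base_dir):
    """Return a fresh output directory path so parallel invocations never collide."""
    return os.path.join(base_dir, uuid.uuid4().hex)


def run_profiled(func, *args, base_dir="fil-results", profiler=None):
    """Run func(*args) under a profiler that takes (callable, output_path).

    If no profiler is passed in, fall back to Fil's API. That import only
    works when Fil is installed, and profiling only works when the process
    was launched via `fil-profile python ...` (assumption, see lead-in).
    """
    if profiler is None:
        from filprofiler.api import profile as profiler
    out_dir = unique_profile_dir(base_dir)
    # Fil expects a zero-argument callable, so bind the arguments here.
    return profiler(lambda: func(*args), out_dir)
```

Passing the profiler in as an argument keeps the function usable (and testable) on machines where Fil is not installed, for example by substituting a stub that just calls the code.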

This is obviously a bit of a hack, so I wonder if there is a way to set the runtime executable - or how hard it would be to make that configurable?
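To make the question concrete, here is a rough sketch of how a configurable launcher could be turned into the command list. This is not an existing Lithops option: the `launcher` parameter and the `build_runner_cmd` helper are hypothetical, and the subcommand is passed in rather than hard-coded.

```python
import shlex


def build_runner_cmd(runtime, runner, subcommand, job_filename, launcher=None):
    """Build the argv list used to start the localhost runner.

    launcher is a hypothetical user-supplied string such as
    "fil-profile python". When set, it replaces the default runtime
    executable; otherwise the runtime is used unchanged.
    """
    prefix = shlex.split(launcher) if launcher else [runtime]
    return prefix + [runner, subcommand, job_filename]
```

With `launcher="/path/to/fil-profile python"` this yields the same shape of command as the hand-edited line in the diff, while leaving the default behaviour untouched when no launcher is set.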

tomwhite commented 6 months ago

Closing this as I've been using Memray successfully with Lithops.