engineer-man / piston

A high performance general purpose code execution engine.
https://emkc.org/run
MIT License

Bad Performance using Python #664

Open raeudigerRaeffi opened 6 months ago

raeudigerRaeffi commented 6 months ago

Hi, we are using a self-hosted version of Piston and have run into some major runtime limitations with Python. Given that Piston advertises itself as efficient and fast, I assume the issue is with us and not with the software. Our setup is the following: we use the Piston Docker image with the CLI to install Python. Then we run sudo /piston/packages/python/3.12.0//bin/pip3 install statsmodels plotly plotly-express scikit-learn in order to install custom libraries. The following environment variables are set:

Using this setup, the code displayed below takes around 20 seconds to execute for 50 data points in os.environ["data"] (on my machine it takes less than a second).

import os
import json
import pandas as pd
import plotly
import numpy as np
import plotly.express as px
data = json.loads(os.environ["data"])
df = pd.DataFrame(data)
df['order_date'] = pd.to_datetime(df['order_date'], format='%d/%m/%Y %H:%M')
fig = px.scatter(df, x='order_date', y='sales', trendline='ols')
graph_json = plotly.io.to_json(fig)
print({"type":"plot","variable":graph_json})

HexF commented 6 months ago

Are you running Piston on the same system as your local test? Different hardware could be one factor in the slow performance, though it shouldn't have too large an impact.

I'm thinking this might have to do with Python not caching .pyc files for these libraries. This is by design, to ensure complete isolation of code with no persistent files across runs.
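
One way to test this hypothesis is to check, from inside the sandbox, whether cached bytecode exists for the installed libraries. The following is a minimal sketch using only the standard library. `bytecode_cache_status` is a hypothetical helper name, and the writability check is only an approximation, since it assumes the default `__pycache__` location next to the source file.

```python
import importlib.util
import os
import sys


def bytecode_cache_status(module_name):
    """Describe the .pyc cache state for a module, or return None if not applicable.

    Hypothetical helper for illustration; not part of Piston.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised e.g. when the parent package of a submodule is missing.
        return None
    # Built-in, frozen, and extension modules have no .py source to cache.
    if spec is None or not spec.origin or not spec.origin.endswith(".py"):
        return None
    pyc_path = importlib.util.cache_from_source(spec.origin)
    return {
        "source": spec.origin,
        "pyc": pyc_path,
        "pyc_exists": os.path.exists(pyc_path),
        # Assumption: __pycache__ lives next to the source file.
        "dir_writable": os.access(os.path.dirname(spec.origin), os.W_OK),
        "dont_write_bytecode": sys.dont_write_bytecode,
    }


if __name__ == "__main__":
    for name in ["pandas", "numpy", "plotly", "statsmodels"]:
        print(name, bytecode_cache_status(name))
```

If `pyc_exists` is false for the heavy libraries and either `dont_write_bytecode` is true or the directory is not writable, Python has to recompile those sources on every run, which would be consistent with the explanation above.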

I would try to see which lines of code are causing the performance bottleneck. My bet would be on one of the imports.
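
To narrow this down, the imports from the script can be timed one by one. This is a rough sketch, not a proper profiler: `time_imports` is a hypothetical helper, and each measurement only covers the first import of a module in the process, so a module already pulled in by an earlier one will appear nearly free.

```python
import importlib
import time


def time_imports(module_names):
    """Import each module in order and return {name: seconds or None}.

    None means the module could not be imported in this environment.
    """
    timings = {}
    for name in module_names:
        start = time.perf_counter()
        try:
            importlib.import_module(name)
        except ImportError:
            timings[name] = None
            continue
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Same modules as the script in the original report.
    modules = ["os", "json", "pandas", "plotly", "numpy", "plotly.express"]
    for name, seconds in time_imports(modules).items():
        if seconds is None:
            print(f"{name}: not importable")
        else:
            print(f"{name}: {seconds:.3f}s")
```

Running the original script with `python -X importtime` gives a more detailed per-module breakdown on stderr, if the interpreter can be invoked with that option in your setup.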