You can use the following code from a notebook to install the packages on a cluster:
import subprocess
def run_cmd(args_list):
    """Run a system command and return (stdout, stderr); raise on a nonzero exit code."""
    print('Running system command: {0}'.format(" ".join(args_list)))
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (output, errors) = proc.communicate()
    if proc.returncode:
        raise RuntimeError('Error running command: %s. Return code: %d, Error: %s' % (
            ' '.join(args_list), proc.returncode, errors))
    return (output, errors)

output, errors = run_cmd(['pip3', 'install', 'plotly'])
print("install plotly python module")
print(output)
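If you want to confirm the package landed, you can reuse the same helper - a minimal sketch, assuming pip3 is on the PATH where the cell runs:

# Hypothetical follow-up: print the installed plotly package details
output, errors = run_cmd(['pip3', 'show', 'plotly'])
print(output)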
So this is the same issue as with matplotlib: these libraries only work on local data frames. The IPython display hooks in the local Python kernel don't fire when code is routed through the Livy job scheduler, which can only return specific kinds of data.
To work around this, you can copy the Spark data frame to a local data frame (in this case, 'df'):
%%spark -o df
df = # copy data frame here
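For instance, the export cell might look like this - a sketch only; the table name and query are hypothetical:

%%spark -o df
# Runs on the cluster; sparkmagic's -o flag copies the resulting Spark
# DataFrame 'df' back to the local kernel as a pandas DataFrame
df = spark.sql("SELECT total_bill, time, smoker FROM tips")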
Then run local graphing:
%%local
import plotly
import plotly.express as px
tips = px.data.tips()
fig = px.strip(tips, x = "total_bill", y = "time", orientation="h", color = "smoker")
fig.show()
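Note that the cell above plots plotly's bundled tips sample data. To plot the data you copied from the cluster instead, point plotly at df - a sketch, assuming df has the same columns:

%%local
import plotly.express as px
# df is the local pandas copy produced by the %%spark -o df cell above
fig = px.strip(df, x="total_bill", y="time", orientation="h", color="smoker")
fig.show()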
I'm going to close this as an upstream issue - there's little to nothing we can do from the code side to make this work, unfortunately. It's something we should consider documenting, though - @ronychatterjee and @yualan, could you look into this?
Steps to Reproduce:
import plotly
import plotly.express as px

tips = px.data.tips()
fig = px.strip(tips, x = "total_bill", y = "time", orientation="h", color = "smoker")
fig.show()