jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Communicate non-dataframe variables from livy to local #418

Open · sijunhe opened this issue 7 years ago

sijunhe commented 7 years ago

I am aware of the feature that communicates the contents of a Spark DataFrame from Livy to local as a pandas DataFrame (#333). What about communicating other basic data structures, like scalars, lists, etc.?
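For reference, the #333 flow being described looks roughly like the sketch below in a sparkmagic PySpark kernel; the table name is an assumption and a configured Livy session is assumed.

```python
%%spark -o top_rows
# Remote cell: build a Spark DataFrame; -o also materializes it
# locally as a pandas DataFrame with the same name.
top_rows = spark.sql("SELECT * FROM some_table LIMIT 100")
```

```python
%%local
# Local cell: top_rows is now an ordinary pandas DataFrame.
top_rows.head()
```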

aggFTW commented 7 years ago

That's not supported yet, and it's not on the roadmap. Can you please tell us what you are trying to achieve?

sijunhe commented 7 years ago

The use case would be the same as #333, but for ordinary variables on the driver. For example, I train a k-means model on Spark and store its cluster centers in a NumPy array. If I need to get that variable to local, I have to do silly things like converting it to a DataFrame and using -o, or printing it out and redefining the variable locally.
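For illustration, that workaround might look roughly like this (a sketch only; `kmeans_model` and the session setup are assumptions, and it reuses the same -o output flag as #333):

```python
%%spark -o centers_df
# Remote cell: wrap the cluster centers (a list of numpy arrays) in a
# Spark DataFrame purely so that -o can ship them to the local notebook.
centers = kmeans_model.clusterCenters()
centers_df = spark.createDataFrame(
    [tuple(float(x) for x in c) for c in centers]
)
```

```python
%%local
# Local cell: centers_df arrives as a pandas DataFrame; convert it
# back to a plain numpy array.
centers = centers_df.values
```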

aggFTW commented 7 years ago

Got it. That is not currently in the works. What kind of user experience would you like to see? Would it just be a parameter sent to the %%spark API?

This would be a good community contribution.

kuppu commented 6 years ago

My vote for prioritizing this feature. I have had to use various workarounds to collect global variables. This feature would also help with visualization, since plots require their inputs (data, legends, labels, ...) to be on a single node.

%%collect -o
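For context, the current plotting workaround is to aggregate remotely, pull the result down with -o, and plot it in a %%local cell, roughly as sketched below (table and column names are assumptions):

```python
%%spark -o daily_counts
# Remote cell: aggregate down to something small enough to plot locally.
daily_counts = spark.sql(
    "SELECT event_date, COUNT(*) AS n FROM events GROUP BY event_date"
)
```

```python
%%local
# Local cell: daily_counts is a pandas DataFrame here, so any local
# plotting library works; legends, labels, and other non-DataFrame
# variables still have to be recreated by hand, which is what a
# %%collect-style magic would avoid.
import matplotlib.pyplot as plt
daily_counts.plot(x="event_date", y="n", kind="bar")
plt.show()
```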

hanyucui commented 4 years ago

Is this already taken care of by #432?