Open RandomFractals opened 4 years ago
I am surprised this only had one vote and it was a downvote, at least for the Arrow support, as that would clearly be a good way to share data from Jupyter implementations.
@gramster - is your Arrow data stored as a file? Wondering if there was some other mechanism from Jupyter you had in mind.
ha! I forgot about logging this.
@danmarshall would be nice to have both, from file & pipe as described in #213
@RandomFractals - does your extension support piping?
no, but you can call it with a data file URI to open a data preview, similar to how I suggested you integrate vega viewer with SandDance in #153
so, you'd just call it with:
commands.executeCommand('data.preview', dataFileUri)
and you can check if data preview is installed via get commands:
// execute requested data preview command
let viewDataCommand: string = 'vscode.open'; // default
commands.getCommands().then(availableCommands => {
  if (availableCommands.includes(this.dataPreviewCommand)) {
    viewDataCommand = this.dataPreviewCommand;
  }
  commands.executeCommand(viewDataCommand, dataUri);
});
see how I do it in vega viewer: https://github.com/RandomFractals/vscode-vega-viewer/blob/master/src/vega.preview.ts#L279
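The fallback check in the snippet above can also be pulled out into a pure function, which makes it testable without a running VS Code instance. This is only a sketch: `pickViewCommand` is a hypothetical helper name (not something from vega viewer or Data Preview), and the `installed` array stands in for whatever `commands.getCommands()` would resolve to.

```typescript
// Hypothetical helper: choose the preferred command if it is registered,
// otherwise fall back to a default such as 'vscode.open'.
function pickViewCommand(
  availableCommands: readonly string[],
  preferredCommand: string,
  fallbackCommand: string = "vscode.open"
): string {
  return availableCommands.includes(preferredCommand)
    ? preferredCommand
    : fallbackCommand;
}

// Stand-in for the list that commands.getCommands() would resolve to.
const installed: string[] = ["vscode.open", "data.preview"];

console.log(pickViewCommand(installed, "data.preview")); // data.preview
console.log(pickViewCommand(["vscode.open"], "data.preview")); // vscode.open
```

In an extension, the result would then be passed to `commands.executeCommand` together with the data URI, as in the original snippet.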
I was thinking of data in the Plasma object store. We had an intern prototype viewing dataframes from the Jupyter notebook in VS Code in SandDance, but that involved (IIRC) serializing the data as CSV and passing it in a URL, which clearly won't scale well. I'm wondering what we could do for large datasets (obviously writing to a file on disk is an option too, and maybe that's all we really need).
yeah, I think for it to scale, writing to disk in raw Arrow data format rather than CSV might be a better option, and then have SandDance or some other extension load a user-friendly data frame/grid view.
would be nice if vscode had some IPC api for extension integrations and sharing data in memory and arrow is perfect for it. I just don't think we have a vscode api for that yet.
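A minimal sketch of the write-to-disk handoff discussed above, assuming the producer already has the serialized Arrow bytes in hand (producing them would need an Arrow library, which is left out here). `writeDataFile` and `readDataFile` are hypothetical helper names, not part of SandDance, Data Preview, or the VS Code API; only Node's built-in modules are used.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";
import { fileURLToPath, pathToFileURL } from "url";

// Producer side (hypothetical): write the bytes to a fresh temp directory
// and return a file:// URI string that can be handed to another extension.
function writeDataFile(bytes: Uint8Array, fileName: string): string {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "data-handoff-"));
  const filePath = path.join(dir, fileName);
  fs.writeFileSync(filePath, bytes);
  return pathToFileURL(filePath).href;
}

// Consumer side (hypothetical): resolve the file:// URI and load the bytes back.
function readDataFile(fileUri: string): Uint8Array {
  return new Uint8Array(fs.readFileSync(fileURLToPath(fileUri)));
}

// Placeholder bytes; a real producer would pass serialized Arrow data here.
const placeholderBytes = new Uint8Array([1, 2, 3, 4]);
const dataFileUri = writeDataFile(placeholderBytes, "sample.arrow");
console.log(dataFileUri.startsWith("file://")); // true
console.log(readDataFile(dataFileUri).length); // 4
```

Unlike the CSV-in-a-URL approach, only the short URI crosses the extension boundary, so the payload size is bounded by disk rather than by URL length. In an extension, the string could be turned into a `vscode.Uri` with `Uri.parse` before being passed to a command such as `data.preview`.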
@danmarshall have you looked into this yet?
@RandomFractals no I haven't.
Sorry, I missed this. We aren’t doing anything with Arrow yet, but have been talking about using it in the future for sharing data between kernels in polyglot notebooks.
yeah, @gramster: that's the one scenario where I think we are close to getting it to work once you go GA ;)
still, that's only in the context of .net interactive notebooks, or .dib's as you call them :)
I brought it up with the vscode team in our last monthly authors feedback call & their stance on this is that extensions can devise their own ways of sharing data, i.e. no plans to provide a built-in vscode api for that anytime soon. It did come up a few times in convos with other extension authors in the vscode dev community slack.
I think if they added some pub/sub channels, we could see a lot of clever integrations for extensions sharing data beyond notebooks.
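To make the pub/sub idea concrete, here is a rough in-memory sketch. It is purely hypothetical: VS Code exposes no such API, and `DataChannels`, `subscribe`, `publish`, and the `arrow.table` channel name are all made up for illustration. A real version would also have to cross extension host boundaries, which this does not attempt.

```typescript
// Hypothetical in-memory channels; NOT a VS Code API.
type Listener<T> = (payload: T) => void;

class DataChannels<T> {
  private readonly listeners = new Map<string, Set<Listener<T>>>();

  // Register a listener on a named channel; returns an unsubscribe function.
  subscribe(channel: string, listener: Listener<T>): () => void {
    const set = this.listeners.get(channel) ?? new Set<Listener<T>>();
    this.listeners.set(channel, set);
    set.add(listener);
    return () => {
      set.delete(listener);
    };
  }

  // Deliver a payload to every current listener on the channel;
  // returns the number of listeners that were notified.
  publish(channel: string, payload: T): number {
    const set = this.listeners.get(channel);
    if (!set) {
      return 0;
    }
    const notified = set.size;
    set.forEach(listener => listener(payload));
    return notified;
  }
}

// One extension publishes bytes, another receives them.
const channels = new DataChannels<Uint8Array>();
const receivedSizes: number[] = [];
const unsubscribe = channels.subscribe("arrow.table", bytes => {
  receivedSizes.push(bytes.length);
});
channels.publish("arrow.table", new Uint8Array(8));
unsubscribe();
channels.publish("arrow.table", new Uint8Array(16)); // no listeners left
console.log(receivedSizes); // [ 8 ]
```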
see the Data Preview 🈸 vscode extension for an example of how to integrate those data formats: https://dev.to/tarasnovak/vscode-data-preview-for-devs-around-the-39mn
You can use or peruse my custom Data Manager API & src/data.providers folder for data loading and saving implementation details to enrich SandDance with more data source type choices ...