jupyter / spreadsheet

A spreadsheet component for phosphor
BSD 3-Clause "New" or "Revised" License

Goal of this repo. #3

Open Carreau opened 9 years ago

Carreau commented 9 years ago

This is mostly to discuss and try to understand the goal of this repo. We have often said that we'd like a widget to edit and understand Excel-like data structures, and I'm sure you have spoken about that with @ellisonbg.

I still have some concerns about what we are trying to achieve here, and about whether you (@svurens) have been given a clear goal.

More precisely, I think spreadsheets have a number of "flaws" that we want to try to address, flaws that led to recent scandals over erroneous scientific studies.

If we don't lay out these flaws now, I am afraid we will "just" reimplement a classical spreadsheet, defects and all.

So can we write a document that describes what we are trying to do?

takluyver commented 9 years ago

Ping @fperez, because I know he has some strong feelings about spreadsheets.

We have talked about building some kind of spreadsheet-like tool, but AFAIK we haven't really talked about the design. We don't (I assume) want to end up reimplementing Excel in HTML and JS.

fperez commented 9 years ago

It's great to see experiments in this direction, but indeed, I'd like to have a chance to brainstorm together about the design problem in this space...

Indeed, the issue is not a reimplementation of the classic spreadsheet functionality, but rather a rethinking of what it means to manipulate tabular data. The fundamental problem with spreadsheets as they currently exist, and the one that leads to many of the deep issues they have regarding reproducibility, validation, etc., is that they conflate:

  • data entry and manual cleaning.
  • code execution
  • output
  • formatting and reporting.

Those four things should be distinct elements: it should be possible to validate the code with different inputs, to see the output of a spreadsheet independent of how it's going to be formatted for display, to isolate the aspects of manual entry from the programming, etc. But in spreadsheets as they exist today, all that is one big salad.
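
As a rough illustration of that separation, here is a minimal sketch in Python. Every name in it (`clean`, `summarize`, `render`, the `amount` column) is hypothetical and chosen only for the example; it is not code from this repo and not a proposed design, just one way the four concerns could be kept apart.

```python
# Hypothetical sketch: each concern lives in its own step.
Rows = list[dict[str, float]]
Output = dict[str, float]


def clean(raw: list[dict[str, str]]) -> Rows:
    """Data entry and manual cleaning: turn typed-in strings into numbers."""
    return [{key: float(value.strip()) for key, value in row.items()} for row in raw]


def summarize(rows: Rows) -> Output:
    """Code execution: a pure function that can be validated with any input."""
    amounts = [row["amount"] for row in rows]
    return {"total": sum(amounts), "mean": sum(amounts) / len(amounts)}


def render(output: Output, precision: int = 2) -> str:
    """Formatting and reporting: presentation only, values are not changed."""
    return "\n".join(
        f"{name}: {value:.{precision}f}" for name, value in output.items()
    )


raw = [{"amount": " 10"}, {"amount": "2.5"}]  # what a user typed in
output = summarize(clean(raw))                # output, independent of display
report = render(output)                       # one possible formatting of it
```

Because `summarize` only sees cleaned rows and `render` only sees the output, the code can be checked against different inputs without touching either the entry step or the display step.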

The trick is how to fix these problems without "throwing away the baby with the bathwater". We all, even aware of their serious flaws, still use spreadsheets, because of their major convenience. So the challenge we face is how to retain that convenience and fluidity, while providing substantial improvements at the key points.

There is a large amount of literature on this problem, and I think it would be very useful if we take the time to think a little bit about the bigger problem. This one isn't so much an issue of JS implementation, but rather of what is the right computational model we'd like to expose. We have the freedom to experiment, and @minrk and I already had some very nice impromptu conversations one afternoon. But I'd like us to take a moment to understand the problem we're trying to solve, which is not a feature-by-feature replication of a spreadsheet.

Courtesy of @pbstark, I have here a few great links on the matter:

So let's make sure we actually think carefully about the problem we are trying to solve.

ellisonbg commented 9 years ago

Thanks for starting this!

I have a few goals for this repo:

Cheers,

Brian

On Fri, Jul 3, 2015 at 2:07 PM, Fernando Perez wrote:

It's great to see experiments in this direction, but indeed, I'd like to have a chance to brainstorm together about the design problem in this space...

Indeed, the issue is not a reimplementation of the classic spreadsheet functionality, but rather a rethinking of what it means to manipulate tabular data. The fundamental problem with spreadsheets as they currently exist, and the one that leads to many of the deep issues they have regarding reproducibility, validation, etc., is that they conflate:

  • data entry and manual cleaning.
  • code execution
  • output
  • formatting and reporting.

Brian E. Granger Cal Poly State University, San Luis Obispo @ellisonbg on Twitter and GitHub bgranger@calpoly.edu and ellisonbg@gmail.com

rgbkrk commented 9 years ago

It would be nice to involve @karissa, who has been working on the dat project and looking to build a spreadsheet-like interface with backing kernels and version control for data.

okdistribute commented 9 years ago

YES! Thanks @rgbkrk, I AM in fact interested in building something that connects kernels<->version control to fix the spreadsheet problem. @Carreau, I agree that there are some serious problems with spreadsheets, particularly given the scandals in science, etc., which is a huge motivating use case for dat, which we just rewrote and pushed to beta testing.

Have you all seen flatsheet? It's what we're recommending right now for a modularized spreadsheet-like frontend that is easily compatible with a variety of other tools.

What's your main plan for being compatible? It'd be great if you were able to publish the work to npm as a suite of modular, standalone components.

Carreau commented 9 years ago

Copy-Paste Tracking: Fixing Spreadsheets Without Breaking Them (PDF): http://homepages.cwi.nl/~storm/publications/iclc2015.pdf