jupyterlab / frontends-team-compass

A repository for team interaction, syncing, and handling meeting notes across the JupyterLab ecosystem.
https://jupyterlab-team-compass.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

JupyterLab vision for the next few years #80

Open lresende opened 4 years ago

lresende commented 4 years ago

We are seeing an aggressive strategy from IDEs to penetrate and/or disrupt the Jupyter Notebook UI arena, in particular JupyterLab. In parallel, we have seen a lot of the cloud vendors enabling Jupyter Notebooks as a service with enhanced UX capabilities that are kept proprietary and not contributed back to the community.

I believe the JupyterLab community should step back and create a 1-3 year vision/strategy plan. This plan would provide direction on areas to focus on, aiming to maintain JupyterLab's prominence as the de-facto UI for data science interactive workloads. The vision should also consider the personas we serve, and the areas we should prioritize to be more competitive when compared with IDEs providing support for interactive applications integrated with Notebooks.

Thoughts?

goanpeca commented 4 years ago

Thanks for opening this issue @lresende: some thoughts on things that might be important to broaden the scope and facilitate the use cases of many users.

echarles commented 4 years ago

Thx @lresende for opening this, it is IMHO very important.

I don't think trying to imitate an IDE is the way to go. IDE fat-client installation applies to only a fraction of the potential users, and VSCode is leading the pack anyway. JupyterLab has more to offer and can serve many more different use cases.

I am seeing JLab as a collection of components that can be integrated and used by:

So a collection of client and server components:

On top of that toolbox, JLab would ship 2 reference implementations like:

And would ship easy tutorials to embed those in 3rd party applications.

... and yes, one-click installation would be super-useful (but less important to me than the above).

...and hosted VSCode in the browser looks terribly like what is powering GitHub Codespaces (see https://github.com/cdr/code-server for an open-source implementation of VSCode as a web application).

echarles commented 4 years ago

We often compare/refer to VSCode but RStudio is another tool worth looking at. They have fat and web flavors with the exact same shiny UI: an editor mixing code/text/latex/..., a console, a variable inspector and a bottom-right display for the rest (graph, filetree, help...).

Are we in a state to do that easily? We lack an editor mixing text/code (for now, attaching an md or py file to a console is an incomplete workaround). We don't have a variable inspector component (3rd party ones exist).

Assuming we have those components, we could add a third reference implementation (JStudio :)) aside the 2 ones I have listed (current jlab and classic clone).

This makes me think of when Mozilla broke up its monolithic distribution (which even included an HTML editor) into Firefox, Thunderbird... and regained market share at that time.

Also, a hidden gem sits in the examples folder, too invisible and not marketed (we link to them sometimes when users ask questions on kernels...). Those examples look to me like the premise of what I would love JupyterLab to be: a reusable set of components to build secure, scalable and collaborative data driven applications in the cloud.

echarles commented 4 years ago

Typing when thinking... I feel that the current way to integrate more and more external extensions into the default distribution has its limits.

blink1073 commented 4 years ago

I feel that the current way to integrate more and more external extensions into the default distribution has its limits.

Yes, we will need a clear strategy about what is in core once it is easier to install extensions. We should be following the data from things like the 2015 UX Study.

krassowski commented 4 years ago

Should we have a new UX study? Also, please have a look at the study "What's Wrong with Computational Notebooks? Pain Points, Needs, and Design Opportunities" from Oregon State University, Microsoft and University of Tennessee-Knoxville. There are points (in interpretation and methodology) I disagree with, but still a valuable contribution.

I think about IDE-like features a lot. I see that there are different paths ahead but I will be working to make the LSP integration as reliable and feature-full as possible (we could certainly use a hand!). Then, there is the ease of installation, which may actually be a challenge: currently, we require node, a python extension, a lab extension, and the servers; all have to be properly installed (most need to be installed from the command line!) and novices trip on virtual environment issues (e.g. local/global installations, using conda without fully comprehending it, etc). Obviously module federation will be a step forward, but it will not be a perfect solution for the LSP extension because some servers still require node...

meeseeksmachine commented 4 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/cripplingly-slow-ui-am-i-the-only-one/5351/7

matthew-brett commented 4 years ago

In case it's useful, a summary of recent StackOverflow IDE rankings. It seems that VSCode is eating quite a few lunches, but particularly Sublime Text, Atom, Eclipse and Notepad++.

blink1073 commented 4 years ago

Another data point: https://www.jetbrains.com/lp/python-developers-survey-2019/

blink1073 commented 4 years ago

The "Tools and features for Python development" section of the above survey has a lot of overlap with our 2015 UX Survey results.

blink1073 commented 4 years ago

Also, thanks for putting that together @matthew-brett!

choldgraf commented 4 years ago

a couple of thoughts (thanks @lresende for kicking off this discussion)

I think that if JupyterLab makes progress on both of the above pieces, it would make a huge difference. Incentivize (and make it easier) for people to build extensions on top of the composability of JupyterLab, then make it easier for users to fluidly use those extensions / interfaces. Develop a few "core" interface configurations for JupyterLab (e.g., "notebook UI", "RStudio-like UI", "Dashboard UI") and empower a developer community to figure out what other possibilities are out there.

I'll leave it there for now because I don't want to wall-of-text y'all ;-)

ps: I recommend folks check out some of the conversation in this discourse thread about jupyterlab / classic notebook UI.

In that thread someone brought up Blender as another tool to think of for inspiration. I found the Blender video they linked to give some interesting thoughts about UI design and how it can benefit complex workflows.

saulshanabrook commented 4 years ago

I really appreciate this conversation starting up!

Hot Take

We should figure out how to move on top of VS Code / Theia.

VS Code / Theia

Theia is an Eclipse project that focuses on being open source and in your browser, but building on VS Code.

I believe the TypeFox folks were originally basing things off JupyterLab/Phosphor and have now moved to VS Code/created Theia. See https://github.com/eclipse-theia/theia/issues/6501.

Why?

Because then we could focus more of our energy on the things that make Jupyter special, like data science, education, pushing the boundaries of interactive computing, and collaboration. And we could leave the parts that are about making an IDE experience (debugging, window layout, extension infrastructure, performance) to the teams with the history and drive and mission to do that, like Microsoft and Eclipse.

How?

In this imaginary future, "JupyterLab" would simply be a certain distribution of Theia with certain extensions and settings included by default.

Why not?

One concern that some people have raised about moving to VS Code is related to the single stakeholder nature in which it is developed and governed.

It would be interesting to hear from the TypeFox and Eclipse folks how they have felt engaging with this and how they see that space.

Also, there isn't notebook support yet in Theia (https://github.com/eclipse-theia/theia/issues/3186 ). The live share feature of VS code is also not present, since it isn't open source (https://github.com/eclipse-theia/theia/issues/2842).

I am curious to hear from other people who use JupyterLab whether there are other major issues with moving to this approach, and obviously from the other maintainers how they would feel about it. Or whether this has come up in the big institutional adopters (and funders) of JupyterLab, like Bloomberg and JPM. Have you talked internally about switching to VS Code/Theia instead of JupyterLab?

I am also cognizant of the fact that JupyterLab, to me, is a means to an end of exploratory computation and a community of people that I care about. As someone who helps maintain this library, I would like to find a path forward that lets us leverage our effort and time most effectively, by collaborating with other groups.

bollwyvl commented 4 years ago

The idea of "Lab" (vs Studio, or all the other things we tossed out when naming the project) was that we're trying to serve individual scientific computing users, teams, and the global scientific community. That means supporting the scientific method from education up to scientific publishing. The fact that a scientist must increasingly write code in between is relevant, and being able to support a scientist doing that to the extent needed is indeed among the challenges that will cause otherwise-enthusiastic users to abandon ship.

But, when I'm wearing my developer hat: my experience has been that Lab is a more malleable substrate than code. Teams can ship more customized things on it. I don't think doing static (a la jyve) would be possible with the Code baseline.

With my desktop support hat on: we've almost gotten users out of the nodejs woods on 3.x... even some of the language servers can be webpacked (during a build), despite the vscode upstream having wontfixed the dependency issues around it. The walled garden of extensions isn't monetized yet, but there's no telling, and downstreams like theia and vscodium can't actually support all of the extensions, even though they share an upstream. And who even knows if vscode works in Firefox?

Some areas I am less negative about:

Carreau commented 4 years ago
  • Find ways to make the UI flexibly opinionated.

A real clone of the classic notebook with 100% conformance on key-bindings (users I meet often come back with that issue).

Yes, absolutely yes for both of those (except for the rewrite part). VS Code is itself a really good example: it is first and foremost a text editor, and it is way easier to recommend as it does just that already super well, and on top of that it can blossom into a full-featured IDE once you are ready. And @matthew-brett's comment shows it well:

VSCode is eating quite a few lunches, but particularly Sublime Text, Atom, Eclipse and Notepad++.

Eclipse is for sure an IDE, but Sublime, Notepad++ and (to a lesser extent) Atom are way closer to good text editors.

I would try to get a version of JLAB which is focused on one document per browser tab, with one of those views much closer to the classic notebook, which eventually can let users grow into the full Lab instead of requiring a rewrite.

I think also that clear repackaging/branding of JupyterLab with a set of simple preconfigurations could go a long way to form smaller dedicated communities. VSCode is really attractive to developers, but we have a bunch of data scientists in many domains.

It would be good to pick a few extensions and default configuration and specific css theme that target a number of use case, say for example

With their googlable distinct names.

bollwyvl commented 4 years ago

@Carreau Right, going beyond just an installer, here are examples of cross-platform, one-click, non-root installers centered around a Lab configured with different goals, with the full required compute stack underneath:

As a first go-round, officially offering one of these kitted out for training lab developers would probably make a lot of sense: it typically takes at most 20 minutes once you have the install media on the user's box to get up and running.

choldgraf commented 4 years ago

There is also flybrainlab which uses a customized JupyterLab w/ lots of extensions etc designed for computational / systems neuroscience.

matthew-brett commented 4 years ago

When I'm teaching my students, I tell them they will soon need to follow the advice in "The Pragmatic Programmer"

The editor should be an extension of your hand; make sure your editor is configurable, extensible, and programmable.

https://bic-berkeley.github.io/psych-214-fall-2016/choosing_editor.html

I feel I will have served them poorly, if they take a long time to move out of the Jupyter Notebook interface.

The question I have is - is JupyterLab designed to be that editor, that is "an extension of your hand"? Is the intention that many people will use JupyterLab as their primary editor for text as well as notebooks? If so, how many people use it that way? If not, then what role should it play?

goanpeca commented 4 years ago

I think it is important to make a distinction between the vision for developers and for end-users. Probably the user-oriented vision should take the front seat in this discussion.

matthew-brett commented 4 years ago

@goanpeca - my argument would be that it would be a mistake to concentrate on the "user" at the expense of the "developer" that the user should become, as they continue to learn. If only because, I believe it would be a dangerous place to position JupyterLab, in between the absolute beginner (who is relatively well served by the traditional notebook interface) and someone who wants to become proficient in using scientific code (who should and will start to learn VSCode, or PyCharm or Atom or Vim). The obvious risk is that students go straight from traditional notebook to VSCode or similar, and miss out JupyterLab in the middle.

matthew-brett commented 4 years ago

As a dreadful warning as to what can happen to users who get stuck in the traditional notebook interface - see the "Study 3: interviews with data analysts" section in https://dl.acm.org/doi/abs/10.1145/3173574.3173606 .

choldgraf commented 4 years ago

I think it all comes back to the question: "Is it possible for a highly-flexible and extensible development environment to also be a first-class data science environment?" If the answer is fundamentally "yes" then it will be hard to beat out VSCode or whatever projects w/ equal amounts of resources replace it, and we should refocus efforts from building an entirely separate parallel web framework to instead trying to make sure the web IDE that wins is the one that has a good open governance and community structure.

If the answer is "no, data science is a different kind of thing from development and warrants its own interface" then I think that's where JupyterLab should position itself. If this is the answer, then I'd urge the JupyterLab community to think about how to encourage and facilitate a flexible transition back-and-forth between JupyterLab and <IDE of user's choice>, with the assumption that we want people to use the best tool for the right job, and JupyterLab will never be the best tool for development. Then, focus the JupyterLab experience in a way that really highlights the "data stuff" and makes it clear why it's the best choice for that workflow, while <IDE of your choice> is the best choice for doing development.

matthew-brett commented 4 years ago

@choldgraf - nice summary. So, to help make that as specific as possible - what aspects of data science workflow can one not easily cover with:

And - once these are clear - how difficult would it be to extend these systems to cover the missing cases?

smackesey commented 4 years ago

I see that this is a team repo so I'm not 100% sure I should be posting, but @saulshanabrook on the Discourse forum suggested that I contribute to this discussion so here's my two cents. The below take is certainly the product of my own biases and not based on a study of JupyterLab's existing userbase. But I haven't seen the general idea proposed, so I'd like to throw it into the arena. For background, I work with JupyterLab mainly in the context of computational neuroscience. I'd also like to preface this by thanking you guys for all your hard work on the platform to date.

I would like to see JupyterLab become the hacker's data science environment. I believe that this niche is (a) open; (b) unlikely to be contested by any of JLab's current competitors; (c) achievable. On top of that, it would be awesome and potentially revolutionize data science work.

In the world of software, there are two broad categories of product:

Hacker-focused. These products are typically extremely customizable, powerful, and performant. But the interface is often complex, lacking in bells and whistles, and somewhat intimidating. Examples include Vim/Emacs text editors, Linux, the mutt email client, Ranger file browser, and many other command line programs. They are almost always open-source.

Average-consumer/enterprise-focused. These products are "friendlier" (at first glance, anyway). They are often "What-You-See-Is-What-You-Get". They have shiny GUIs, animations, and lots of nested menus. But they are often irritatingly inflexible, suffer from limiting and inefficient interfaces, and have maddeningly poor performance due to unnecessary visual effects (compare text-based Ranger to MacOS Finder...). Examples include Microsoft Office Suite, Windows/MacOS, Gmail, etc. The vast majority of the good ones are developed by business-- open-source ones usually have all the flaws listed above but worse, and are uglier to boot.

In the rapidly developing world of general-purpose data science IDEs, there are not yet any clear winners or losers in either space. Indeed, the spaces have not yet been differentiated. Jupyter Notebook (and by proxy JupyterLab) gained significant market share as one of the first apps in this world, but now it's starting to face real competition. And there is every reason to expect that this particular software category will follow the same trend as others-- the "user-friendly", "for-the-masses" throne will be claimed by something corporate-backed, like VSCode. JupyterLab won't be able to compete in this space. But, we can also expect that Microsoft won't pursue the "hacker" niche, for the same reasons that corporations rarely pursue this niche in other domains.

So there is a throne waiting to be claimed for the Hacker-focused VIM/EMACS of data science environments. And JLab is well-positioned to pursue this throne. It already has a successful brand, large user base, working product, and contributor community. But, I think there would need to be some major changes for JLab to go this route.

The two most important things are to embrace customizability and focus on core performance. Right now JLab has issues in both these areas. The performance is often weak (as discussed in this Discourse thread), and the customizability and docs regarding extensions are pretty weak.

Finally, here are some specific ideas for steps in the Hacker-focused direction:

I could list more but I'll stop there. What all of the above have in common is that they increase the flexibility of JupyterLab. I believe that there are amazing exploratory data analysis use cases and interface motifs waiting to be discovered. They just have to be enabled by a suitable platform. JLab is both (a) relatively well-positioned to become that platform; (b) unlikely to face much competition in this niche from VSCode et al.

ellisonbg commented 4 years ago

Lots to process here. Will comment more on other aspects, but will start with this:

I believe that VSCode is incompatible with the open source vision of Jupyter, and not a suitable foundation for it. Jupyter has always been community driven and multistakeholder. VSCode is controlled by a single corporate entity. This is manifested in the following ways: 1) The VSCode Marketplace doesn't allow third party applications to use it, which is why Theia and code-server have built and maintained their own extension services, 2) key parts of the code base remain proprietary (real time collaboration, web based versions such as CodeSpaces).

More importantly, the roadmap for JupyterLab (and any other part of Jupyter) should be focused on building things for actual Jupyter users (lab, classic, hub, ipython, etc.). In this org, I believe we should take time to understand what JupyterLab users are doing, what their pain points are, what their needs are, etc. - and use that to drive the roadmap. That was how I read the initial post of @lresende and I believe that is what we should focus on in this thread. It is this focus on our users that led us to build JupyterLab in the first place, and is driving much of the work for 3.0, including the improved extension system, the debugger, and the classic notebook mode.

ellisonbg commented 4 years ago

To practice what I preach, here are the main user focused things that I view as being roadmap worthy:

ellisonbg commented 4 years ago

One way that I have started to prioritize issues in a user-centered manner is to sort them by reaction (comment or emoji):

https://github.com/jupyterlab/jupyterlab/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions

Users would benefit massively if we started at the top and went down the list. The same sorting on the classic notebook also gives a similarly useful signal:

https://github.com/jupyter/notebook/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions

This isn't perfect, but is a great start.
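
For anyone who wants to pull the same ordering programmatically rather than through the web UI, here is a minimal sketch that builds the equivalent request URL for GitHub's REST search endpoint, which accepts `sort=reactions`. The helper name `reactions_query_url` and the `per_page` default are illustrative assumptions, not something from this thread; fetching and parsing the response is left to whatever HTTP client you prefer.

```python
from urllib.parse import urlencode


def reactions_query_url(repo: str, per_page: int = 20) -> str:
    """Build a GitHub search API URL listing open issues, most-reacted first.

    `repo` is an "owner/name" string. The function name and defaults are
    hypothetical; only the endpoint and its query parameters come from GitHub.
    """
    params = {
        "q": f"repo:{repo} is:issue is:open",  # same filter as the web links above
        "sort": "reactions",                   # order by total reaction count
        "order": "desc",
        "per_page": per_page,
    }
    return "https://api.github.com/search/issues?" + urlencode(params)


lab_url = reactions_query_url("jupyterlab/jupyterlab")
classic_url = reactions_query_url("jupyter/notebook", per_page=10)
print(lab_url)
print(classic_url)
```

The same helper works for any repository, so the two lists above could be regenerated on a schedule and compared over time.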

meeseeksmachine commented 4 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/building-uis-on-top-of-theia/5410/1

choldgraf commented 4 years ago

Just a note that I opened up https://discourse.jupyter.org/t/connecting-jupyter-and-theia/5410 to discuss the VSCode/Theia point more broadly because I think it's an interesting topic to dig into, but don't want to derail this thread w/ a protracted discussion here.

choldgraf commented 4 years ago

Just a note that if y'all are interested in prioritizing features by user +1s etc, you may be interested in some infrastructure that we're using in executablebooks/. This feature voting page contains a little table of issues across all executablebooks/ repositories, sorted by 👍 votes. We're going to try and encourage users to use 👍 in this way to vote for things that they want to see.

In case it's useful:

We retrieve the latest data from issues here: https://github.com/executablebooks/meta/blob/master/.github/workflows/retrieve_issues.yml

And here's the JS that creates this little table for use in the docs: https://github.com/executablebooks/meta/blob/master/docs/_templates/layout.html#L15

telamonian commented 4 years ago

Something that came up in the meeting today with the Siemens folks is that notebooks are not good for long running calculations. If there's any kind of disconnection event between the frontend and the kernel the notebook is likely to miss the calculation output. So even if your week-long simulation succeeds, you won't ever know it.

I've wanted more robust handling of long-running cells for years. A few relevant points for getting this feature on our roadmap:

I think getting this to work would help a great deal with making jlab into a truly first class data science environment; alternatively, if I can't rely on notebooks to keep track of my output, that really means that I can never use a notebook as part of a production workflow (at least not without manually handling the output somehow myself).

telamonian commented 4 years ago

As part of expressing our larger vision for the immediate (and medium-term?) future of jlab, I think we should focus on coming up with a list of features that are not feasible for one person to implement (which has been our primary engineering pattern for a while). These would be features, like the "busy cell reconnect" above, which:

Whatever higher-level vision we come up with, I think we need a list like this in order to make said vision concrete and shared.

saulshanabrook commented 4 years ago

This feature will require a rework of how jlab/server handles notebook state. My understanding of this for the last year or so has been that anything related to notebook-state management is tied up in the implementation of the RTC features. Is it correct to think that this feature (let's call it "busy cell reconnect") needs to be developed alongside RTC (pinging @saulshanabrook)?

Yeah this is a big part of the RTC work, which really is maybe a misnomer now, because it encompasses both "real time collaborative editing" along with "move the data model server side"

williamstein commented 4 years ago

Some quick thoughts from my experience in this domain:

According to @jasongrout's old comment here, @williamstein's cocalc does have a working implementation of correctly reconnecting to a busy kernel

Yes, and the implementation in CoCalc is deeply tied to how RTC works there. Sometimes weird cases still come up that are interesting (example: tons of messages with "clear line" in them for progress bars). I'm willing to try to answer any questions about the choices we made in CoCalc...

The Sage Notebook back in 2007 had "long running reconnect support", even though it did not support realtime collaboration. I remember when implementing it back then that I considered it to be a really, really important requirement for a web-based computational notebook. One reason was that for a while sagenb was really buggy and users would refresh their browser a lot, but of course the other was long running computations, which is what we research mathematicians certainly do a lot of. I was really surprised when Jupyter came out and I found out it didn't do this. My best guess is that people presumably work around this problem in Jupyter by writing output to a file on the filesystem, which -- though tedious -- would solve this problem in some cases for people serious about getting research computations done.
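
To make that file-based workaround concrete, here is a rough sketch of what it might look like inside a notebook cell. The helper names `run_and_save` and `load_result`, the JSON format, and the output path are all assumptions for illustration; nothing here is taken from Jupyter, sagenb, or CoCalc.

```python
import json
import os
import tempfile
from pathlib import Path


def run_and_save(compute, path):
    """Run compute() and persist its JSON-serialisable result to `path`.

    The result is written to a temporary file in the same directory and then
    renamed into place, so a reader never sees a half-written file.
    """
    path = Path(path)
    result = compute()
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(result, tmp)
    os.replace(tmp_name, path)  # replaces any existing file in one step
    return result


def load_result(path):
    """Read back a result saved by run_and_save, e.g. after reconnecting."""
    with open(path) as f:
        return json.load(f)


# Stand-in for a long-running computation; the path is a hypothetical example.
out = Path(tempfile.gettempdir()) / "long_run_result.json"
run_and_save(lambda: {"answer": sum(range(1000))}, out)
print(load_result(out))
```

If the browser disconnects while the cell is running, the kernel still finishes and writes the file, and a later cell (or a fresh session) can call `load_result` to recover the value. It does nothing for rich output or progress display, which is why it is only a workaround.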

The Sage notebook also did a lot of processing of output on the server side. Things that can be big (e.g., images, large output) I think got stored in the filesystem in subdirectories for sagenb. In CoCalc, "things that can be big" get stored in a local sqlite database. You start having to worry more about things that can be big, when you have notebook state fully on the server as well, since you have to worry about memory usage there, and bandwidth whenever the user reconnects.

isabela-pf commented 4 years ago

I have a few things I'd like to follow up on based on previous comments.

First, I noticed @ellisonbg mentioned a JupyterLab roadmap. Is there a public roadmap? Or is this implied somewhere in the governance or in a mission statement or something like that? I haven't personally seen any location where thoughts on strategy or direction are gathered, so it'd be great to know if there is something I should be referencing.

Whether or not there is something now, I think a good outcome of this discussion could be to create or maintain something that proposes direction for JupyterLab. There are a lot of thoughtful ideas here that I'd like to see turned into action instead of stagnating on this issue indefinitely.

Second, @matthew-brett said

my argument would be that it would be a mistake to concentrate on the "user" at the expense of the "developer" that the user should become, as they continue to learn

and I'd like to unpack that statement a little. There's a good chance we aren't using the words user and developer in the same way, because I think developers can be users (and often are in JupyterLab). I don't think serving users means not serving developers. I interpreted @goanpeca's statement as calling out that on this discussion (and frequently elsewhere in the community) we have ample input of people who actively develop Jupyter projects, but not that of people who use Jupyter projects but are not actively developing them. And missing a perspective usually leads to missing wants and needs and issues.

I'm calling attention to this because I'm wondering if other people feel similarly or if I'm the outlier thinking developers and users aren't mutually exclusive.

goanpeca commented 4 years ago

we have ample input of people who actively develop Jupyter projects, but not that of people who use Jupyter projects but are not actively developing them. And missing a perspective usually leads to missing wants and needs and issues.

Thanks, @isabela-pf! Indeed this is also what I see over and over and it makes us blind to the actual needs of a whole world of users out there.

williamstein commented 4 years ago

we have ample input of people who actively develop Jupyter projects, but not that of people who use Jupyter projects but are not actively developing them.

For what it is worth, with https://cocalc.com we have the opposite, since we are constantly hearing from people (mostly academic faculty) that use Jupyter but who do not have the time to even consider developing Jupyter. We have a "Help" button integrated with CoCalc, where people click on it and it creates a Zendesk ticket. We get many of these every day and some are "wants and needs".

Often these "wants and needs" are completely unexpected, and appear with a surprising level of frequency. Here's an example of an issue we opened recently as a result, where somebody wanted a new cell type for nbgrader and had a very compelling use case: https://github.com/jupyter/nbgrader/issues/1342

I wonder what would happen if Jupyter classic (or JupyterLab) had a similar 1-click button that maybe created a new Github issue?

Carreau commented 4 years ago

A new issue could be pre-filled with info like the jlab version and so on using URL parameters: https://docs.github.com/en/github/managing-your-work-on-github/about-automation-for-issues-and-pull-requests-with-query-parameters could be interesting. The problem is that you need users to have a GH account...
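
As a rough illustration of the query-parameter approach, the snippet below builds a "new issue" link with the `title`, `body` and `labels` parameters that GitHub supports. The function name, the `feedback` label, and the version/browser strings are made-up placeholders; a real button would read those values from the running application.

```python
from urllib.parse import urlencode


def prefilled_issue_url(repo: str, title: str, body: str, labels=()) -> str:
    """Build a GitHub 'new issue' URL with the form fields pre-filled.

    `repo` is an "owner/name" string. This helper is a hypothetical sketch,
    not part of JupyterLab or any existing extension.
    """
    params = {"title": title, "body": body}
    if labels:
        params["labels"] = ",".join(labels)  # GitHub expects a comma-separated list
    return f"https://github.com/{repo}/issues/new?{urlencode(params)}"


# Placeholder values standing in for what a Help button could collect.
url = prefilled_issue_url(
    "jupyterlab/jupyterlab",
    title="Feedback from the Help button",
    body="JupyterLab version: 3.0.0\nBrowser: Firefox\n\nWhat happened:\n",
    labels=["feedback"],
)
print(url)
```

Opening the resulting link still requires the user to be signed in to GitHub, so it does not remove the account requirement noted above; it only saves them from typing the environment details by hand.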

williamstein commented 4 years ago

Problem is you need users to have GH account...

Having some sort of account is valuable so you can follow up with users and find out what they are really saying. Also it cuts down on spam.

matthew-brett commented 4 years ago

@isabela-pf - thanks for the unpacking, that's helpful. I didn't mean to be unduly controversial, and I do agree with you about the continuum of user and developer.

@williamstein - I definitely didn't mean "contributor to Jupyter development" when I was referring to "developers" above.

I only mean that all of us gradually proceed from what might be called a "user" in Jupyter, to become more efficient in writing, and refactoring and testing code. This is just the normal evolution from someone using Shift-Enter a lot, and copy-pasting code cells, through learning to put code into functions, to breaking up code into modules and adding unit tests. The latter stage is still someone analyzing data in Python, but with more experience, more efficient, better at collaborating and so on. That last stage is what I meant by 'developer' - an experienced, effective user of code.

The only thing I was trying to say by contrasting these two, is that there are lots of things that we all learn as we go through this evolution. As for teaching in university, it's very difficult to design something that will serve this evolution well, without having been through this evolution.

So, I'm not suggesting that JupyterLab should only serve the 'developer' in my sense, but that the 'developer', in my sense, will likely have the best idea about what, in retrospect, would have helped them proceed more efficiently from 'user' to 'developer', meaning, from inefficient, and unskilled, to efficient, and skilled. And, conversely, they will know the things that tended to trap them at various stages of that evolution, and make it difficult for them to learn how to get better at their work.

ellisonbg commented 4 years ago

@isabela-pf I guess I was thinking that a roadmap would be the outcome (or action item) of this thread. It doesn't yet exist formally.

A few comments on what I mean by saying our roadmap should be "user focused." Yes, in open source software there are people who both develop and use the software. For Jupyter, we have many millions of users, and very few of them help to develop it or even visit our GitHub repos. I agree with what others have said - that if we want to be user focused, we have to talk to them, survey them, get input, do UX research, etc. I would love to see us run additional UX research studies to tackle these questions for JupyterLab, or even add a form as CoCalc has done.

At the same time, over the years, the core Jupyter team has interacted with thousands of users in a wide range of contexts: talks, tutorials, online, full blown university classes, contracting, etc. A few years ago, we counted that the core team had given >70 talks and tutorials in a single year. A number of us have taught many hundreds of students in online and in person contexts on a daily basis. The various UX designers/teams at Bloomberg, AWS, Cal Poly, UCI MS Capstone, etc. have run numerous user focused studies covering a wide range of methodologies. Because of this, I think that in aggregate, we do already have an amazing amount of information about what our users' pain points and needs are (but yes, let's do more!).

psychemedia commented 4 years ago

we have ample input of people who actively develop Jupyter projects, but not that of people who use Jupyter projects but are not actively developing them. And missing a perspective usually leads to missing wants and needs and issues.

I'm calling attention to this because I'm wondering if other people feel similarly or if I'm the outlier thinking developers and users aren't mutually exclusive.

The following (overlong) comment is not meant as a spoiler or a distractor from the main thread, it's just an observation that for me (someone who values ways of allowing folk to make things interactive through computational tools) Jupyter means a lot of things, but JupyterLab does not count among them.

As a Jupyter user in an education (teaching and learning) setting, where we use notebooks and Jupyter tools to develop simple computational skills potentially across a wide range of subject areas (from basic programming to digital humanities and social science / statistics etc) and touching on data journalism use cases - very small and short term projects that need completing within a couple of hours to produce analyses and assets that lead to, inform or illustrate a particular story - I value:

What's missing for me:

Although I do appreciate that JupyterLab's development may drive development of some of the protocols the things I do use rely on, for my needs, JupyterLab is not relevant to me personally as a user because it is too complex, cluttered, hard to develop in, and moves away from a simple linear narrative UI. (I appreciate that workspaces can be used to simplify and customise environments, but this seems to me to be to JupyterLab what developing a voila dashboard is from a particular notebook.) The projects I work on and expect computation tools to be used for are short projects (an hour or a day to complete), not large and complex projects involving lots of code, and not intended for production. The sort of thing you use a spreadsheet for, or to provide a non-developer with a means of producing rich HTML assets "for free" via _repr_html_. Or provide a way for a not very technical person to share an interactive end user developed tool. (One of the reasons why Excel is powerful is that it supports end user (application) development in various ways and at various levels of sophistication; from sorting and filtering, merging/vlookup, to simple formulas and charts, to conditional formatting, to macros and VB.)
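
For readers unfamiliar with the _repr_html_ hook mentioned above, here is a minimal sketch. The class name and data are made up for illustration; the only real convention used is that IPython-based frontends call an object's `_repr_html_` method, when it exists, to obtain rich HTML output for a cell.

```python
from html import escape


class ScoreTable:
    """Toy object (hypothetical) that displays itself as an HTML table."""

    def __init__(self, rows):
        self.rows = rows  # list of (name, score) pairs

    def _repr_html_(self):
        # Called by the notebook frontend when this object is the cell result.
        body = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{escape(str(score))}</td></tr>"
            for name, score in self.rows
        )
        return f"<table><tr><th>Name</th><th>Score</th></tr>{body}</table>"


table = ScoreTable([("Ada", 9), ("Grace", 10)])
html = table._repr_html_()  # in a notebook, ending a cell with `table` is enough
```

The author of such a class writes no display code at all in the notebook itself, which is the "for free" part.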

matthew-brett commented 4 years ago

I thought for a while before posting this. Please forgive this demand on your patience.

Although I am sure you would not ask this of me, my qualifications for what I am going to say are as a user and developer of the scientific Python stack since 2005, and a teacher using IPython and Jupyter Notebooks in classes since 2013.

I'm asking only because I am very worried about the question at the top of this thread.

My version of that question is - does JupyterLab need a radical change in direction, in order to avoid becoming irrelevant?

At the moment - I think y'all are planning to continue on something like your current course, but with more feedback from users.

But, returning to the original question, does that current course have a realistic hope of competing for users with - for example - VSCode notebook integration? It seems to me that question is enormously important. If the answer is No, then surely - in order that open governance does not, in practice, perish from the face of interactive computing - this is the time to step back and ask big questions about the direction and fate of the project.

krassowski commented 4 years ago

JupyterLab means a lot to me. For me, as a researcher working with multiple biological omics, JupyterLab is exactly the right tool for the job. My projects are by definition large and complex. Having a convenient way to navigate large collections of notebooks, opening them side by side in a dedicated interface, versioning them with the miraculous jupyterlab-git interface (embedded nbdime is just so great), having the same editor available to work on script files, and dozens of other small factors all contribute to why I think JupyterLab is the best tool for the job. I am biased, having invested a lot in polishing the workflow and improving the extensions I need, but this is just to say that if we continue to gather feedback here we may see an enrichment of dissatisfied users, because why would a fully-satisfied user ever visit JupyterLab's GitHub, let alone team-compass?

Please, let's not make decisions on the future of a tool that many users already use (and in a way were promised is "the future") based on the opinions of those for whom it is not a useful thing in itself; let's improve the single-document mode for them so they can have 100% of the old notebook experience and seek their feedback, but let's not define the fate of lab based on their feedback - this has to be done based on inclusive and comprehensive surveying, which may be difficult given how diverse the userbase is.

I see a lot of opportunities to turn JupyterLab into an even greater tool for research and science, including things that would be quite out of scope in most other editors (except for maybe RStudio and Spyder). These range from a dedicated scientific symbol explorer, through a formula editor and a spellchecker with topic-specific lists of terms (e.g. a medical dictionary is a must for me, because spelling medical terms is difficult for me as a non-native speaker, yet I often use dozens of disease/drug names every day), to an improved pipeline viewer (like Elyra, but for local execution, highlighting which notebooks need to be re-run after a change has occurred and allowing an entire branch to be re-run with a single click; this would be more of a reproducible analysis flow tool than a pipeline as meant in data science).

Finally, I believe that the delay in transition from notebook to lab has something to do with public relations; I was waiting for months to retweet a new Medium article announcing JupyterLab 1.0; then JupyterLab 2.0 came along and there was no article then either. The most recent one, "JupyterLab is Ready for Users", is for 0.3x, has outdated graphics, talks about PhosphorJS and does not really highlight the strengths JupyterLab has over the classic notebook or alternatives. Then there are articles by attention-seeking bloggers which can misrepresent the state of JupyterLab and, due to the lack of public announcements, get excited about Netflix's Polynote or VSCode developments instead (understandable: VSCode has a wider user base because it is meant for general-purpose programming, while "Netflix's >>jupyter-killer<< notebook" sounds more catchy than "this now-mature good old Jupyter project got even better").

For example, find-and-replace is a killer feature, I dare say of more interest than the debugger, yet one would have to go to the changelog to learn about it! It is not even visible in the UI, just labelled as "find", and folks are still confused about it today. Potential users who only tried JupyterLab 0.3x may indeed believe that it is a poor interface lacking feature parity - but that is no longer the case.

On the PR note, JupyterCon is great, but would it not be attended by folks who are already ecosystem-aware? What about pushing a scholarly communication in the form of a feature article demonstrating the use-cases in various fields to a wider audience of researchers?

On the other hand, JupyterLab has a potential which is sometimes only utilised in external distributions, see Elyra for example (I wish their "script as a prime citizen" approach was adopted by the core). We may not be hearing from the users who are satisfied with such solutions as these are not JupyterLab branded.

lresende commented 4 years ago

Thank you to all who spent the time to help with this thread. I will start summarizing the very valuable feedback and try to come up with a set of areas to prioritize and suggested action items to be reviewed by the community.

I was also thinking to have a biweekly meeting to brainstorm and discuss these types of project directions (I will find a good calendar schedule for this).

Thoughts?

By the way, please continue to use this issue if you have more feedback and/or to add more comments.

matthew-brett commented 4 years ago

@krassowski - thanks - that's a useful insight.

Is it possible, do you think, that JupyterLab will end up ceding the simple notebook interface part of the market to VSCode, and will meanwhile establish itself as the best tool for creating composable data science interfaces?

If so - is that an acceptable outcome?

ellisonbg commented 4 years ago

Matthew, I believe JupyterLab should continue to push hard to serve its users, regardless of what VSCode does. Our users have spoken clearly that they miss the simplicity of the classic notebook. We are responding to that by adding a classic-notebook-like experience to JupyterLab 3.0 (same URL scheme, separate browser tabs, etc.). It won't be perfect at launch, but the architecture is there for us to iterate quickly on the UX during the 3.0 series. I think it will uniquely position JupyterLab as the only extensible frontend that offers the simple classic notebook experience - while still addressing the usage cases that led to creating JupyterLab in the first place (the need for more complex layouts).

choldgraf commented 4 years ago

A couple of quick thoughts from the conversations above:

I believe that the delay in transition from notebook to lab has something to do with public relations

I agree 💯 with @krassowski that improving documentation, patterns for users to follow, blog posts, etc would be a huge improvement in the JupyterLab ecosystem. I use Lab every day and pay a lot of attention to Jupyter-land, and I often don't know what's possible, what new features exist, etc. It would be great to find ways to surface more of this information via the community forum, blog posts, etc. I think related to that is documentation about how to use lab / IDEs in ways that complement one another.

JupyterLab needs to strongly define its own narrative and the niche that it wishes to fill in the data science ecosystem, otherwise others will define the narrative for it (e.g., by writing their own blog posts that describe it as a sub-par IDE).

has a potential which is sometimes only utilised in external distributions

Totally agree - I think this is a pattern that we should document better and help users understand. Other links above also show the power of this approach. But, I think many people don't understand that this is even possible, and don't understand that this isn't something you could do with VSCode. (though, you could do this with Theia...I'm still not sure how Theia fits into all of this)

I believe that VSCode is incompatible with the open source vision of Jupyter

I think there are a range of possibilities and relationships between JupyterLab / VSCode / Theia that are worth exploring. One extreme (which I think is what you point out here) would be "JupyterLab is just a VSCode extension" (similar to what the Julia community has done with Juno). I agree this is incompatible with the values of JupyterLab.

However, I think lesser extremes include things like (re)using VSCode components (e.g., Monaco, or supporting MS protocols like the LSP), or building UI elements on top of Theia instead of VSCode. I think it's important to explore these options rather than to write them off. I don't see them as a replacement of JupyterLab, more like "what tools could help JupyterLab developers focus on data science use-cases, and less time on web UI frameworks, while still adhering to Jupyter's mission and values?"

the simple notebook interface part of the market to VSCode

This feels a bit odd to me, as VSCode is definitely not a "simple" interface - it is an IDE. I haven't (yet) seen anything like the "classic" notebook interface in VSCode/Theia.

uniquely position JupyterLab as the only extensible frontend

I think on this point JupyterLab needs to be clearer about what it means when it says extensible, because I think there is a difference in definitions that most users don't understand. When I've spoken to VSCode users, they think of it as more extensible than JupyterLab (partially because the developer workflows and patterns are more streamlined and well-documented). If JupyterLab wants to be seen as "the more extensible option" we will need to make that case more strongly and clearly than we have so far. This relates to @krassowski's point about documentation - community docs are crucial so that users know what is actually possible with the technology.

Perhaps it would be useful to have a write up similar to the theia vs. vscode interview. I think many users will not understand the differences between JupyterLab and VSCode unless they're clearly stated somewhere.

matthew-brett commented 4 years ago

This feels a bit odd to me, as VSCode is definitely not a "simple" interface - it is an IDE. I haven't (yet) seen anything like the "classic" notebook interface in VSCode/Theia.

Aha - I'm probably missing some level of complexity. When I open an .ipynb file in VSCode I get an interface that allows me to shift-Enter through a notebook to run code / render Markdown, or press Return inside a cell to edit it. I see the plots and results inline, and so on.

https://code.visualstudio.com/docs/python/jupyter-support

Obviously, I could also do a lot more, because, as you say, it's an IDE, but just for the basic stuff, it looked pretty simple to me.

But returning to your point - maybe that would also be useful - something like - how VSCode interaction with notebooks differs from that of classic Jupyter / JupyterLab, whether that is structural or incidental, and whether it matters.

williamstein commented 4 years ago

(Disclaimer: I do not speak at all for the Jupyter project. I'm just a "user/developer" that started a company that benefits commercially from Jupyter.) There is some level of complexity involved with even installing VSCode + Jupyter support. It's a little embarrassing to admit, but I've tried multiple times over the last year (on Linux, but also on my Windows 10 laptop), and have not gotten VSCode + Jupyter to work. JupyterLab and Jupyter classic are both easy to install, for me at least. It's also easier (both technically and legally) to install and run JupyterLab on a remote server over the web than VSCode, which is one reason there's an easy 1-click way to run JupyterLab in CoCalc, but I don't know if we'll ever get VSCode to run there despite spending significant effort trying (CoCalc uses a base url for hosted web apps, which Jupyter has ridiculously good support for, but VSCode online has no support for).

@ellisonbg's comments above also resonate with me. I imposed a lot of unpleasant discipline on myself when implementing CoCalc-Jupyter to make it mostly look and feel like Jupyter Classic (with imperfect success, of course), just because that's what users wanted from me. I'm thrilled to hear something similar is on the roadmap for JupyterLab. Still, CoCalc overall gets regular user complaints for having only an overly complicated, open-ended interface (compared to Jupyter Classic), and we will have to come up with a way of allowing instructors to remove much of the functionality from the default view their students see.

To me, JupyterLab seems like part of the open source universe, where multiple projects that solve overlapping problems can happily coexist; hopefully it's not a winner-takes-all zero sum situation. There are probably at least a dozen viable Jupyter frontend clients right now, and to me that is one of the best things about Jupyter.

@matthew-brett asked: "My version of that question is - does JupyterLab need a radical change in direction, in order to avoid becoming irrelevant? [...] does that current course have a realistic hope of competing for users with - for example - VSCode notebook integration?"

If I do a Google search for "Jupyter" the top hit is jupyter.org, and jupyter.org seems to only mention JupyterLab, JupyterHub and Jupyter classic, and no other Jupyter clients (such as nteract, cocalc, deepnote, colab, databricks, vscode, etc.). My guess is that search traffic can have as much of an impact on competing for users as technical aspects of VSCode notebook integration vs JupyterLab:

image