Closed jochym closed 1 year ago
Yes, I am as surprised as you are! This is unfortunate news, and we will have to deal with it.
Collaborative notebooks are still a priority for us. The good news is that we have already been working on our own solution for real-time collaboration that does not depend on the Google Realtime API. Unfortunately, it is still uncertain how long that will take, and what it will ultimately look like.
It is important to note that the Google Drive API is still supported, so we will still be able to upload, share, and access notebooks using Google Drive. It's just the realtime capabilities that will go away (until we get our own solution running).
Without real-time features this will be just a shadow of the current tool. Well, it is better to know a year in advance, I guess. We (developers/community) have our goal for the next year picked for us :-/ As a new user of the lab-with-realtime-colab I can just assure the developers that this is a fantastic tool and it is really appreciated by my fellow users, with particular emphasis on the realtime collaboration part.
Glad to hear you are finding it useful. Hopefully the transition to the new approach is not too bumpy.
Is Firebase an option? Real-time JSON datastore with access to any level of the tree. Also notifies and broadcasts updates.
The main problem with Firebase, as far as I know, is that it does not have the sophisticated conflict resolution algorithms (like operational transforms or CRDTs) that are required for collaborative text editing. At least, that has been my reading of the documentation. If I am mistaken in that, please do let me know!
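To make the conflict-resolution requirement concrete, here is a minimal sketch of an operational transform for two concurrent text insertions. It is an illustration only: the `Insert` type, the `site` tie-breaker, and the `transform` function are hypothetical names, not part of Firebase, Google Realtime, or JupyterLab, and a real implementation would also need to handle deletions and longer operation histories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str   # text to insert
    site: str   # hypothetical client id, used only to break ties


def apply(doc: str, op: Insert) -> str:
    """Apply an insert operation to a plain string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent op `against`."""
    shifts = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if shifts:
        # The other insert landed earlier in the text, so move right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


base = "Hello world"
a = Insert(5, ",", "alice")   # alice wants "Hello, world"
b = Insert(11, "!", "bob")    # bob wants "Hello world!"

# Each replica applies its own op first, then the transformed remote op.
alice_view = apply(apply(base, a), transform(b, a))
bob_view = apply(apply(base, b), transform(a, b))

# Without transformation, bob's op lands at a stale index on alice's replica.
naive_alice_view = apply(apply(base, a), b)

print(alice_view, "|", bob_view, "|", naive_alice_view)
```

With the transform, both replicas end up with the same text. A datastore that only broadcasts the raw operations leaves the receiving side with the stale-index result instead.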
Coincidentally, I followed the links on Realtime API warning to use Cloud Firestore and it went to:
https://firebase.google.com/products/firestore/
Will need to read up more on Firebase and Firestore. Seems like the latter is the newer service and may be more applicable.
It seems that unlike Google Drive/Realtime, Cloud Firestore requires a Google API account for billing.
What would be required to at least get real-time running on the same jupyterlab server?
Suppose you and I connected to the same server and opened the same notebook. What would it take for the server to broadcast changes between the two clients?
JupyterLab already allows multiple views into a notebook from within a browser window, so it seems possible for a server-hosted file to have its changes broadcast.
Also, this would allow for the sharing of a kernel.
So far as I have been able to tell, neither Cloud Firestore nor Firebase provides the conflict-resolution capabilities that we need (in addition to being paid products).
We are currently working on a solution that is hosted with the notebook server. At the moment, the server has no APIs for broadcasting changes and resolving conflicts, which is why it is a tricky problem. The case with multiple views in the same browser window works because they are both stored in memory, and the user is not able to make concurrent, conflicting edits.
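As a rough illustration of why broadcasting alone is not enough, the sketch below models a server-side relay that accepts an edit only if it was made against the latest revision. Everything here is an assumption for illustration: `NotebookHub`, `Client`, and `submit` are hypothetical names and do not correspond to any existing notebook server API.

```python
from dataclasses import dataclass, field


@dataclass
class Client:
    name: str
    text: str = ""
    revision: int = 0

    def receive(self, revision: int, text: str) -> None:
        # Adopt the server's state as authoritative.
        self.revision = revision
        self.text = text


@dataclass
class NotebookHub:
    """Hypothetical in-memory relay, not a real Jupyter server component."""
    text: str
    revision: int = 0
    clients: list = field(default_factory=list)

    def join(self, client: Client) -> None:
        self.clients.append(client)
        client.receive(self.revision, self.text)

    def submit(self, base_revision: int, pos: int, inserted: str) -> bool:
        # An edit based on an old revision is a concurrent edit. Without
        # OT or a CRDT the hub cannot merge it, so it is rejected.
        if base_revision != self.revision:
            return False
        self.text = self.text[:pos] + inserted + self.text[pos:]
        self.revision += 1
        for client in self.clients:
            client.receive(self.revision, self.text)
        return True


hub = NotebookHub("x = 1")
alice, bob = Client("alice"), Client("bob")
hub.join(alice)
hub.join(bob)

# Both users start typing against revision 0 at the same time.
alice_base, bob_base = alice.revision, bob.revision
alice_ok = hub.submit(alice_base, 5, "\ny = 2")
bob_ok = hub.submit(bob_base, 5, "\nz = 3")  # stale revision

# Bob retries against the revision he received in the broadcast.
bob_retry_ok = hub.submit(bob.revision, len(bob.text), "\nz = 3")
```

Rejecting stale edits keeps every client consistent, but it forces users to retry whenever they type at the same time. That is the gap that operational transforms or CRDTs are meant to close.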
The intention is that, with the self-hosted solution we are working on, kernels could be shared.
Firepad is using firebase for the same purpose. https://github.com/firebase/firepad
I believe that they have their own operational transform implementation that they use (rather than something built into Firebase) https://github.com/firebase/firepad/blob/master/lib/text-operation.js
what about https://github.com/P2Pvalue/swellrt?
Look at the Atom package teletype (MIT license) CRDT implementation. They cite three papers for their CRDT implementation (the papers are paywalled).
teletype-crdt: The string-wise sequence CRDT that enables peer-to-peer collaborative editing.
teletype-server: The server-side application that facilitates peer discovery.
teletype-client: The editor-agnostic library that manages the interaction with other clients.
https://github.com/atom/teletype
https://github.com/atom/teletype-crdt
https://github.com/atom/teletype-client
https://github.com/atom/teletype-server
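For readers unfamiliar with sequence CRDTs, here is a toy sketch of the general idea. It is not the teletype-crdt algorithm: it assumes a much simpler scheme in which every character gets a fractional position key plus a site id and counter, deletions are tombstones, and merging is a set union. The `Replica` class and its methods are hypothetical names. The sketch shows convergence only and does not handle inserting between two characters that were concurrently given the same fraction.

```python
from fractions import Fraction


class Replica:
    """Toy sequence CRDT replica (illustrative, not teletype-crdt)."""

    def __init__(self, site: str) -> None:
        self.site = site
        self.counter = 0
        self.chars = {}           # (position, site, counter) -> character
        self.tombstones = set()   # keys of deleted characters

    def _visible(self):
        # Sorting the keys gives the same order on every replica.
        return sorted(k for k in self.chars if k not in self.tombstones)

    def text(self) -> str:
        return "".join(self.chars[k] for k in self._visible())

    def insert(self, index: int, char: str) -> None:
        keys = self._visible()
        left = keys[index - 1][0] if index > 0 else Fraction(0)
        right = keys[index][0] if index < len(keys) else Fraction(1)
        self.counter += 1
        # Pick a position between the two visible neighbours.
        self.chars[((left + right) / 2, self.site, self.counter)] = char

    def delete(self, index: int) -> None:
        self.tombstones.add(self._visible()[index])

    def merge(self, other: "Replica") -> None:
        # Union of inserts and deletes; the order of merges does not matter.
        self.chars.update(other.chars)
        self.tombstones |= other.tombstones


alice, bob = Replica("alice"), Replica("bob")
for i, ch in enumerate("cat"):
    alice.insert(i, ch)
bob.merge(alice)

# Concurrent edits made without any coordination.
alice.insert(3, "s")   # alice now sees "cats"
bob.delete(0)          # bob now sees "at"
bob.insert(0, "b")     # bob now sees "bat"

alice.merge(bob)
bob.merge(alice)
print(alice.text(), bob.text())
```

Because merging is just a union of keys, no central server has to order the edits, which is what makes the peer-to-peer setup described above possible.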
Rstudio (paid version) and Cloud9 enable collaborative editing using Ace (switched to BSD license).
https://github.com/ajaxorg/ace
VSCode (MIT license) added collaborative editing to the insiders edition in November. https://github.com/Microsoft/vscode
There's also etherpad's changeset library + server which work well
Hey guys! Check out this post for some options: https://www.quora.com/What-are-good-frameworks-for-real-time-collaboration-in-a-web-application
And since you guys seem to be comfortable with proprietary solutions, I must humbly suggest http://convergence.io as well. It really is the quickest, most reliable path to realtime collaboration.
Just saw this today. Super bummed. I was going to introduce JupyterLab+Google-Drive-Extension to the Intro to Python class I am teaching, but I don't want to show anything that will stop working at the end of the year. As for a replacement, I second @stoneyv's suggestion: Atom TeleType. I already use Atom to run Python/R scripts using nteract's Hydrogen plugin: https://nteract.io/atom https://atom.io/packages/hydrogen Hydrogen uses Jupyter kernels for in-line output, code completion, documentation etc.: https://nteract.io/kernels Atom also has excellent git integration (maybe because it is made by GitHub 😀). If Atom TeleType-like collaboration can be implemented into JupyterLab, that would be awesome!
There is another interesting tool http://codestrates.org/ which is based on ShareDB
Codestrates is a literate computing approach to developing interactive software inspired by interactive notebooks such as Jupyter notebook. However, in Codestrates, real-time collaboration is built in, it is possible to create stand-alone applications with persistent state, and to reprogram the functionality of the environment itself.
There is an introduction video on their site and a demo where you can create a codestrate and play with it.
https://github.com/share/sharedb
ShareDB is a realtime database backend based on Operational Transformation (OT) of JSON documents.
- Realtime synchronization of any JSON document
- Concurrent multi-user collaboration
- Synchronous editing API with asynchronous eventual consistency
- Realtime query subscriptions
- Simple integration with any database - MongoDB, PostgreSQL (experimental)
- Horizontally scalable with pub/sub integration
- Projections to select desired fields from documents and operations
- Middleware for implementing access control and custom extensions
- Ideal for use in browsers or on the server
- Reconnection of document and query subscriptions
- Offline change syncing upon reconnection
- In-memory implementations of database and pub/sub for unit testing
Hi,
it's unfortunate that Google is deprecating its API. Have you guys been able to work out a solution? I ask because I am a frequent Jupyter user and the possibility of a collaborative platform for notebooks was surreal for me. How long could it take to move completely off the Google Realtime API?
So I just watched https://channel9.msdn.com/Events/PyData/Seattle2017/BRK11 and was interested in the real time collaboration. First let me say THIS IS A HARD PROBLEM. The work that was done in the video, and thus the reason for this thread, seems to be in peril due to the deprecation of the real time API. This is unfortunate, but I was coming here to say that I was hoping to drive some conversation around an API that would allow real time without an external connection to google drive. I am hoping for some thought to a self-contained API that an organization can host 100% internally. This would help adoption in classified networks, or other areas where things must be 100% self-contained and not reliant on a cloud service like google drive. With the deprecation of the API, perhaps we now have an opportunity to consider this use case when pushing the next iteration of real time collaboration.
John, we at Convergence Labs have been building realtime collaborative apps for over a decade, and we've essentially built that API. It is indeed a hard problem. We've run into the vast majority of problems people tend to face over the years, and have wrapped up the solutions into one product. There is the requisite support for data synchronization, but also first-class support for things like remote cursors and selections. We additionally have an on-premise solution for organizations needing to keep their data.
Ian, we've done extensive consulting with software companies getting their feet wet in realtime collaboration, and in the interest of moving forward the state of the art in realtime collaboration apps, we'd be happy to have a conversation about your working solution. Jupyter Labs is one of those cutting-edge apps that we'd love to see succeed, regardless of the underlying technology being used.
I think it'd be important to have an OSS solution for the realtime API work. Then it can't be withdrawn without recourse. It'd also be much easier to install in many environments (many organizations won't want their data outside their organization, and so they'll want to be able to do a local install).
Yes, I would also prefer some open source, decentralized solution. Maybe something based on WebRTC, like Atom Teletype or this project: https://github.com/Chat-Wane/CRATE . There are also people working on collaborative editing on top of IPFS, https://ipfs.io/blog/30-js-ipfs-crdts.md
Jupyter uses a central server (jupyter notebook server) so I don't see the interest of a decentralized solution for real-time collaboration.
@piec: I guess it was about a self-hosted solution, not one centralized at some particular service provider (Google).
@jochym @piec Well, isn't the standard scenario that people are running their own notebook server locally on the machine they are sitting in front of? Does the standard notebook server have a notion of multi-user, or wasn't that the purpose of jupyter hub?
I imagine that I can send someone a link or hash or whatever, and then our two notebook servers connect and we can edit documents together. Like in the Atom editor with Teletype plugin (where Atom as an electron app is basically a web server + browser combined into a desktop program).
@aweisse : I do not think there is really a "standard scenario". People use JLab in so many ways. I hardly use it on my laptop - most of the time it is on my dept. server over jhub. I am quite sure there are multiple other schemes.
All: @stas-sl and @manigandham mentioned ShareDB, and just wondering why that's not in the mix.
I'm using an older version in pithy (https://github.com/pithy/dansteingart) and for what I need it to do it's solid. Notebooks are more complex for sure, but ShareDB seems built to handle OT for json docs. Just curious why it's not discussed more here?
Thanks in advance.
@ian-r-rose do you have any links to projects/proposals for the work you mentioned that is happening with respect to in house real-time implementations? I agree that an implementation on the Jupyter server is the route to take, as it has a similar job to the ContentsManager. Potentially even an opt-in API for ContentsManager implementations?
Either way I would love to see how it is progressing/help if I can and so some direction to where that progress is happening would be much appreciated. Thanks!
@ian-r-rose : same as @SpencerPark
@dansteingart : I don't know much about the subject, but ShareDB does look like a good candidate. However, I can't find a lot of information on how it implements conflict resolution -> any idea where explanations can be found ?
The current work in phosphor in the feature-tables branch is going to provide a CRDT implementation for real-time collaboration. We hope to start refactoring some of the core JupyterLab APIs to take advantage of this at some point in the next few weeks.
@Ericvulpi ShareDB uses operational transforms (similar to Google Drive, different from CRDTs) to perform its conflict resolution. I investigated it about a year ago, but found the quality of documentation to be so spotty that I was unable to make much progress.
Perhaps this is related to Google's decision to deprecate: https://colab.research.google.com/notebooks/welcome.ipynb
Yes, I think you are absolutely right @dangirsh. They are still clearly using it for their tools (Google Docs, Colab), it is just no longer an open API.
@dangirsh : I must admit their Colaboratory project looks really easy to use and feature rich. I especially like their implementation of the widgets concept with the "add form field" option they just added. It's similar to ipywidgets but more integrated, with direct effect on the code text.
This might be a good replacement https://github.com/conclave-team/conclave. Here is a demo www.conclave.tech and their case study https://conclave-team.github.io/conclave-site/
I'm interested in helping out with this. @ian-r-rose, or anyone else, could you please describe the current state of the effort, both in terms of backend and frontend? Links to WIP branches would be most helpful. And any suggestions for where I can get stuck in.
@lukemarsden I worked on refactoring the atom teletype server to remove external dependencies. I think it would be good to have a jupyterlab client that can talk to the teletype server.
Here are the open pull requests to remove external dependencies: https://github.com/atom/teletype-server/pull/50 https://github.com/atom/teletype-client/pull/67 https://github.com/atom/teletype/pull/393
The server code can be started with a simple docker-compose up -d command.
So what IS the alternative? Was there ever a decision made on what to replace the google API with?
IPFS and Conflict-Free Replicated Data Types are a really promising option for a long term solution. https://ipfs.io/ipns/blog.ipfs.io/30-js-ipfs-crdts.md
Finally stopped working. Is there any alternative @ian-r-rose ?
No realtime alternative at the moment :( . I have published a new version that removes realtime altogether, so drive integration should still work with this plugin.
Ok. Thanks. Good to know. Is there any place we (the users) should watch for potential new solution?
You can watch here. I hope to have a bit more concrete of an idea in the coming weeks.
I know it was ages since you all posted here, but FYI there are two ongoing efforts to bring collaborative editing to JupyterLab:
The best way to stay up to date is to subscribe to updates on https://github.com/jupyterlab/jupyterlab/issues/5382
@krassowski - great, thanks for letting us know!
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
https://discourse.jupyter.org/t/sync-save-notebooks-to-google-drive-like-stackedit/13125/2
This is what I got a minute ago. Putting an issue out as a starting point for discussion. What now? The Google-Drive-backed notebooks in the Lab are a really fantastic collaboration tool. Is there any similar technology we can migrate to?
Hello Google Realtime API developer,
We’re writing to let you know that after careful consideration, we are deprecating the Google Realtime API as of November 28, 2017.
Your Realtime API client applications will continue to work normally until December 11, 2018. To ensure continued availability after this date, please migrate your applications using Realtime API to another data store before December 11, 2018. You can read more about the deprecation in our Realtime API deprecation documentation.
We know developers have come to rely on Realtime API and that migration may be a significant effort. We are grateful to our developers, and we hope that the deprecation plan summarized below allows a smooth transition for you and your product(s).