jupyter-server / team-compass

A repository for team discussion, syncing, and meeting notes.
https://jupyter-server-team-compass.readthedocs.io
BSD 3-Clause "New" or "Revised" License
14 stars 8 forks source link

Move pycrdt to jupyter-server #55

Closed davidbrochart closed 10 months ago

davidbrochart commented 11 months ago

Context

I have been working a new Python bindings for Yrs, the Rust port of Yjs. In the Jupyter stack, we currently use Ypy. See this issue for the motivations behind pycrdt. I already opened the following PRs that update our stack to pycrdt:

I already merged a PR to update Jupyverse to pycrdt:

But I didn't open the equivalent PR in jupyter-collaboration, mainly because there are pending PRs that may change the code quite a bit, and that would result in a lot of conflicts. If we decide to use pycrdt instead of Ypy, we should move it to either the y-crdt organization or the jupyter-server organization. I would prefer the latter for now (it could go back to y-crdt in the future). That means the new version of ypy-websocket that uses pycrdt should also move to jupyter-server. Are there any objections about moving pycrdt to jupyter-server, along with ypy-websocket?


Proposal

Let's move pycrdt to jupyter-server


Votes

This vote was initiated on 16 November 2023.

echarles commented 11 months ago

What is the motivation for that change.

Are there any quantitative measures of any regressions and/or enhancement that would be introduced by this new library ? (linked to https://github.com/jupyterlab/jupyterlab/issues/14532 and https://github.com/datalayer/jupyter-rtc-test)

davidbrochart commented 11 months ago

What is the motivation for that change.

See https://github.com/y-crdt/ypy/issues/146.

Are there any quantitative measures of any regressions and/or enhancement that would be introduced by this new library ?

No numbers to show, but pycrdt being a mixed Python/Rust project (in contrast with y-py which is pure Rust), it should be slower because the boundary between both languages is crossed more often. On the improvement side, the API is more pythonic and more powerful (e.g. implicit transactions), and the smaller amount of Rust code makes it simpler to maintain IMO.

Zsailer commented 10 months ago

pycrdt looks great, @davidbrochart! I spent some time diving into the details; pycrdt is so clean. Nice work!

We have an (undocumented—hopefully fixed soon) process for migrating repos to this org using the team-compass repo. See the FileID extension as an example.

I'm going to move this issue over there.

Zsailer commented 10 months ago

@jupyter-server/jupyter-server-council take a look at pycrdt and we can vote on bringing this into our org. Any questions or concerns?

davidbrochart commented 10 months ago

There were also some discussions with @dmonad about moving pycrdt to https://github.com/y-crdt, basically replacing ypy with pycrdt, but I'm not sure it is ready for that.

davidbrochart commented 10 months ago

I think pycrdt is now ready to be moved to jupyter-server.


Edited by @afshin to move the vote up to the top-level comment.

ivanov commented 10 months ago

I'm not in the weeds here, so I am abstaining and trusting those more involved to make the right call, but I'll just point out that it seems a unusual to me to create a new library, migrate to it, and then have it folded into the server project, all within a 2 month span. Ypy looks like has had a handful of contributors over the past ~2 years, including David, but pycrdt has been a solo effort by David. I think even if I was more involved here, it would be too early to say if this is an appropriate move. Aside from the technical differences of the two projects, I don't understand the motivation behind the move to jupyter-server organization so soon.

To draw a parallel from Jupyter's own history, from the very beginning of the prototype that became Jupyter, the protocol used the ZeroMQ library, and the Python ZeroMQ bindings were written by our own Brian Granger (@ellisonbg), and our own Min Ragan-Kelley (@minrk) continues to be the primary developer of PyZMQ continues today. But PyZMQ, as a generally useful library outside of what just building interactive computing, wasn't folded into what was then the IPython organization, and instead found its way to the ZeroMQ one. Perhaps pycrdt, too, would best be served by growing outside of just its use in jupyter-server?

davidbrochart commented 10 months ago

It's true that pycrdt is relatively new, but I honestly believe that it has reached a level where I'm confident in the approach. Ypy is quite different, because it's a pure-Rust project. It's probably faster, but at the cost of complexity, to the point that it's basically unmaintained. If we can find someone to maintain it, it would be another story, but I started pycrdt precisely because I couldn't do it. Now pycrdt has features that Ypy doesn't have, which allowed me to demonstrate full server-side notebook execution and state recovery, including widgets, at today's Jupyter Server call. If we agree this is the direction Jupyter should take, Ypy doesn't get us there as of today. Yjs and Yrs authors are in favor of purely replacing Ypy with pycrdt in the y-crdt organization, but I think it would be better to keep these two different implementations. Pycrdt has everything we need in Jupyter, but is currently missing the XML shared type, simply because we don't need it (although it might be simple to add). Also, pycrdt's API is very different (more Pythonic), so we're not only talking about changes under the hood. Moving pycrdt to the jupyter-server organization would allow to build all these new features on top of it. Notebook state recovery has been asked for a very long time, and CRDT-based widgets will open new doors to collaborative editing.

willingc commented 10 months ago

@davidbrochart Thanks for the additional context. A few questions:

Why move into pycrdt into the jupyter server org?

Do we expect this to be the foundation of a server side notebook model?

Do we expect this library to be a dependency of Jupyter server?

willingc commented 10 months ago

I would like to see something added to the README of pycrdt if it moves over that explains the rationale about why this is hosted in the Jupyter server org and when it was migrated to the org.

davidbrochart commented 10 months ago

Why move into pycrdt into the jupyter server org?

To summarize, I think Ypy still has its place in y-crdt, and we don't want two Python implementations there. But Ypy is almost unmaintained due to its complexity, that's why I started pycrdt. I'm adding features to pycrdt that we will need in Jupyter, so I'd like to update our stack to pycrdt. But for that it has to live somewhere else than on my account. The jupyter-server organization is a good place, since CRDTs are used in the backend.

Do we expect this to be the foundation of a server side notebook model?

Yes, I already opened a PR for that.

Do we expect this library to be a dependency of Jupyter server?

No, but it will be a dependency of jupyter-collaboration and ypy-websocket (see this PR). Thus, ypy-websocket will have to move to jupyter-server too.

ellisonbg commented 10 months ago

I have looked through the code and am excited to see us helping to simplify and improve the Yjs Python APIs. Given this is a new effort, I think it is a good situation to treat this as an "incubating project" in Jupyter until we see how it all plays out. I know the original Jupyter incubation process needs to be refreshed by the Software Steering Council, so we can't quite use that as it stands today. When this (incubation) situation came up with Jupyter AI, the JupyterLab council agreed to house Jupyter AI under temporary incubation status until the Jupyter incubation process is refreshed. Would others be open to treating moving pycrdt to the jupyter-server org under this type of temporary incubation status?

On Mon, Nov 20, 2023 at 1:15 AM David Brochart @.***> wrote:

Why move into pycrdt into the jupyter server org?

To summarize, I think Ypy https://github.com/y-crdt/ypy still has its place in y-crdt https://github.com/y-crdt, and we don't want two Python implementations there. But Ypy is almost unmaintained due to its complexity, that's why I started pycrdt. I'm adding features to pycrdt that we will need in Jupyter, so I'd like to update our stack to pycrdt. But for that it has to live somewhere else than on my account. The jupyter-server https://github.com/jupyter-server organization is a good place, since CRDTs are used in the backend.

Do we expect this to be the foundation of a server side notebook model?

Yes, I already opened a PR https://github.com/jupyter-server/jupyter_ydoc/pull/194 for that.

Do we expect this library to be a dependency of Jupyter server?

No, but it will be a dependency of jupyter-collaboration https://github.com/jupyterlab/jupyter-collaboration and ypy-websocket https://github.com/y-crdt/ypy-websocket (see this PR https://github.com/y-crdt/ypy-websocket/pull/89). Thus, ypy-websocket will have to move to jupyter-server too.

— Reply to this email directly, view it on GitHub https://github.com/jupyter-server/team-compass/issues/55#issuecomment-1818428383, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAGXUBPVUZQL3YCOLMPLY3YFMGQ7AVCNFSM6AAAAAA7DS7TFOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJYGQZDQMZYGM . You are receiving this because you were mentioned.Message ID: @.***>

-- Brian E. Granger

Senior Principal Technologist, AWS AI/ML @.***) On Leave - Professor of Physics and Data Science, Cal Poly @ellisonbg on GitHub

willingc commented 10 months ago

Thanks @davidbrochart for the thoughtful responses.

I agree @ellisonbg re: incubation. I would be fine with a note at the top of the README that indicates incubation status for say 6 months, and moving pycrdt to the org. :sunny:

Moving my vote to approve.

davidbrochart commented 10 months ago

As of today we have 7 "yes" and 1 "abstain", so I guess we can proceed. As I already mentioned, this also means moving ypy-websocket. I'm planning to move it to ypywebsocket, so that ypy-websocketcan still exist in y-crdt and on PyPI.

Zsailer commented 10 months ago

This vote has been open for 3 weeks and the vote count is decisive, so I think we can close it here. Thanks, @davidbrochart!

We have 8 "yes" votes, 0 "no" votes, 2 "abstain" votes, and 4 non-votes.

Let's move forward and transfer pycrdt here. 🚀

davidbrochart commented 9 months ago

I moved pycrdt and pycrdt-websocket.

rgbkrk commented 9 months ago

Exciting! Thanks for pushing forward on pycrdt!

ivanov commented 9 months ago

just want to chime in here that though the project was moved into the jupyter-server organization, it does not yet reflect the "incubation" status that @willingc and @ellisonbg advocated for.

ellisonbg commented 9 months ago

Thanks @ivanov - please indicate the incubating status.

Zsailer commented 9 months ago

Thank you @ivanov and @ellisonbg for catching this! I've opened PRs against both repos with the status set in the README, per @willingc's suggestion 😃