ipfs / devgrants

The IPFS Grant platform connects funding organizations with builders and researchers in the IPFS community.
166 stars 75 forks source link

Open Grant: Prolly Trees in Rust #257

Closed SionoiS closed 1 year ago

SionoiS commented 1 year ago

Open Grant Proposal: Prolly trees in Rust

Name of Project: Prolly Trees in Rust

Proposal Category: core-dev

Proposer: @SionoiS

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT, APACHE2, or GPL licenses?: Yes

Project Description

As the amount of data grows in the IPLD ecosystem the need for efficient data-structures becomes more pressing. Databases built on top IPLD data can only thrive with fast indexing and CRDTs are used often in distributed systems but have overhead that centralized solutions don't.

I plan to write an implementation of Prolly Trees with different chunking algorithms in Rust lang. The specification to be followed is being finalized in another grant

Value

Prolly trees are ordered while probabilistically balanced and deterministic in representation, properties crucial in building CRDTs & database indexes.

Prolly trees are the fruit of recent research and may not be practical to implement even if theoretically simple. It is my opinion that it is still worth a try.

Deliverables

Development Roadmap

Start date 3 January 2023

  1. Code an implementation of the specifications in Rust lang. 5 weeks
    • Given an IPFS API; get, insert, remove and iterate over an IPLD Prolly tree.
    • Block representation only, no in-memory representation for simplicity.
    • Up to 2 different chunking algorithm.
  2. Document the code, write tests, optimization and bug fixes. 3 week
    • Work closely with the spec writers and incorporate feedback from IPLD users.
    • Documenting all the tweaks and knobs.
    • Find good default values.
    • Write test fixtures if needed.

Total Budget Requested

  1. Milestone 1 Implementation. 9,000$
  2. Milestone 2 Code documentation, Tests and Feedback. 5,400$

Total 14,400 USD

Maintenance and Upgrade Plans

Since in Rust, the IPFS API has no standard, the implementation would only serve as guide or could become a standard afterward.

Team

Simon-Pierre Vivier (@SionoiS)

Relevant Experience

I implemented a HAMTs based on the IPLD Specs and built my own protocol with IPLD in previous grant work.

Team code repositories

Defluencer Protocol

RangerMauve commented 1 year ago

Hey! Some thoughts:

RangerMauve commented 1 year ago

cc @mikeal @mikolalysenko since you've got a lot of insight on stuff like this from the prolly trees work.

RangerMauve commented 1 year ago

Asked some of the folks I'm working with on #194 and it seems we have a similar approach in our spec but with a different chunking strategy and without the focus on CRDTs (yet).

Would it make sense to join forces somehow? We wanted to get a Rust implementation made from our spec soon ish anyway.

SionoiS commented 1 year ago

How does this relate to Prolly Trees and https://github.com/ipfs/devgrants/pull/194 ?

Reading on the subject now. Both seams to have the same properties.

Would it make sense to add another week at the end of each to get feedback from IPLD folks and others so that the spec can be merged more smoothly?

More feedback the better! If IPLD folks have some bandwidth for that, good.

Would it make sense to budget for a more extensive example of how to use the MST?

In Rust, the IPLD ecosystem seams not as advanced. Without standard traits, for example the IPFS API, it's harder to not reinvent the wheel every time. I felt that keeping the scope small was best.

Would it make sense to join forces somehow? We wanted to get a Rust implementation made from our spec soon ish anyway.

I could work on that too, no problem.

RangerMauve commented 1 year ago

Had a call today, might be useful to rework this proposal to be about implementing the spec from #194 and I'm into helping find technical sponsors for that (maybe I could be one even?).

We'll also be seeing if we can get @SionoiS on the IPLD team or something adjacent to improve the state of Rust IPLD in general. I'll be meeting with IPLD folks next week to discuss this.

realChainLife commented 1 year ago

@SionoiS will you be reworking the proposal to reflect the outcome of your discussion with @RangerMauve. We'd be happy to review that while additional conversations with the IPLD team are on-going.

SionoiS commented 1 year ago

@realChainLife Doing it now. Sorry for the delay.

The proposal ha been reworked. I'm still open for feedback! @RangerMauve

RangerMauve commented 1 year ago

Hey! We just did an initial PR of the prolly tree spec. https://github.com/ipld/ipld/pull/254

With that in mind, I think the wording for 1.3 should be "implement up to two chunking algorithms" since that might change in the near future.

It might also be good to add an extra week or two of ensuring compliance with the spec by loading fixtures from the golang implementation (pending completion of milestone 2).

One other thing, we're planning on meeting with some Rust developers in the PL ecosystem and get more insight on what we really need from rust-ipld and the overlap in what the FVM currently needs implemented. This might also affect some changes in how the IPLD side of the rust implementation would need to adjust. Might be good to budget a bit of time to do updates (unless we want to leave that to a subsequent devgrant?)

Depending on timelines it could also be that the rust implementation can inform the golang one and the spec depending on constraints in Rust-IPLD. Not sure if wording for dedicating some time to figure that out should be in the proposal, but I think it's stuff that would be useful to budget time for.

Outside of that, this is very fortunate timing and could be a huge use for projects FVM and all the other projects starting to do more Rust in the IPFS ecosystem.

SionoiS commented 1 year ago

Hey! We just did an initial PR of the prolly tree spec. https://github.com/ipld/ipld/pull/254

I'll check it out!

With that in mind, I think the wording for 1.3 should be "implement up to two chunking algorithms" since that might change in the near future.

Ok I'll change that.

It might also be good to add an extra week or two of ensuring compliance with the spec by loading fixtures from the golang implementation (pending completion of milestone 2).

That is part of milestone 2 already, I'll make that specific. The entire project could be bigger but I left out sets intersection/unions/diffs for simplicity. Since I only planned; insert, remove, get and iter the tests are not going to take too much time. I could also write the test fixture if some are missing.

One other thing, we're planning on meeting with some Rust developers in the PL ecosystem and get more insight on what we really need from rust-ipld and the overlap in what the FVM currently needs implemented. This might also affect some changes in how the IPLD side of the rust implementation would need to adjust. Might be good to budget a bit of time to do updates (unless we want to leave that to a subsequent devgrant?)

Depending on timelines it could also be that the rust implementation can inform the golang one and the spec depending on constraints in Rust-IPLD. Not sure if wording for dedicating some time to figure that out should be in the proposal, but I think it's stuff that would be useful to budget time for.

Outside of that, this is very fortunate timing and could be a huge use for projects FVM and all the other projects starting to do more Rust in the IPFS ecosystem.

I feel like it's too early. Once the needs are more actionable I'll write another proposal if needed. I also hate working on vague and long project and sets manipulations could be my next grant work anyway.

I'm still available for Rust IPLD roadmapping.

ErinOCon commented 1 year ago

Thanks, @SionoiS! Do you have a preferred email for discussing the next steps of this grant?

ErinOCon commented 1 year ago

Thanks, @SionoiS! We have reached out by email.

SionoiS commented 1 year ago

@ErinOCon I made a mistake it's defluencer (at) protonmail (dot) com

Sorry about that!

ErinOCon commented 1 year ago

Hi @SionoiS, thank you for the update! The email has been resent.

maceip commented 1 year ago

went down a rathole today and startd writing rust bindings for https://github.com/kenlabs/go-ipld-prolly-trees

would have probably been faster to just cleanroom implement it in rust, /sighffi

https://github.com/maceip/prollyrs

DougAnderson444 commented 7 months ago

@ErinOCon @SionoiS I am looking for the resulting Rust code for this grant, is it available somewhere? Mind linking it? Thanks!

SionoiS commented 7 months ago

@DougAnderson444 https://github.com/Defluencer/rust-defluencer/tree/develop/defluencer/src/indexing/ordered_trees