qri-io / qri

you're invited to a data party!
https://qri.io
GNU General Public License v3.0
1.11k stars 66 forks source link

feat(collection): expand `WritableSet` interface, add option to pass in custom `WritableSet` #1890

Closed ramfox closed 3 years ago

ramfox commented 3 years ago

Because we need to repeat many of these handler/event patterns in cloud, we are dividing up the Collection to make it more reusable.

The Collection is now a struct that contains a WritableSet. The Collection is responsible for listening to events and calling various Put-like functions on the WriteableSet when certain events are handled.

To keep things moving, rather than refactor a ton about the collection package, first I'm keeping these Put-like functions as close as possible to the methods that already existed on localSet. These can most likely be narrowed in future prs.

The spec package as been remove, and the tests moved into the collection packages itself. Although, AssertWritableCollectionSpec with some tweaking and additional tests should probably be moved back into a spec package, especially once the interface settles.

dustmop commented 3 years ago

Alright, I have some suggestions on how this could work. Feel free to update / tweak / modify / ignore as needed.

1) The newly introduced wrapper called collection.Collection should be renamed to SetMaintainer (I am not at all tied to this name, there is probably something better) as what it is doing is hooking up the event bus to a Set. Move it out of the collection package and into some other directory (maybe lib/collection_set.go? I am not sure). This is done because SetMaintainer is one specific implementation rather than generic collection functionality. After that, the collection package does not know about event at all, which is nice for cloud.

2) lib/lib.go constructs this SetMaintainer and holds onto it, but lib/scope.go should just have a method called CollectionSet that returns a collection.Set. The users of scope don't know that their set is subscribed to events.

3) Combine the Set and WritableSet interfaces to get this:

// Set is a grouping of all users' collections that this node knows of
type Set interface {
    // List the collection of a single user
    List(ctx context.Context, pid profile.ID, lp params.List) ([]dsref.VersionInfo, error)
    // Get info about a single dataset in a single user's collection
    Get(ctx context.Context, pid profile.ID, initID string) (dsref.VersionInfo, error)
    // Put adds a list of `dsref.VersionInfo`s to a user's collection
    PutList(ctx context.Context, pid profile.ID, item []dsref.VersionInfo) error
    // Add adds a single dataset to a user's collection
    Add(ctx context.Context, pid profile.ID, add dsref.VersionInfo) error
    // RenameUsername changes a user's name
    RenameUsername(ctx context.Context, pid profile.ID, newUsername string) error
    // UpdateEverywhere updates a dataset in all collections that contain it
    UpdateEverywhere(ctx context.Context, initID string, mutate func(vi *dsref.VersionInfo)) error
    // Delete removes a single dataset from a single user's collection
    Delete(ctx context.Context, pid profile.ID, removeID string) error
}

changes of note:

4) Note that this interface matches pretty closely to how SQL operates. Hopefully this means it ends up being a good match for how cloud wants to operate.

One final mention, operations like RenameUsername, which used to be UpdateUsernameAcrossAllCollections makes me think we should be denormalizing data. That is, each reference in the collection set shouldn't contain usernames themselves. Rather, they should be using profileIDs that are then mapped using another table or storage mechanism to the usernames. Then a username rename means modifying only a single piece of data rather than iterating over every element to modify just one field. This is why I avoid the Everywhere suffix in the method name. I don't think this is how things work right now, but I do think it's something we should aspire to.

Arqu commented 3 years ago

I'm down with all of this, will leave final call to Dustin as he reviewed initially. I love the cleanup, and particularly the transition to profile.ID. Clouds arch nemesis is working off of usernames and refs. As far as I can see and compare, this should be a pretty smooth transition on the cloud side.

b5 commented 3 years ago

@dustmop I've flipped the assignee to you, lmk if you want to work through deets!