sillsdev / harmony

C# CRDT Library for building offline first apps
MIT License

harmony

A CRDT application library for C#. Use it to build offline-first applications.

Install

dotnet add package SIL.Harmony

It's expected that you use Harmony with the .NET IoC container (IoC intro) and with EF Core. If you're not familiar with that, you can take a look at the Host docs. If you're using ASP.NET Core, you already have this set up for you.

Prerequisites:

Configure DbContext

EF Core needs to be told about the entities used by Harmony. For now these are just Commit, Snapshot, and ChangeEntity.

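// Assumes the Harmony config is injected as IOptions<CrdtConfig> (Microsoft.Extensions.Options).
public class AppDbContext(DbContextOptions<AppDbContext> options, IOptions<CrdtConfig> crdtConfig)
    : DbContext(options)
{
    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Registers the entities used by Harmony with the EF Core model.
        modelBuilder.UseCrdt(crdtConfig.Value);
    }
}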

[!TIP] SampleDbContext has a full example of how to set up the DbContext.

Register CRDT services

Harmony provides the DataModel class as the main way the application will interact with the CRDT model. You first need to register it with the IoC container.

var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddCrdtData<AppDbContext>(config => {});

[!NOTE] The config callback passed into AddCrdtData is currently empty. We'll come back to that later.

[!TIP] Pay attention to the generic type when calling AddCrdtData. It must be the type of your application's DbContext.

Define CRDT objects

Now that you have the services set up, you need to define a CRDT object. Take a look at the Word, Definition, and Example types in the Sample project for examples.

Once you have created your CRDT objects, you need to tell Harmony about them. Update the config callback passed into AddCrdtData:

services.AddCrdtData<SampleDbContext>(config =>
{
    // add the following lines
    config.ObjectTypeListBuilder
        .Add<Word>()
        .Add<Definition>()
        .Add<Example>();
});

Define CRDT Changes

Now that you've defined your objects, you need to define your changes. These record user intent when making changes to objects. How detailed and specific you make your changes will directly impact how changes get merged between clients and how often users 'lose' changes that they made.
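To see why granularity matters, here is a toy model in which changes from two clients are replayed in timestamp order. It is only a sketch and not Harmony's API: the `Word` record, `ToyChange`, and `Replay` are hypothetical names invented for this illustration, and plain integer timestamps stand in for real commit ordering.

```csharp
using System;
using System.Linq;

var start = new Word("helo", "");

// Coarse changes: each client overwrites the whole object with its own copy.
var coarse = new ToyChange[]
{
    new(1, _ => new Word("hello", "")),        // client A fixes the spelling
    new(2, _ => new Word("helo", "greeting")), // client B adds a note to its stale copy
};

// Granular changes: each client records only the field it meant to edit.
var granular = new ToyChange[]
{
    new(1, w => w with { Text = "hello" }),
    new(2, w => w with { Note = "greeting" }),
};

Console.WriteLine(Replay(start, coarse));   // Text is "helo" again: A's fix was lost
Console.WriteLine(Replay(start, granular)); // both the fix and the note survive

// Replays the changes from every client in timestamp order.
static Word Replay(Word start, ToyChange[] changes) =>
    changes.OrderBy(c => c.Timestamp)
           .Aggregate(start, (word, change) => change.Apply(word));

// Hypothetical types for this sketch only.
record Word(string Text, string Note);
record ToyChange(int Timestamp, Func<Word, Word> Apply);
```

In the coarse case client B never intended to touch the spelling, but its change overwrote it anyway. Recording only the intended edit avoids that kind of silent loss.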

Example SetWordTextChange

public class SetWordTextChange(Guid entityId, string text) : Change<Word>(entityId), ISelfNamedType<SetWordTextChange>
{
    public string Text { get; } = text;

    public override ValueTask<IObjectBase> NewEntity(Commit commit, ChangeContext context)
    {
        return new(new Word()
        {
            Id = EntityId,
            Text = Text
        });
    }

    public override ValueTask ApplyChange(Word entity, ChangeContext context)
    {
        entity.Text = Text;
        return ValueTask.CompletedTask;
    }
}

This is a fairly simple change. It either creates a new Word entry or, if the entityId passed in matches an object that was previously created, just sets the Text field on the Word entry matching that Id.

[!NOTE] Changes will be serialized and stored forever. Try to keep the amount of data stored as small as possible.

This change can either create or update an object. Most changes will probably be only an update or only a create. In those cases you should inherit from EditChange<T> or CreateChange<T>, respectively.

[!TIP] The Sample project contains a number of reference changes that are good examples of a few different change types. There is also a built-in DeleteChange<T>.

Once you have created your change types, you need to tell Harmony about them. Again, update the config callback passed into AddCrdtData:

services.AddCrdtData<SampleDbContext>(config =>
{
    // add the following line
    config.ChangeTypeListBuilder.Add<SetWordTextChange>();
    config.ObjectTypeListBuilder
        .Add<Word>()
        .Add<Definition>()
        .Add<Example>();
});

Use change objects to author changes to CRDT objects

Get an instance of DataModel, either via DI or directly from the IoC container, and call AddChange:

Guid clientId; // get a stable Guid representing the application instance
Guid objectId = Guid.NewGuid();
await dataModel.AddChange(
  clientId,
  new SetWordTextChange(objectId, "Hello World")
);
var word = await dataModel.GetLatest<Word>(objectId);
Console.WriteLine(word.Text);

[!IMPORTANT] The ClientId should be consistent for a project per computer/device. It is used to determine what changes should be synced between clients with the assumption that each client produces changes sequentially. So if a project is on 2 different computers, each copy should have a unique client Id. If they had the same Id, then they would not sync changes properly.

How the ClientId is stored is left up to the application. In FW Lite we created a table to store the ClientId. It's generated automatically when the project is downloaded or created the first time and it should never change after that.

In the case of an online web app, there could be one ClientId to represent the server. However, if users can author changes offline and sync them later, then each browser would need its own ClientId.

[!WARNING] If you were to regenerate the ClientId for each change or on application start, that would eventually result in poor sync performance, as the sync process checks for new changes to sync per ClientId.
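One possible way to keep a ClientId stable is to generate it once and persist it. The sketch below stores it in a small text file. That is an assumption made for illustration only (FW Lite uses a database table instead), and `GetOrCreateClientId` and the file location are hypothetical names, not part of Harmony.

```csharp
using System;
using System.IO;

// Hypothetical location; a real application decides where per-device project data lives.
var path = Path.Combine(Path.GetTempPath(), "harmony-demo", "client-id.txt");

var first = GetOrCreateClientId(path);
var second = GetOrCreateClientId(path);
Console.WriteLine(first == second); // prints True: the stored Guid is reused

// Reads the ClientId stored at `path`, or generates and stores one on first use.
static Guid GetOrCreateClientId(string path)
{
    if (File.Exists(path) && Guid.TryParse(File.ReadAllText(path).Trim(), out var existing))
        return existing;

    var created = Guid.NewGuid();
    Directory.CreateDirectory(Path.GetDirectoryName(Path.GetFullPath(path))!);
    File.WriteAllText(path, created.ToString());
    return created;
}
```

Whatever storage you choose, keep it local to the device so that copying the project to another computer does not also copy the ClientId.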

Usage

Queries

DataModel is the primary class for both making changes and getting data. Above you saw an example of making changes. Now we'll start querying data.

Query Word objects starting with the letter "A"

DataModel dataModel; //get from IoC, probably via DI
var wordsStartingWithA = await dataModel.GetLatestObjects<Word>()
    .Where(w => w.Text.StartsWith("A"))
    .ToArrayAsync();

Harmony uses EF Core queries under the covers; you can read more about them in the EF Core documentation.

Submitting Changes

Changes are the only way to modify CRDT data. Here's another example of a change:

DataModel dataModel;
Guid clientId; //get a stable Guid representing the application instance
var definitionId = Guid.NewGuid();
Guid wordId; //get the word Id this definition is related to.
await dataModel.AddChange(clientId, new NewDefinitionChange(definitionId)
        {
            WordId = wordId,
            Text = "Hello",
            PartOfSpeech = partOfSpeech,
            Order = order
        });

[!WARNING] You can modify data returned by EF Core and issue updates and inserts yourself, but those modifications will be lost and will not sync properly. Do not directly modify the tables produced by Harmony, otherwise you risk losing data.

Syncing data

Syncing is primarily done using the DataModel class; however, the implementation of the server side is left up to you. Lexbox provides one such implementation. The sync works with two instances of the ISyncable interface: the local one is implemented by DataModel, and the remote one depends on your server side. FW Lite provides an example of the remote implementation. You will need to scope the instance to the project as well as deal with authentication.

Once you have a remote representation of the ISyncable interface, you just call it like this:

DataModel dataModel;
ISyncable remoteModel;
await dataModel.SyncWith(remoteModel);

It's that easy. All the heavy lifting is done by the interface which is fairly simple to implement.
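As a rough mental model of what a sync has to exchange, the sketch below tracks the newest commit seen per ClientId and sends only what the other side is missing. It is a toy built on assumptions, not Harmony's implementation: `ToyCommit`, `Heads`, and `MissingFrom` are invented names, and a simple per-client counter stands in for whatever ordering Harmony actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var clientA = Guid.NewGuid();
var clientB = Guid.NewGuid();

var local = new List<ToyCommit> { new(clientA, 1, "add word"), new(clientA, 2, "fix spelling") };
var remote = new List<ToyCommit> { new(clientA, 1, "add word"), new(clientB, 1, "add definition") };

// Work out both directions before mutating either side.
var toRemote = MissingFrom(local, Heads(remote));
var toLocal = MissingFrom(remote, Heads(local));
remote.AddRange(toRemote);
local.AddRange(toLocal);

// prints "sent 1, received 1, local now has 3 commits"
Console.WriteLine($"sent {toRemote.Count}, received {toLocal.Count}, local now has {local.Count} commits");

// Highest counter seen per client: a compact summary of what a replica already has.
static Dictionary<Guid, long> Heads(IEnumerable<ToyCommit> commits) =>
    commits.GroupBy(c => c.ClientId)
           .ToDictionary(g => g.Key, g => g.Max(c => c.Counter));

// Commits that are newer than the other side's head for the same client.
static List<ToyCommit> MissingFrom(IEnumerable<ToyCommit> commits, Dictionary<Guid, long> otherHeads) =>
    commits.Where(c => c.Counter > otherHeads.GetValueOrDefault(c.ClientId)).ToList();

// Hypothetical commit shape for this sketch only.
record ToyCommit(Guid ClientId, long Counter, string Description);
```

In a model like this, every new ClientId adds another entry to the heads dictionary, which is one way to picture why regenerating the ClientId frequently would slow syncing down.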

Development

SemVer commit messages

NuGet package versions are calculated from a combination of tags and commit messages. First, the most recent Git tag matching the pattern v\d+.\d+.\d+ is located. If that is the commit being built, then that version number is used. If there have been any commits since then, the version number will be bumped by looking for one of the following patterns in the commit messages: