metauni / metaboard

Multiplayer drawing boards for sharing knowledge in Roblox.
Mozilla Public License 2.0

Persistence format, backups, migration #73

Closed blinkybool closed 1 year ago

blinkybool commented 1 year ago

As metaboard evolves, it will sometimes be necessary to update the format in which the board state is stored in the datastore. We might deem it necessary to add more information, remove information, or just change the way that data is serialised before being stored, for memory or speed optimisation. We have discussed in the metauni Discord what the best method is for existing experiences to migrate to the new format while guarding against the danger of data loss.

It is my understanding that the conclusions of that discussion were

What follows are things I believe are under-discussed

Write new format immediately?

When an old format board is seen, should it be immediately overwritten with the new format (assuming it passes all checks)? Or only once it gets modified in-game? The latter is minimally-surprising, in that just starting up your world is a completely innocent operation. However there may be startup time disadvantages to having old-format boards to load in, so migrating everything at once might be preferable.

A good compromise might be to stick with the only-write-when-modified option, and provide an option in the backup/restore system for upgrading old formats to a new format.

Format naming/recognition

The method of format recognition is that from now on, the data stored at the key metaboard<persistId> should be a table that at least has a key _FormatVersion whose value is a string. This identifies which storage format should be used to read this board.

For metaboard v0, the data stored at the key is a string, not a table, and so this should be recognised separately as the Legacy format. This encompasses two previous format versions - the original one, and its update, which seems to be called "v2". They differ only in that v2 stores all of the curve data across multiple keys, whereas the original format could only store what could fit in the first key. They are identical otherwise, so these board formats are read by the same "format reader", which figures out which precise version it is as it goes.

So Legacy := (original | v2) : string.

Note: Persistence format versioning should not be confused with metaboard version. metaboard v0 made use of the original persistence format and persistence format v2, and metaboard v1 begins with persistence format "v3".
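The recognition rule above can be sketched as a small function. This is only an illustration, not the actual metaboard code: the function name detectFormat is hypothetical, and the only facts it relies on are that Legacy data is a string and that newer data is a table with a string _FormatVersion.

```lua
-- Hypothetical helper: decide which format reader should handle the
-- value stored at a board's key. Returns nil for unrecognised data.
local function detectFormat(data)
    if type(data) == "string" then
        -- Both the original format and "v2" are stored as a string.
        return "Legacy"
    elseif type(data) == "table" and type(data._FormatVersion) == "string" then
        -- From "v3" onwards the table names its own format.
        return data._FormatVersion
    end
    return nil
end
```

Returning nil for anything else (a table without _FormatVersion, or a missing key) lets the caller refuse to load, and therefore refuse to overwrite, data it does not understand.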

Forgetting data between formats

The legacy format contains some extra data which is not currently being carried over to format v3. There is no ClearCount because as far as I can tell, History boards haven't had enough usage to warrant keeping them around. So @dmurfet are you happy if we forget this data?

There is also an AuthorUserId stored with every curve in the legacy format. metaboard v1 currently pays no attention to who drew a particular line, and in a way, neither does metaboard v0. You will find curve:SetAttribute("AuthorUserId", userId) in a few places, but curve:GetAttribute("AuthorUserId") only shows up once when storing the curve with persistence.

I don't know exactly what this is necessary for. Is there value in knowing who drew what figure when looking at the stored board data?

Introducing new data in new formats

There is a need to store an AspectRatio field in the v3 format, so that you can properly view a board without having the model itself which it was drawn on, for example in a web viewer for boards. This wasn't previously stored in the Legacy formats. You might wonder "does this mean we have to choose a default value when converting Legacy to a newer version", but in fact there is no direct "Legacy -> v3" conversion process like that. What happens is the Legacy data is restored into a Board object, which already knows the aspect ratio, then later that Board object is stored in whatever the latest format is.

I mention this because my suggestion in the Write new format immediately? section was to have a direct format converter, and so this would either have a default aspect ratio that it writes (it's always been 4/3 in practice) or it wouldn't write an aspect ratio field at all (so maybe the web-viewer would have to know this 4/3 default).

I bring this up to bring attention to the scenario in general where new formats introduce more information that older formats don't store. Personally I think the best bet is to just not store that new data in the context of a direct converter, and whatever code makes use of that data should treat it as optional.
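As a rough sketch of what "treat it as optional" could look like on the reading side: the function name and the shape of boardData below are assumptions for illustration, and the 4/3 fallback is the in-practice value mentioned above.

```lua
-- Assumed fallback: boards have always been 4/3 in practice.
local DEFAULT_ASPECT_RATIO = 4 / 3

-- Hypothetical reader-side helper: use the stored AspectRatio when it is
-- present and sensible, otherwise fall back to the default.
local function getAspectRatio(boardData)
    local ratio = boardData.AspectRatio
    if type(ratio) == "number" and ratio > 0 then
        return ratio
    end
    return DEFAULT_ASPECT_RATIO
end
```

With this approach a direct converter never has to invent a value; only the consumer (for example a web viewer) decides what a missing field means.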

Manual backup/migrate method

When we are ready to convert all of The Rising Sea and its pockets to metaboard v1, we can make use of code like this to make a backup of all existing keys into a dedicated datastore.

local DataStoreService = game:GetService("DataStoreService")

local persistenceStore = DataStoreService:GetDataStore("metaboardv2.")

local targetDataStore = DataStoreService:GetDataStore("backup-26-07-22")

local listSuccess, result = pcall(function()
    return persistenceStore:ListKeysAsync()
end)
if listSuccess then
    local pages = result
    while true do
        local items = pages:GetCurrentPage()
        for _, v in ipairs(items) do
            local data = persistenceStore:GetAsync(v.KeyName)
            -- v is a DataStoreKey object, so write under its KeyName string.
            targetDataStore:SetAsync(v.KeyName, data)
        end
        if pages.IsFinished then
            break
        end
        pages:AdvanceToNextPageAsync()
    end
    print("Done")
else
    print("Failed", result)
end

In certain cases I've made offline backups of boards by the method of printing out the data onto the console and pasting it into a file.

Separate datastores vs prefix-keys

The current metaboard v1 code looks in a separate datastore when it's run in a pocket (when metaportal and a privateServerId are both present). This differs from the v0 behaviour, where the boards of a pocket exist in the same datastore as the main place, and are distinguished from the main place keys by prefixing them like this ps<placeId>-<pocketCounter>:metaboard<persistId> (as opposed to metaboard<persistId>).

Therefore updating metaboard in pockets will require a migration. I did this successfully for Moonlight Forest using the "manual backup method", by passing "ps10302055084-1" as the prefix argument to ListKeysAsync. I then wrote all of the entries to a datastore called "ps10302055084-1", with the prefix removed from each key.
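The key renaming step of that migration might look like the following. This is a sketch, not the script I actually ran: stripPocketPrefix is a hypothetical name, and the persistId 12 in the usage below is made up.

```lua
-- Hypothetical helper: turn a v0 pocket key such as
-- "ps<placeId>-<pocketCounter>:metaboard<persistId>" into the bare
-- "metaboard<persistId>" key used inside the pocket's own datastore.
-- Returns nil if the key does not belong to the given pocket.
local function stripPocketPrefix(key, pocketPrefix)
    local fullPrefix = pocketPrefix .. ":"
    if key:sub(1, #fullPrefix) == fullPrefix then
        return key:sub(#fullPrefix + 1)
    end
    return nil
end

local newKey = stripPocketPrefix("ps10302055084-1:metaboard12", "ps10302055084-1")
print(newKey)
```

Matching on the prefix plus the colon matters: a plain prefix match on "ps10302055084-1" would also accept keys from a pocket numbered 10 or higher, so those keys should be filtered out before copying.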

Backup/Restore Tool

I think all of these backup and key moving operations should be as user-facing as possible. It should not be a mystery where data is coming from, and it should not be hard to find it, move it, replace it, and back it up to different sources.

There are existing plugins for reading, importing and exporting to datastores. I paid for one called DataStore Editor by sleitnick, however it seems to crash when I look up a metaboard key with a very long string in it. It has import and export functionality, where exporting means it just dumps a ModuleScript somewhere with the contents return .... The import operation reads such a file and writes it to the datastore.

So I think we want mostly the same functionality, with maybe some additional menus for migrating all keys of one datastore into another, and for converting to newer format versions. Some method of creating offline backups that can exist outside of Roblox Studio would also be nice. Maybe just a ModuleScript saved as an rbxm file is sufficient.

I also think it would be valuable to be able to actually see the contents of the board in the plugin window as well. The worst part of the manual method is you do it and then think "hmm I hope that worked...", and "I hope I'm copying what I think I'm copying". This would be rather straightforward, since we can reuse existing metaboard code (FrameCanvas).

Something else important to note is that our backup and restore plugin would need to understand the association between top level keys and their "extra chunk" keys, for when a board is too large to store in a single key.

Here's a sketch of what I have in mind for the plugin. I haven't depicted the interface for copying data to another key/datastore, not sure yet how that would work.

(Image: metaboard-datastore-editor-sketch)

blinkybool commented 1 year ago

Just noticed something. There is no stored order (in the workspace or in the datastore) to the lines of a curve in metaboard v0. Every line is just called "Line", and so the order in which they are stored in the JSON list of a curve is purely whatever order :GetChildren() returned when serialising the curve.

You might think that this means that reading a legacy board into metaboard v1 could result in the lines being all messed up, since I am converting that collection of lines into the sequence of points that they join. However, since I am simultaneously detecting where the gaps in the lines appear for the mask table, i.e. "does the last line end where the next line begins?", I think it should always look equivalent, even if it gets the order of the points wrong. For example, if there are two lines that were drawn consecutively like (A) --- (B) --- (C), but you read the lines in order AB, XY, BC, then the points list will look like [A, B, X, Y, B, C], but the mask table will say that the lines BX and YB are erased.
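Here is a minimal sketch of that conversion, to make the AB, XY, BC example concrete. It is not the real metaboard v1 reader: the Start/Stop field names, the {x, y} point representation, and the choice that mask[i] = true means "the segment from point i to point i+1 is erased" are all assumptions for illustration.

```lua
-- Points are {x, y} pairs here; compare them by coordinates.
local function samePoint(p, q)
    return p[1] == q[1] and p[2] == q[2]
end

-- Convert an unordered list of lines into a point sequence plus a mask
-- marking the joining segments that were never actually drawn.
local function linesToPoints(lines)
    local points, mask = {}, {}
    for _, line in ipairs(lines) do
        local last = points[#points]
        if last == nil or not samePoint(last, line.Start) then
            if last ~= nil then
                -- Gap: the segment from the previous point to this
                -- line's start is not a real line, so mark it erased.
                mask[#points] = true
            end
            points[#points + 1] = line.Start
        end
        points[#points + 1] = line.Stop
    end
    return points, mask
end

-- The example from above: AB and BC are consecutive, XY is read in between.
local A, B, C = { 0, 0 }, { 1, 0 }, { 2, 0 }
local X, Y = { 5, 5 }, { 6, 5 }
local points, mask = linesToPoints({
    { Start = A, Stop = B },
    { Start = X, Stop = Y },
    { Start = B, Stop = C },
})
```

In this run points is [A, B, X, Y, B, C] and the mask marks segments 2 (B to X) and 4 (Y to B) as erased, so only AB, XY and BC are visible, whatever order the lines were read in.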

I will be testing loading in TRS (with saving disabled) to try and find loading bugs.

blinkybool commented 1 year ago

Closing this for now