dirigeants / klasa

A class remix of the Komada Bot Framework
https://klasa.js.org
MIT License
201 stars 87 forks source link

Settings: Breaking Changes #425

Open UnseenFaith opened 6 years ago

UnseenFaith commented 6 years ago

In light of a few issues brought up to us, we're considering changing some interfaces for Settings. This would be a breaking change, but would make the interface more predictable/consistent with other Klasa practices (such as optional parameters going inside an optional object).

To minimize the pain of having multiple breaking PRs back to back, we're going to create this issue, along with the settings branch to streamline this process. All of the PRs will be squashed into this branch, and once everything has been approved, the branch will be merged into master.

If there is anything you want to see changed about settings, fixed, removed, etc., comment down on this issue below. If it's something worth making, we'll put it on the list of proposed features for the branch.

Once this PR is made and merged, Settings will be blocked from any breaking changes for a month. This doesn't include daily upkeep or bug fixes that might crop up in that time span.

Planned Features

Honorable Mentions (May or May not land)

Tylertron1998 commented 6 years ago

For settings to emit an error - be that a client.emit('error') or a 'settingsUpdateError` - for things like incorrect paths.

An option when calling Settings#update to stop on the first error - if you are updating 10 keys, all of which need to be together, and 2 of them fail, it would be nicer then having to undo the 8 keys and try again.

Skillz4Killz commented 6 years ago

Non-Cached Updates!

One of the biggest downsides to using SG is that it adds a caching layer to everything. If I have a table that I know is going to be massive and want to separate it without needing to cache it, I am forced to do direct to providers calls instead of using SG.

This really hurts the maintainability and consistency of the project as now i have half the project using SG and other half using REQL.

Possible Suggestion

.add('key', 'any', { array: true, configurable: false, cached: false });
UnseenFaith commented 6 years ago

@Skillz4Killz What kind of non-caching are you talking about? Your suggestion says you'd want to take a key away from a schema but what you write says to me you would like to not have the Settings instances cached.

bdistin commented 6 years ago

@Skillz4Killz I assume you are wanting to do something heavy like per guild moderation cases.

My question is, if you are storing data, you are going to want to query and edit that data, not just add to that data. (otherwise, why keep it) So why do you need a special method added to add to that data, separate of getting/manipulating existing data that you would have to do with the provider anyway?

From a broad perspective, Providers are the complete tool for that job and adding another tool to only do a fraction of what's needed doesn't really make sense when you still have to fall back to Providers for other needs that can't be met from Settings.

Skillz4Killz commented 6 years ago

@UnseenFaith @bdistin Yep, heavy guild moderation cases. The only difference I have understood between SG and Providers is that SG adds a cache layer. My idea was to possibly have a property in the schema that tells SG when updating to skip the caching.

Currently, I am forced to use Providers instead of SG.

this.client.providers.default.update('guilds', msg.guild.id, { array: updatedArray });

Instead of doing this because this will cache it

msg.guild.settings.update('array', updatedArray, { action: 'overwrite' });

This causes half my code to be in SG and half to be in providers.

@bdistin Yes, Providers are great but adding a single option to SG could remove all if not most of the Provider code and improve maintainability and consistency a lot. From my point of view, this decreases the number of tools required when working with Klasa since now I only need SG.

P.S. I am really not sure on the best way to do this so feel free to ignore my suggestion if it is a bad way to do it. Just hoping for a way to use SG without cache instead of providers.

kyranet commented 6 years ago

You can already achieve this by not adding that key itself to the schema. Same if you were using SQL, as SettingsGateway is never removing the columns from the database.

Adding that check would make the initial update take longer as it'll need to do an extra check for every key and for every entry, that's a very expensive operation.

Additionally, keys are stored in cache because we know their state from the database, it's most of the time, the same value. Having "uncached" keys would led to a very exotic behaviour that will only led to more issues. For example, we have an equality check algorithm that checks if any of the values have changed before we're updating the database.

The best is to not add them in the schema.

Stitch07 commented 6 years ago

+1 for caching layer. The extra caching layer adds to a lot of RAM usage for large bots, and the cache becomes redundant when Providers such as RethinkDB already implement caching on their end.

Tylertron1998 commented 6 years ago

Again: this is where you use providers directly.

Stitch07 commented 6 years ago

Nope, id prefer having settings without caching because settings handles all the resolving for me.

UnseenFaith commented 6 years ago

Honestly, it doesn't matter what you prefer in this scenario. We've already had a talk externally and internally about this issue and the fact is is that it doesn't make any sense to add this to Settings.

You're already going to have to use Provider.get(), Provider.delete(), to deal with this data if you're not caching it, so why add a option to disable caching in SG when you should realistically be using Provider.update() in the first place? Not to mention that it's going to slow down your updating because of all of the internal checks SG does, when you could simply have a quick and easy Provider.update() call.

TL;DR; You already are using Provider methods to deal with this data outside of SG (if no caching), so there's no reason to use SG to update when you're using Provider methods in the first place.

DevYukine commented 6 years ago

Issue with Fetching Users & updating Settings Currently there is no way to fetch an user and await the db sync at the same time so there might be a race condition since the db entry isn't fetched yet and i try to update it it won't and drop the update when the db entry arrives.

UnseenFaith commented 6 years ago

Considering changing the semantics around .get() on Settings objects.

Rather then doing Auto-Resolve, we could easily let .get() fill the role.

const roleObject = message.guild.settings.get('some.path.to.role'); // full object
const roleID = message.guild.settings.some.path.to.role; // raw data (id in this scenario)

One problem with this would be there would be no automatic fetching since deserializing is a sync operation. You can easily circumvent this by changing which methods you use (just like you normally do currently)

// If you expect you might need to fetch a user
const user = await fetch(message.guild.settings.user);

Give your thoughts or opinions if you would like to see this or don't want to see this.

bdistin commented 5 years ago

I would like to see a way to hook into the initialization of Settings objects. Currently, when you pseudo create the Settings instances in constructors there is really no way to know when that object has been updated with data from the database. (the update event is only when the settings have been updated cross-shard from what I can tell)

Perhaps something such as Settings#init(callback), which will call the callback when the data from the database has updated the Settings values. Open to other ideas of how to accomplish this, just providing an example solution with no thought about how it affects the bigger picture.

kyranet commented 5 years ago

To continue the SG rewrite and its abstraction increase, we should abstract the main id key: the "id length" is different between gateways and it's duct taped using Gateway#idLength, which is not a property of the core Gateway but rather a "magic" property that non-core gateways can use to increase the VARCHAR(size) in SQL databases: https://github.com/dirigeants/klasa-member-gateway/blob/4327c23473974e674c58086f06a590a5d6d7c2c4/src/lib/settings/MemberGateway.js#L53-L63

We should not only standarize this operation (make it abstracted, not hardcoded) by allowing users two things:

Additionally, we could implement native index operations in all gateways, almost all providers we support has support for compounded indexes. For example in MongoDB, RethinkDB, Postgres... The usage of compounded indexes would allow The klasa-member-gateway to be properly done.

NOTE: The GatewayDriver changes break all SQL providers, we need to do this now as it'll improve scalability greatly, else we'll have to wait a month to be able to do more changes.

cfanoulis commented 5 years ago

May sound crazy, but would a class be able to be stored in here?

cfanoulis commented 5 years ago

after some guidance by quantum, you no longer have to consider the above stupid comment

tech6hutch commented 5 years ago

Suggestion: rename the "bwd" directory to something better, e.g., "data" or "database".

kyranet commented 5 years ago

Suggestion v2.0: Make the JSON provider's output folder configurable like the ETF one

Lovinity commented 5 years ago

So I have several ideas for the SG as follows:

  1. Stabilize the SG when it comes to objects. The SG becomes very unstable and unpredictable when using objects.

    • When using a non-array type "any", often, updates to keys in the SG will reflect in the cache, but upon bot reboot, the state of that key reverts back to what it had previously; updates do not get properly saved. This could potentially be a Provider issue (I'm using rethinkdb) rather than an SG issue, but something to investigate.
    • When using an array type "any", adding objects to the array causes unexpected behavior. For instance, let's say the array is empty. I add {1: "stuff"}. This correctly gets added to key 0 of the array. But when I then add {2: "more stuff"}, what ends up happening is key 0 of the array remains the same, but key 1 ends up containing {1: "stuff", 2: "more stuff"} when it should only contain {2: "more stuff"} (since key 0 already has the other object). Furthermore, sometimes an array key will get overwritten, or dropped upon bot reboot.
    • Sometimes when using objects, upon bot reboot, key 0 of the object or array becomes a circular. This behavior was too random for me to be able to pinpoint steps to reproduce.
    • It is not easy / clear how to delete pieces of an object, or objects from an array, via Settings.update. And doing so is also unpredictable. Make a reliable standard for how one can call update to delete a key from an object, or an object from an array.
    • TL;DR: SG needs to be optimized so we can efficiently and reliably use objects in it without having to resort to creating a bunch of folders / folder keys to get stable operation (and folders / folder keys do not allow for dynamic schemas like objects do).
  2. Probably not going to happen, but bring back async SG; or at least allow a way for changing the schema on the fly. This made it convenient when I needed to manipulate the data structure of my SG without relying on unpredictable objects as mentioned in 1.

  3. Stabilize the SG type "any". I've noticed often the type will change itself, resulting in Settings.update calls to sometimes fail. Ensure that once one item is added, the key is type-strict to the type of that item until the key is reset or completely emptied (the exception to this is if array is true; in that case, no type-strict since each array value can be a different type).

  4. Perhaps add the ability to set callbacks on certain keys... code that runs on update when something is added or removed. For instance, let's say I add or remove a role to/from a guild key levelRoles. The callback can be called which would contain code to update everyone's level-based roles automatically whenever levelRoles is changed. That way, I don't have to re-use code or import utilities everywhere an update call to levelRoles is made.

kyranet commented 5 years ago

When using a non-array type "any", often, updates to keys in the SG will reflect in the cache, but upon bot reboot, the state of that key reverts back to what it had previously; updates do not get properly saved.

I highly suggest to give the settings branch a shot, it's a klasa-wide breaking change (specially if you use settings), that should fix the problems with the cache having data that is not in the database.

When using an array type "any", adding objects to the array causes unexpected behavior. For instance, let's say the array is empty. I add {1: "stuff"}. This correctly gets added to key 0 of the array. But when I then add {2: "more stuff"}, what ends up happening is key 0 of the array remains the same, but key 1 ends up containing {1: "stuff", 2: "more stuff"} when it should only contain {2: "more stuff"} (since key 0 already has the other object). Furthermore, sometimes an array key will get overwritten, or dropped upon bot reboot.

You're going to need to give me reproducible code for this, I can't reproduce this bug in either branches.

Sometimes when using objects, upon bot reboot, key 0 of the object or array becomes a circular. This behavior was too random for me to be able to pinpoint steps to reproduce.

Honestly your fault for giving SG circular objects, you know how badly it behaves. Only "fix" I know for this is to JSON.parse(JSON.stringify(data)) so it throws in circulars and serializes objects well. Again, the settings branch should throw in this case, without updating cache nor the database.

TL;DR: SG needs to be optimized so we can efficiently and reliably use objects in it without having to resort to creating a bunch of folders / folder keys to get stable operation (and folders / folder keys do not allow for dynamic schemas like objects do).

May I have your definition of optimization? In terms of performance it's pretty fast, in terms of functionality, it's been working for years.

Probably not going to happen, but bring back async SG; or at least allow a way for changing the schema on the fly. This made it convenient when I needed to manipulate the data structure of my SG without relying on unpredictable objects as mentioned in 1.

Not happening, SQL databases demand the columns to exist, and the schema adding is synchronous, not to mention it adds a huge CPU spike to update every single entry, and makes SG's internals more complicated.

Stabilize the SG type "any". I've noticed often the type will change itself, resulting in Settings.update calls to sometimes fail. Ensure that once one item is added, the key is type-strict to the type of that item until the key is reset or completely emptied (the exception to this is if array is true; in that case, no type-strict since each array value can be a different type).

As I said before, JSON.parse(JSON.stringify()), but the settings branch should throw in this case.

Perhaps add the ability to set callbacks on certain keys... code that runs on update when something is added or removed. For instance, let's say I add or remove a role to/from a guild key levelRoles. The callback can be called which would contain code to update everyone's level-based roles automatically whenever levelRoles is changed. That way, I don't have to re-use code or import utilities everywhere an update call to levelRoles is made.

What do you mean?

Skillz4Killz commented 5 years ago

@Lovinity Hopefully this can help you I had very similar issues with Array and Objects in SG. This should solve a lot of your issues. I think the reason this is frustrating and difficult is because it hasn't been documented like this.

This is how you remove an object inside an array. To replace an existing object when editing it just use edit the object and .update() the same way. If the object is the same it will remove if it is different it will replace.

// Get the exact object from inside the array
const object = settings.array.find(a => a.id === id);
// Get the exact index of that object from inside that array
const index = settings.array.findIndex(a => a.id === id);
// To update we do this where 
await settings.update('array', object, { arrayPosition: index });

Replace the entire array option

// Get the array
const editedArray = settings.array;
// Edit the array
editedArray.push({ id: x });
// Update the array
await settings.update('array', editedArray, { action: 'overwrite' });
tech6hutch commented 5 years ago

A minor improvement: just find the index, and you don't need to find the object then since you already have the index it's at.

tech6hutch commented 5 years ago

A map/collection type would be useful for that, however, so you don't need to use .find but could just .get(a.id)

Lovinity commented 5 years ago

@kyranet

I highly suggest to give the settings branch a shot

I'll check it out when I have a next moment to work more on my bot.

You're going to need to give me reproducible code for this

Noted. I'll compile the code together for you after my finals are complete in a couple weeks.

Honestly your fault for giving SG circular objects

Uh... I'm not giving SG circular objects... they're just randomly appearing sometimes on bot reboot?? Honestly developer-wise the JSON.stringify method is not intuitive; we shouldn't have to stringify objects passed as a parameter to an update function; your code should be able to handle objects. It would be more intuitive to fix this so objects can be passed as a parameter for adding/manipulating data.

May I have your definition of optimization?

I literally defined what I meant after mentioning SG needs to be optimized.

Not happening, SQL databases demand the columns to exist, and the schema adding is synchronous, not to mention it adds a huge CPU spike to update every single entry, and makes SG's internals more complicated.

I can understand for some databases/providers this is true, but for others like MongoDB, it's schema-less.

What do you mean?

Example:

Client.defaultGuildSchema
   .add('levelRoles', 'role', {array: true}, (data, action, result) => callbackStuff());

With a callback defined, every time an update is made to guild.settings.levelRoles, it will call that callback with the data that was provided, what action (add/remove/edit) the SG took, and the result (essentially the return of guild.settings.update containing errors and/or updated);

@Skillz4Killz Thank you! That's good info to know. It's unintuitive to have to do that (though I suppose when it comes to arrays, there isn't really much you can do other than array replacement), but at least it's something.

I really like @tech6hutch 's suggestion. Why not incorporate collections into SG?

tech6hutch commented 5 years ago

The last thing exists in the form of the settingsUpdate event

tech6hutch commented 5 years ago

A map/collection type could actually be done now. Serializing would involve turning it into an array of entries ([[k1, v1], [k2, v2], etc.]), and deserializing would take entries and pass them to new Map/Collection.

UnseenFaith commented 5 years ago

I can understand for some databases/providers this is true, but for others like MongoDB, it's schema-less.

The entire purpose of SG is to seamlessly integrate with each provider/database in the same exact way. That keeps the interface predictable and understandable. Changing it to the way you theorize would cause more headaches than actual good.

That being said, on my settings branch, you now have the opportunity to overwrite core gateways, so if something does not fit your taste, feel free to change it however you like.

kyranet commented 5 years ago

+1 here to remove the array option and abstract it to allow other data structures such as maps and sets. I believe @UnseenFaith had a prototype of this, but I not longer have the code (nor we could find it last time we brought it).

Many of us definitely need this, adding and removing tuples from the array structure is very hard, as Skillz mentioned here, and most of the times we have a dataset that's an array of tuples.

If we abstracted array to dataset, we'd be able to add elements like this:

await settings.update('map', ['key', 'value']);

As the Dataset piece will grab the key and value and do the operations afterwards. Then we'd be able to remove it like this:

await settings.update('map', 'key');

This is far much more convenient than an array of tuples, from Array#find being O(n) to Map#get being O(1), it gets us a drastical performance boost in our bot's runtime, where scalability matters. A usecase for the map dataset would be numerous, and the best example being per-guild tags and triggers.

This feature has been requested by some people, and by me for over a year, I even planned to implement this in SG someday, but it needs to be discussed more and try to find any problem with this proposal.


P.S: @Lovinity your callback idea has many flaws, plus you can use SchemaPiece.filter. SG doesn't need to be optimized, you're confusing terms here. Please try to get reproducible code for the bug you found, and please feel free to try out the settings branch to check if those bugs persist. (Just make sure to do this in a clean bot, the changeset is enough large to break any klasa bot).

Lovinity commented 5 years ago

@kyranet With all due respect, I am not confusing the term "optimize". Optimize literally means to make the best use of a given situation or resource. When I say optimize it for objects, I mean make the best use of those who need to store objects in the SG by addressing the abnormal behavior that occurs when using objects.

However, the above becomes completely void anyway if other datasets can be used in the SG, as is being proposed. Huge +1 for that.

How would filter do the same thing as callback? Filter is meant to be executed before the operation. I'm talking about a callback that happens after the operation. And what are the flaws you speak of? I'll get the code when I can. I have a job and also attend full time classes. I ask for your patience. I will need to set aside time to write up a piece of code that exploits the bug. I have already discarded the previous code I was using that exposed the bug (and I never committed it to github), so I no longer have that and will need to re-write it.

kyranet commented 5 years ago

Filter is run after the parsing (serializers), and before giving Settings greenlight to update. One of the flaws of the callback system is the lack of serializability and deserializability, not to mention the extra overhead, and that result is not available while updating as (at least in the settings branch):

Resolves the overloads -> Resolves all the paths to SchemaPiece instances -> Burst-parse all key-values -> Patch Settings instance -> Emit events -> Resolve.

The resolving (and when filter is called) is during the burst-parse, so that's not possible.

Skillz4Killz commented 5 years ago

@kyranet @UnseenFaith I think what would be really cool to keep a lot of consistency with SG and a lot easier is by having the ability to make dealing with arrays holding objects automated similar to other updates.

Editing a normal key:

await settings.update('key', value)

Editing an array with non-objects (automatically checks if it should add or remove)

await settings.update('key', value)

// If need guild objects just pass guild param. Still very simple
await settings.update('key', value, guild)

Editing an array with objects

// Get the exact object from inside the array
const object = settings.array.find(a => a.id === id);
// Get the exact index of that object from inside that array
const index = settings.array.findIndex(a => a.id === id);
// To update we do this where 
await settings.update('array', object, { arrayPosition: index });

There is a really massive difference and it gets really quite annoying to have to deal with this. I would love to see SG handle this better assuming its possible.

I would try and see if you could do this check inside the update function. Something similar to the below. Please note my way might not be the best way but it does show that it is possible for SG to handle this.

// Assuming this is the settings.update() function add a check if its an object
if (typeof value === 'object') {
    // If no { action: 'add' } is added so do this by default
    if (!options.action) {
        let needToAddValue = true;
        // Maybe even require that objects have `.id` properties
        const index = guild.settings[key].findIndex(a => a.x === value.x);
        if (index >= 0) needToAddValue = false;
        // To update we do this where 
        if (needToAddValue) {} // Add this object
        else {
            // If the objects look the exact same remove it
            if (value === guild.settings[key][index]) {} // remove this object
            // If the objects are different in any way even 1 property replace it
            else {} // Replace this object
        }
    }
}

This would allow every user to be able to update arrays with objects as simple updating anything else.

await settings.update('key', value, guild)
UnseenFaith commented 5 years ago

I already plan to have it that way when I start working on the internal Map/Set/Array support. Thanks for elaborating though.

kyranet commented 5 years ago

To make datasets much easier, I'm working on a rewrite for QueryBuilder so it's easier to read, understand, and extend, for the new elements we'll probably need to insert, in #545.

Also, all the progress in #521 will be deleted and re-done from scratch to rebase it with the latest changes, plus a new API @UnseenFaith and me have to talk about.

cfanoulis commented 5 years ago

Good morning, and my merriest wishes for a codeful 2019 🎄

It has been almost 3 weeks since this issue-"RFC" has had any activity, and I'd like to question the OPs (Faith & kyra), what's the status on this?

kyranet commented 5 years ago

It's paused due to christmas, #545 is undergoing but I believe @bdistin or @UnseenFaith blocked it due to them not liking the new API, though I find it much more intuitive, better, and more JS idiomatic.

gc commented 5 years ago

In reference to #614

Currently, arrays in SG act as sets and not a typical array, and implicitly assume that you don't want any duplicate items added to it, and that if you try to add an item thats already in it - that you want that item removed.

I'd propose either:

In the case no action is passed, (msg.author.settings.update("arrayKey", valueToAdd);), I think I'd prefer it throws an error if you try to update an array without telling SG what your intention for that value is.

kyranet commented 5 years ago

There's force: true if that's what you want, @gc, but I'm not sure it accomplishes what you want, I have to check though.

Skillz4Killz commented 5 years ago

@UnseenFaith make throwOnError enabled by default #708

image

kyranet commented 5 years ago

@UnseenFaith make throwOnError enabled by default #708

image

Fixed in #828.