Closed 1Computer1 closed 4 years ago
Based on the reactions to this issue, I can see that there is a lot of support for it. No one is yet to respond to it so I will be the first. I will express my opinion based on each of the headings you used.
Poor inheritance The structure of inheritance in a case like this should be consistent. As you pointed out, there are areas where the inheritance has been broken. I mean yeah, it works, but not int he most friendly way.
The design is not extensible There would definitely be situations where developers would want to implement their own caching system. Having a solid default cache system will probably be enough for most, but the ability to customize and extend it would be a great change.
Too much responsibility I can definitely agree that someone that does not know the exact workings of the DataStore methods or who hasn't read the documentation would most likely not know whether they interact with the cache, or with the API itself.
Separating the code that works with the cache and the code that works with the API would be a good change. This can go back to the consistency issue as well. Some DataStores overriding methods that they all share is probably not ideal. Separating them would ensure consistency, and probably also avoid some confusion.
The proposal to remove data stores Your proposed solution is solid, clear and consistent. It quite clearly separates the API methods from the cache methods, in a way that is easy to understand and work with in implementations of the classes. I personally can't think of any ways to improve on your solution right now, but I'm sure others could contribute some helpful ideas towards it.
Benefits All your points here are valid and of which I can agree with. Overall it ensures consistency in a clear way that separates API work from caching, and is more easily extensible and maintainable.
Drawbacks Although the drawback with the extra cache property might be a bit of a pain, I feel as though the benefits outweigh the drawbacks.
Other solutions To be honest, I don't really think there are any other better solutions than the main one you proposed. This is where input from others would be welcomed. The more input and ideas from others the better.
I believe this was shutdown this time last year. I commented on it as well but I think the idea of being able to abstract your caching mechanism would be great(dedicated redis instance).
I believe this was shutdown this time last year. I commented on it as well but I think the idea of being able to abstract your caching mechanism would be great(dedicated redis instance).
1409
This isn't really addressing external caching as such, it's focused on abstracting the cache and the API methods into separate internal structures.
Having said that, with that abstraction in place it may then be easier to extend the library and implement and external caching mechanism for anyone who wants to take on the challenge of doing so.
External caching would still be limited by the concerns expressed in #1409: Discord.js heavily relies on fast synchronous cache access and an external cache would mean that all internal cache access would become async. This would make almost every method in Discord.js become async (even if not using an external cache), severely hampering usability and potentially causing significant performance degradation.
The custom caches that are mentioned here would not be for completely changing the storage medium; rather, they would allow users to have complete control over what and how data is stored on the JS heap (LRU, different map/hashing strategy, etc.).
Would this affect the release date of v12 or is this change a post v12 change?
Where would be the point in releasing v12 with stores if we think that it shouldn't be handled like this? This would be an alternative approach to stores, so should definitely be included in v12 instead of stores
@almostSouji Just to be clear i really do love the idea of this change. I am not opposed to the change, just opposed to doing it before v12.
My concern is about how D.JS is not keeping a healthy cycle of updates for stable D.JS users. Almost 2+ years now v12 has not been released because new things keep being added in for v12. D.JS master has a lot of improvements that are never being merged into stable because v12 isn't released. This change should be done post v12 release in my opinion.
I really don't think it is good to delay v12 which is right around the corner, with only 2 things left on the to-do list. https://github.com/discordjs/discord.js/projects/4
The current solution whether flawed or not is working and heavily tested on a lot of bots already with little to no issues. I believe before breaking v12 release it should be heavily considered by the dev team to release v12 into stable and then make this change. V13 may or may not be here for another 2 years if it follows a similar release as v12. The current solution is good to last for the lifetime of v12 in my opinion. Especially since most of the benefits of this change are improving the maintenance of D.JS and not features/fixes to the framework itself.
Our not so "healthy" update-cycle solely is based on only 2 people having access to release versions to npm, and both are not active on this project anymore. Whenever they are, it's for a very very brief amount of time.
Anyway, the reason we do not want to put this in a later version is because the moment we release v12 we will have to maintain it, probably or maybe again for 2+ years. We already have so many things we regret and still have to maintain in v11, why would we repeat that?
It is definitely not a question of "it works, so why change it" in this case.
You are always going to have something you regret, but have to maintain. Otherwise, major versions wouldn't exist.
V12 hasn't been released because we aren't happy with certain things, so as Crawl pointed out, why release something poorly designed (as Compu pointed out), not because we keep on pushing changes
You are always going to have something you regret, but have to maintain. Otherwise, major versions wouldn't exist.
I didn't say this won't be the case, but the majority of devs that actually still work on this library are in favor of it, so 💁
This has been implement with #3696.
Read about it here if the issue format isn't suitable: https://gist.github.com/1Computer1/0f7b650549dc4974718a269f9d51f98e
What's wrong with the design of data stores?
Data stores as they are currently are hard to understand and fragile to maintain.
We cite three reasons for this: they do not inherit in a sound way from their superclasses, their design is not extensible, and they have too much responsibilty; managing both the cache and the API.
Poor inheritance
Data stores are not the same as collections. We can say that there is an is-relationship for some of the stores, but to say that there is one for all of the stores is incorrect. For example, look at
GuildMemberRoleStore
andGuildEmojiRoleStore
which do not actually store any data themselves but instead takes data from the guild's roles and delegates a_filtered
property to theCollection
methods. Not to mention, they don't even extendDataStore
itself! Suppose one day we decided to changeDataStore
in some way that affect the interface of these stores, but did not update the actual code of these pseudo-stores; it'll be a bad time.Furthermore about
Collection
, consider a data store, sayMessageStore
. Suppose we filtered using the inheritedfilter
method fromCollection
. Should this return a newMessageStore
or a newCollection
? The default behavior, sinceMessageStore
inherits fromCollection
which inherits from JavaScript'sMap
, is to give backMessageStore
. But currently, we have overriden this behavior to give back aCollection
, since after all, what happens if someone were to callMessageStore#fetch
on the filtered store? The cache is quite possibly invalid and could give back bad results. But we have just broken inheritance and that's no good!The design is not extensible
How about custom caches? Whether or not you want these, it is still important that something like it is both possible and easy to do. As of right now, this is impossible, or at the very least, very difficult and possibly hacky. This is because the cache is way too highly coupled with the rest of the code, making it difficult to inject a custom cache in. This will be one of the major point of the proposal, to separate the cache and the API.
Speaking of cache, its hard to do things without cache. Especially with partials being more commonly used, it would be very nice to be able to just switch off some caching to save memory and call the API directly, but still have a high-level API. We have some things like this, such as
GuildMemberStore#ban
, but having it be all available will be extremely useful.Too much responsibilty
Which methods work on the cache and which call the API? Which one does both? Without referencing the documentation, let me know:
delete
remove
removeAll
sweep
get
add
set
Do you think its unfair? Well, so do the people who have to use or maintain the library for the first time! Here's something funny too, in
GuildMemberRoleStore
andGuildEmojiRoleStore
, theset
method fromCollection
is completely overwritten with a completely different method that calls the API. Fun!This name conflict is something that has to be resovled through separation of cache and API as mentioned before. If we add more cache-less methods, it'll become very important that this is done.
To drive the point, what is the difference between a collection and a data store? Should
CategoryChannel#children
be a store? What aboutGuildChannel#members
? These things are both likeGuildMember#roles
andGuildEmoji#roles
in that they're do not actually cache their data and are instead wrappers around existing data. So what's stopping them from becoming a store? This points towards data stores having way too much responsibility that it manages to sneak into things they're not meant to.The proposal to remove data stores
This proposal is quite simple: separate the cache and the API.
To do that, we introduce the idea of a manager class. A mananger class will hold all of the API methods for a certain structure as well as the cache for that structure, if applicable. They will replace the various data stores in the library. So for example,
channel.messages
will be aMessageManager
, thenchannel.messages.cache
will be aCollection
ofMessage
. This will allow an interface that still reads nicely:channel.messages.fetch
channel.messages.delete
(we can do this instead of a silly synonym likeremove
)channel.messages.cache.get
channel.messages.cache.has
Internal code will interact with the cache directly. Methods originally on data stores like
resolve
,add
, etc. can continue to exist on the manager class. To be specific, the criteria to as to what goes on what would be:Collection
methods goes tocache
, nothing more, nothing less.resolve
knows the type of the cache's values,add
knows how to format API data to our structures, etc.For data stores which are not actually data stores, the proposed solution is to make them actually cache. To make it so that data isn't unnecessarily duplicated, we can have a backing field. For example,
GuildMemberRoleManager
would have the_roles
property normally onGuildMember
. Then, thecache
would be a getter to give a readonlyCollection
that uses this backing field. Doing so will allow much more consistency in the codebase and will future-proof us for API changes. This can extend to other miscellaneous collections-but-could-be-data-stores likeChannel#children
orGuildChannel#members
.Special note:
ChannelStore
is an LRU cache. We can simply add a newLRUCollection
class. This is an example of a "custom cache" that this proposal enables.Benefits
There will not be poor inheritance in this proposal, because there won't be inheritance at all! This proposal explicitly is about composition over inheritance.
This proposal is extensible. In the future, we can imagine custom caching via dependency injection. Not only is the cache injectable, but so are the managers themselves! This means we can swap out cache, but also (separately) the API methods, allowing for fine-grain control of how things are done. The control of cache is very important to allow people to bring in new tech or just lower their memory usage. Plus, as mentioned, more methods that do not rely on cache means partials will play better.
And of course, the separation of cache and API makes it easier to reason and maintain the code. The managers have less responsibility to worry about and won't get in a mess with the cache. Not only that, we can add new methods on top without worrying as much about running out of reasonable synonyms for entirely different things.
Drawbacks
An extra property to chain with
cache
. This is a breaking change on a larger magnitude than data stores themselves, since before no one had to addcache
on top to usehas
orget
.There will be confusion for when to use
cache
and when to not. We can introduce this as cache methods vs API methods, which should be understandable for most. If every collection of structures, so things likeCategoryChannel#children
became managers, it should be more obvious going forward: always usecache
when you want access the collection.There can still be some ambiguous names inside the manager classes, such as
add
orcreate
or similar. However, it'll be clear that these methods interact with the API and not the cache, so there's still more separation than before.Other solutions
Revert to pre-data stores
This is not feasible. While it does solve problems #1 and #2, the namespace is mixed together. Not only that, the things modifying the cache and cache itself are too far away. Plus, the user API is not as fluent, which was one of the reasons we switched to data stores in the first place.
For those unaware, pre-data stores looked like
channel.fetchMessage
instead ofchannel.messages.fetch
andguild.ban
instead ofguild.members.ban
.Separate, but put the cache elsewhere
This will solve #3, but now the design is more tightly coupled. The cache does not rely on the manager at all, but the manager relies on the cache. This naturally leads to the cache being a member of the manager. If we put it elsewhere, there will be more circular references. Even worse, if we specifically make the manager a member of the cache, not only will we get the previous problem, we'll run into #1 again too.
Try to fix data stores?
There might be a way to fix data stores as it is. For "pseudo-caches" like
GuildMemberRoleStore
, we could turn them into proper data stores as proposed. While this solves the first half of #1, everything else is still a problem. The only way to fix everything else is to separate.We talked a little bit in the Discord server already but putting it here would be better for discussion.