Closed: balmas closed this issue 4 years ago
@kirlat and @irina060981 please take a look at this and let me know your thoughts and questions. There are a fair number of assumptions in here about the different pieces of the architecture for working with user data, authenticated services, etc., which likely also need some discussion.
Hello, Bridget and Kirill! I have several questions and thoughts on the described workflow:
1) Where should the WordListComponent be placed? It could go inside the components repo or into an external library (similar to what was done for the inflection games). If separated, it could be united in one repo with the WordListController and communicate by subscribing to and publishing events.
2) Would userDataQuery be part of the UIController, or would it be a separate library with its own UserDataController? If separated, it could easily be used in any other repo without having to import the components repo.
3) And the same question about DBSyncController.
4) About synchronizing data between the local IndexedDB and the remote server, I have the following thoughts. According to the documentation, IndexedDB adheres to a same-origin policy and is specific to each browser. (https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API/Basic_Concepts_Behind_IndexedDB)
So we will have several IndexedDB instances: one for each page (text) it is used on, for each browser, and for each environment. The only thing that could connect all of them is the user's identification data. Also, since IndexedDB has storage limits, we should treat the remote server's data as the higher-priority source. And the same text could be opened in different tabs of the same browser, so IndexedDB would have to queue requests from different tabs too.
On the other hand, we should take advantage of having locally saved data. So I suggest using two ways of updating data on the screen (chosen by the user) -
5) About the protected ClientAdapter - I could be wrong here, but maybe it is enough to add an additional fetch variant with extra security keys to the ClientAdapters library?
6) And about the RemoteDataStore - what would it be? Since IndexedDB is an object database with no SQL, maybe it is better to have a NoSQL solution on the remote server - like MongoDB?
Thanks for laying this down! There are several very important decisions we have to make that, I believe, will define how successful our development will be in the future. Because of this, I would like to offer a discussion of several architectural issues that we face. Our architecture will depend on the decisions we make on these (and other) issues.
A. Working with both the webextension and the Safari app extension code, I've got a good sense of how difficult and time consuming it can be to support two codebases that do pretty much the same thing but use different technology stacks, even if the difference is not so significant (a different background code). So I think our goal should be to minimize re-implementation of similar code. We are using different architectural solutions: the embedded lib is client-side code; the webextension is client-side code in an isolated environment plus a protected background script; the Safari app extension is the same as the webextension except that instead of a background script we have an app extension written in Swift; and the PWA is similar to the webextension in a way. We should try to have one piece of authentication/authorization code that works for all clients (if possible, because there are some challenges here). JS seems best for this, as it's a common denominator for all our clients.
B. If we want to be successful on mobile, we need to think about how to minimize data throughput. Mobile data is still slow, I believe, even in developed countries. So we should:
B1. Minimize the number of outside requests by combining several into one, if possible. This avoids the concurrent request limit (https://stackoverflow.com/questions/7456325/get-number-of-concurrent-requests-by-browser). It will not be an issue with SPDY/HTTP 2.0, but is still worth considering.
B2. Minimize the amount of data passed over the network. If we issue a request to a resource that produces a large response and then use just 10% of the info returned, we will slow ourselves down a lot. Ideally, we should allow the user to specify in the request exactly what information he or she needs and receive only that data back (a la GraphQL or REST with parameters).
B3. Once we start to record user data, that will multiply the amount of information we store and transfer tremendously. We should use compact data representations whenever possible, maybe in the form of gzip or BJSON.
B4. The amount of user data stored can become huge over time. Retrieving it all on the first sync could be very slow. We should split it into chunks, retrieving the most recent data first and showing it to the client, then obtaining the missing chunks quietly in the background. We should consider mechanisms for splitting and combining data in our apps and doing partial data updates.
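The chunked retrieval in B4 can be sketched roughly as follows. This is an illustrative sketch only: the function name, item shapes, and chunk size are assumptions, not an existing API.

```javascript
// Sketch of the B4 idea: split a (most-recent-first) word list into chunks,
// show the first chunk immediately, and fetch the rest in the background.
function splitIntoChunks (items, chunkSize) {
  const chunks = []
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize))
  }
  return chunks
}

// Items are assumed to be sorted most-recent-first already
const wordList = [
  { target: 'lupus', updated: 5 },
  { target: 'canis', updated: 4 },
  { target: 'equus', updated: 3 },
  { target: 'ursus', updated: 2 },
  { target: 'cervus', updated: 1 }
]

const chunks = splitIntoChunks(wordList, 2)
const firstScreen = chunks[0]       // shown to the user right away
const background = chunks.slice(1)  // retrieved quietly afterwards
```

The same split points could also drive partial updates: only chunks whose contents changed since the last sync would need to be re-transferred.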
C. Different security environments. The webextension, Safari app extension, and PWA can be considered trusted (in a way), but the embedded lib cannot. The library runs in the same environment as other scripts loaded by the page, and any malicious script (loaded by the page unintentionally or injected into it) can get access to all data of the embedded library (please correct me if I'm wrong here). This means we cannot trust the embedded lib to store any secrets. It also means a different security architecture for the library than for the rest of the apps. Unfortunately, this contradicts (A). We should really think about possible solutions here.
Considering all that, it would be tempting to shift some of our current logic to the server side. Currently, during a lexical query we retrieve the lemma from one source, lemma translations from another, and then execute several definition requests. If we shifted this to the server we could:
Of course, there are several obvious drawbacks to that solution.
I'm not sure what the best solution would be, but it's very tempting, I think, to move business logic somewhere where it can be implemented once and work for (be shared by) all clients (preferably in a protected environment), and where it can also be updated without having to update each client implementation. Maybe there are solutions other than server-side ones that could help us implement this? Maybe we can use a service worker to host all our business and authorization logic? It seems to be supported by Chrome, FF, and Safari to different extents. It's JS, and we can share it between all clients. But I am not 100% sure what limitations we would face there.
What are your thoughts on this? I think we should define our general strategy before going into details. Thanks!
I agree with @irina060981 that we could probably add an authorization token as a parameter of client adapter queries (or add authorization info to queries in some other way). That should be relatively simple. If we want to make it optional, we can probably use a mixin for the authorization logic.
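The optional mixin idea could look roughly like this. All names here (BaseAdapter, AuthMixin, fetchOptions) are hypothetical, not the actual ClientAdapters API; this is just a sketch of how authorization could be layered on without touching unauthenticated adapters.

```javascript
// Hypothetical base adapter: builds the options object it would pass to fetch()
class BaseAdapter {
  constructor (url) { this.url = url }
  fetchOptions () {
    return { method: 'GET', headers: { Accept: 'application/json' } }
  }
}

// A mixin that decorates fetchOptions() with a bearer token.
// Adapters that don't need auth simply don't apply it.
const AuthMixin = (Base) => class extends Base {
  constructor (url, token) {
    super(url)
    this.token = token
  }
  fetchOptions () {
    const options = super.fetchOptions()
    options.headers.Authorization = `Bearer ${this.token}`
    return options
  }
}

class UserDataAdapter extends AuthMixin(BaseAdapter) {}

const plain = new BaseAdapter('https://example.org/morph')
const authed = new UserDataAdapter('https://example.org/wordlist', 'test-token')
```

This keeps the authorization logic in one place while leaving the existing adapter classes untouched.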
Regarding IndexedDB and the same-origin policy, we could probably put the database into a service worker or a background script (if service worker functionality is adequate for us, that's the best option because it seems we can use it in Safari). So in a content script or the embedded lib, when we need data, we send a message (a DOM event, probably, as a universal solution) to the SW (i.e. service worker); the SW will query IndexedDB, and the remote server if necessary, and respond with the data in a response message. This way all data across different tabs will be shared and in sync.
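The request/response protocol described above could be sketched transport-agnostically like this. The message shapes and class names are assumptions; the in-memory stores stand in for IndexedDB and the remote server, and the handler is written synchronously for brevity (a real implementation would be async and wired to the worker's message event).

```javascript
// Worker-side handler: serves word items from the local cache,
// falling back to the remote store and caching the result.
class DataRequestHandler {
  constructor (localStore, remoteStore) {
    this.local = localStore   // stands in for IndexedDB
    this.remote = remoteStore // stands in for the remote user data store
  }
  handle (message) {
    if (message.request === 'getWordItem') {
      let item = this.local.get(message.word)
      let source = 'localDb'
      if (!item) {
        item = this.remote.get(message.word)
        source = 'remoteDb'
        if (item) this.local.put(message.word, item) // cache for next time
      }
      return { response: 'wordItem', word: message.word, item, source }
    }
    return { response: 'error', error: `unknown request ${message.request}` }
  }
}

// In-memory stand-ins for the two stores
const makeStore = (data = new Map()) => ({
  get: (k) => data.get(k),
  put: (k, v) => data.set(k, v)
})

const local = makeStore()
const remote = makeStore(new Map([['lupus', { target: 'lupus' }]]))
const handler = new DataRequestHandler(local, remote)

const first = handler.handle({ request: 'getWordItem', word: 'lupus' })
const second = handler.handle({ request: 'getWordItem', word: 'lupus' })
```

Because every tab talks to the same handler instance inside the worker, the cache and the queueing of concurrent requests are centralized automatically.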
I'm for moving as much business logic out of the UI Controller as possible. I think the role of a UI Controller should be to coordinate UI elements only and provide their interactions. Business logic should live somewhere else. I like the concept of Queries, so we can probably use data queries the same way we use lexical queries. If that is not enough, we can introduce a specialized data controller.
Service workers could be ideal for providing a first level of caching for the apps. I can't find any clear info about whether we can use them in webextensions or not. But there also seems to be no info about it being prohibited either...
Regarding IndexedDB and the same-origin policy, we could probably put the database into a service worker or a background script (if service worker functionality is adequate for us, that's the best option because it seems we can use it in Safari). So in a content script or the embedded lib, when we need data, we send a message (a DOM event, probably, as a universal solution) to the SW (i.e. service worker); the SW will query IndexedDB, and the remote server if necessary, and respond with the data in a response message.
Yes, I think we would be able to work with the IndexedDB in the background script, so that all content stored by the background script is in the same db, regardless of what page the user was on when they were using the extension. And for the PWA we would use a service worker. (We may be able to use service workers in webextensions, but it's not really clear to me. I do know that Google has stated one of the goals for the webextension manifest v3 as "Modernizing to align with new web capabilities, such as supporting Service Workers as a new type of background process".)
I need to think about the authentication issues with the embedded library vs the PWA. As I articulated in the release scope comments in Slack, for user data storage I am leaning towards using the AWS Serverless Stack, which includes AWS API Gateway, AWS Lambda, and either AWS DynamoDB or S3 or both. We would use OAuth2 and Auth0's API authorization flows with JWT to protect access to the AWS API Gateway for user data storage/retrieval.
I believe I want to stick with a microservices approach and client-side authentication. Some links that might be helpful here:
https://yos.io/2017/09/03/serverless-authentication-with-jwt/
https://auth0.com/blog/building-serverless-apps-with-aws-lambda/
https://blog.codecentric.de/en/2018/04/aws-lambda-authorizer/
https://serverless.com/blog/strategies-implementing-user-authentication-serverless-applications/
https://medium.com/@gauravve/service-to-service-authentication-using-auth0-and-serverless-framework-825c45852dbe
Where should the WordListComponent be placed? It could go inside the components repo or into an external library (similar to what was done for the inflection games). If separated, it could be united in one repo with the WordListController and communicate by subscribing to and publishing events.
I was thinking that the WordListController and WordListComponent (as well as a WordListItem Component) would go in the components repository. WordList and WordListItem would be data model objects in the data-models repository. We could start development with them in a separate repository, but per our refactoring goals, we are trying to reduce the number of dependencies. Plus I think for any Alpheios application the wordlist is a core component.
Would userDataQuery be part of the UIController, or would it be a separate library with its own UserDataController? If separated, it could easily be used in any other repo without having to import the components repo. And the same question about DBSyncController.
Certainly UserDataQuery and DBSyncController are separate from UIController. Whether they belong in core components is a little less clear to me. It depends in part, I think, on whether we can make this functionality available to the embedded library in a secure way or not.
The questions about combining server requests and optimizing data syncing all require a little more thought. Will try to respond further on these soon.
Hello, Bridget and Kirill! My thoughts:
1) I am not very experienced with Service Workers, but as far as I know they can only be used in an https environment, and Safari doesn't support Service Workers yet ( source ). So it seems to me that they can't be used in the webextension (as pages could be both http and https), and the same goes for the embedded lib. But for the PWA (not Safari) they could easily be used. So maybe using Service Workers is not yet a solution that allows us to reduce duplicated code across platforms.
2) About mobile support - I think there is one more complex problem here (beyond heavy network traffic and a large amount of cached data). Mobile browsers support fewer capabilities, and there are many more variations among them. It seems to me that the effort of creating a client-server solution could be comparable to creating mobile applications for working with it.
The client-side solution (as we have now) has some advantages for desktop usage: 1) it is less dependent on the number of simultaneous users (because calculations are made in the browser, not on a server) 2) it allows for an offline mode
Maybe it would be useful to create a light version for mobile (with a special flag for mobile) - because if someone tries to use it over a poor connection, they could choose to get only morph data (for example) and use it normally?
To be honest, in my practice I have had a lot of experience with classic client-server architecture (like Kirill suggested) and thought it was the only good way. I have also had experience with server overload problems and constantly growing upgrade costs. So at first, when I saw this client-side implementation, I was surprised, but now I can see the advantages of this approach. It seems to me that if the Alpheios extension were used in the study process and, for example, a whole class started using it at the same time, it wouldn't be easy on the server. But maybe I just have bad experience with poor servers :)
Irina, agree with everything you said!
There is no ideal solution here, and every approach will probably have its advantages and drawbacks. I like the client-based architecture better for what we do (as, I understand, you do 🙂), but I see some potential issues with it that we may face later. So I thought that if we discuss them now, we can probably find some approaches to make them more bearable. Even if there is no solution, we would still keep those issues in mind while writing our code, and that will help us create better code, I believe. Once we are fully aware of the problems, we can try to minimize their consequences.
From what I've learned so far, we would probably still have to manage at least three versions of the authentication code (webextension+PWA / Safari / embedded lib) (sigh 😞)
I love our discussions and consider them a very important part of our workflow :) About the authentication process - I don't have much experience here with the webextension.
But I think the approach with JWT tokens (thank you, Bridget, for the links - I have used tokens before but had never read as clear a description as in the first link) could be very helpful.
On the security side, there is a not-so-fresh article about security questions. But it could be helpful - it suggests using the Chrome Identity API. If I understood the basics of this technology correctly, it could be helpful both in the webextension (not sure about Safari) and the embedded lib.
From the article:
The Chrome API provides a chrome.identity service, which offers a secure way for an extension to authenticate and to fetch and refresh tokens. This API enables a user to perform authentication against a third-party service. Chrome can interactively display a popup UI, which:
- Can store cookies and session information
- Is protected against any script injection, even by other Chrome extensions
Each Chrome extension has its own chrome.identity instance, which is only accessible by the Chrome extension owning that instance. This makes the token private, even from other malicious Chrome extensions.
Do you have experience with it? If it works well, it seems to me that we could create a library for the authentication workflow that won't be very different across webextension+PWA/Safari/embedded lib. What do you think?
I am using the chrome.identity API in the prototype of the authentication functionality. It's working well so far, and it's probably the only browser-native API solution available. It has a browser namespace counterpart which I hope will work well in FF.
But the Identity API works only in background-related pages, not in client-side scripts. So it's not an option for the embedded lib (which has to use a different authentication workflow anyway; more on that later). And for Safari it's a no-go too 🙁.
However, there is more to it: the encryption libraries that we use to generate items of our requests (like random byte array generators and hash functions). Those libraries tend to be environment-specific too 😢.
Some refs: since we'll be using Auth0, here are some pieces of Auth0 documentation:
For the webextension (all browsers) and the PWA we'll probably use what is called the "Authorization Code Grant Flow with PKCE": https://auth0.com/docs/api-auth/grant/authorization-code-pkce
For the embedded lib (since it cannot be trusted and we cannot store secrets in a client-side script) the best choice is the "Implicit Grant Flow": https://auth0.com/docs/api-auth/grant/implicit
And the authentication/authorization code in Safari probably has to live within the app extension, which means a different codebase.
Thank you, Kirill, for the explanations! I think it is the JavaScript world, with all its advantages and disadvantages :-)
For the macOS application - I think it needs this: https://github.com/auth0/Auth0.swift
And it is a new challenge in making the Safari App Extension the next state of the art, I think. 🙂
Some additional thoughts based upon our discussion at today's check-in:
Whether or not it ends up being possible to avoid cross-domain restrictions on IndexedDB for the webextension, we know we will have cross-domain restrictions for the embedded library and reader applications. So the design has to take that into account.
Since we want to support a single user account across multiple applications (webextension, mobile reader, etc.), the remote user data store is the location which will be the authoritative source of the user data.
The IndexedDB can be used as a local cache to support fast and offline access, but it will always need to be updated from the remote user data store in order to provide a fully up-to-date view of the user's data.
We must have an API that protects us from needing to duplicate the business logic around retrieving remote data and merging it with the local IndexedDB. Any client-side feature, such as a word list, should not need to know the details of where the data is coming from. This is the point of the DBSyncController in the proposed design above.
While we can store entire complete Homonym (or other Alpheios data-model) objects in the user data stores (both remote and local), and may decide to do so in some cases for performance reasons or to support offline access, the main purpose of the user data store is to store information that is unique to an individual user's experience with the Alpheios applications. We probably do not want to be duplicating data that comes from our remote services across each and every user data store, of which, in the case of the local IndexedDB, there could be multiple - one for each domain the user visits.
Storing the data in structures that can be directly serialized to/from the Alpheios Data Model objects is appealing, but if we do this we need a way to easily identify the state of that data model object and whether or not it can or needs to be filled in with data from remote services.
It might also be that the persistent structure of a user data object is a subset of what is stored in the local IndexedDB. The DBSyncController might be responsible for deciding which properties of an Alpheios Data Model object to populate from where.
The DBSyncController could then also implement ClientAdapter interfaces so that it can be used as a source for LexicalQuery data.
For example, I could see a scenario like the following:
With a fresh start:
1) WordListController requests the WordList from DbSyncController
2) DbSyncController retrieves the WordList from the RemoteDb
3) DbSyncController stores the WordList to the LocalDb
At this point, the WordList data in the LocalDB is identical to that in the RemoteDB
Then, the user clicks on a Word on the Wordlist, and the UI initiates a LexicalQuery
1) LexicalQuery asks DbSyncController for the word
2) DBSyncController finds the word in the LocalDB WordList data and returns the Homonym
3) LexicalQuery checks the Homonym's isComplete flag. It is false, so LexicalQuery continues to proceed with the query as normal
Upon completion of the LexicalQuery
1) WordListController updates the WordListItem with the full Homonym
2) WordListController calls DBSyncController.updateData(WordListItem)
3) DBSyncController sends the WordListItem with the partial Homonym to the RemoteDB
4) DBSyncController sends the WordListItem with the full Homonym to the LocalDB
Later, the user clicks on a word which is in the WordList and which already has a full Homonym stored for it in the LocalDB, and the UI initiates a LexicalQuery
1) LexicalQuery asks DbSyncController for the word
2) DBSyncController finds the word in the LocalDB WordList data and returns the Homonym
3) LexicalQuery checks the Homonym's isComplete flag. It is true, so LexicalQuery issues the various events to indicate that the Homonym is available.
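The two lookup scenarios above can be condensed into a small sketch. The class and method names mirror the proposal but are illustrative, and an in-memory Map stands in for the LocalDB; the "remote services" step is faked by constructing a complete Homonym directly.

```javascript
// Minimal Homonym with the proposed isComplete flag
class Homonym {
  constructor (word, isComplete) {
    this.word = word
    this.isComplete = isComplete
  }
}

// Sketch of DbSyncController's local-store role
class DbSyncController {
  constructor () { this.localDb = new Map() }
  store (homonym) { this.localDb.set(homonym.word, homonym) }
  getHomonym (word) { return this.localDb.get(word) }
}

// Sketch of the LexicalQuery decision: serve complete data locally,
// otherwise fall through to the normal remote query and write back
class LexicalQuery {
  constructor (syncController) { this.sync = syncController }
  lookup (word) {
    const homonym = this.sync.getHomonym(word)
    if (homonym && homonym.isComplete) {
      return { homonym, source: 'localDb' } // just fire the UI events
    }
    const full = new Homonym(word, true)    // stands in for remote results
    this.sync.store(full)                   // write-back to the local cache
    return { homonym: full, source: 'remote services' }
  }
}

const sync = new DbSyncController()
sync.store(new Homonym('lupus', false)) // partial item from the word list
const query = new LexicalQuery(sync)
const first = query.lookup('lupus')     // incomplete -> remote services
const second = query.lookup('lupus')    // now complete -> localDb
```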
Although the need to support versioning of service results is probably a lower priority, we could add additional business logic into both the DBSyncController and the LexicalQuery to check version flags on a Homonym's component parts against service output, to find out if the local store needs to be updated. But if the local IndexedDB is understood to be temporary, incomplete storage, and the RemoteDB doesn't store full Homonym data, then this is maybe less of a concern.
@kirlat and @irina060981 does this make sense to both of you? What potential pitfalls do you see in it?
as an alternative/addendum to this statement:
The DBSyncController might be responsible for deciding which properties of an Alpheios Data Model object to populate from where.
I could see the code getting messy if the DBSyncController has to know too much about the individual data model objects. So an alternative might be to have objects which are candidates for remote storage implement a toPersistentJSON method, or the like, which could be used to create the minimal version for remote storage and set an inComplete flag on those components of it which are not full representations.
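The toPersistentJSON idea could look something like this sketch, where each storable object decides its own minimal remote representation so that the sync controller stays ignorant of data model internals. The field names and the shape of the Homonym data are assumptions for illustration only.

```javascript
// Hypothetical WordListItem with a bulky, service-derived homonym attached
class WordListItem {
  constructor (target, important, homonym) {
    this.target = target
    this.important = important
    this.homonym = homonym
  }
  // Minimal version for remote storage: drop the service-derived bulk
  // and flag the stripped part so readers know it must be refilled
  toPersistentJSON () {
    return {
      target: this.target,
      important: this.important,
      homonym: { lemma: this.homonym.lemma, isComplete: false }
    }
  }
}

const item = new WordListItem('lupus', true, {
  lemma: 'lupus',
  isComplete: true,
  definitions: ['wolf'] // stands in for full service output
})
const persistent = item.toPersistentJSON()
```

The sync controller would then just call toPersistentJSON on anything headed for the remote store, without any per-class conditional logic.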
Another thing to think about:
All user data objects should probably be versioned themselves, so that we can deal gracefully with future data structure changes. E.g., if need be, we can quickly differentiate between a WordListItem version 1.0 and a WordListItem version 2.0 without having to examine the data structure.
All user data objects should probably be versioned themselves, so that we can deal gracefully with future data structure changes. E.g., if need be, we can quickly differentiate between a WordListItem version 1.0 and a WordListItem version 2.0 without having to examine the data structure.
👍 for data versioning. It might also be beneficial to version the REST API of the remote services. If we use GraphQL we won't need this, as they suggest introducing new fields as the preferred way of versioning: https://graphql.org/learn/best-practices/#versioning.
For versioning of JS objects such as WordListItem, it would probably be cleaner to integrate version info into their class names (i.e. have a separate class for each new version) rather than having a version field inside a class and some conditional logic in methods that rely on the version field's value. The latter can easily become convoluted. What do you think?
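The class-per-version option could be sketched like this: one class per version, with the version check confined to a single factory that picks the right class from serialized data. The class shapes and the factory name are invented for illustration.

```javascript
// One class per data version; no conditional logic inside methods
class WordListItemV1 {
  static get VERSION () { return '1.0' }
  constructor ({ target }) { this.target = target }
}

class WordListItemV2 {
  static get VERSION () { return '2.0' }
  constructor ({ target, important = false }) {
    this.target = target
    this.important = important // field added in 2.0
  }
}

// The only place that ever inspects the stored version value
const WordListItemFactory = {
  classes: new Map([
    [WordListItemV1.VERSION, WordListItemV1],
    [WordListItemV2.VERSION, WordListItemV2]
  ]),
  fromJSON (json) {
    const Cls = this.classes.get(json.version)
    if (!Cls) throw new Error(`Unknown WordListItem version ${json.version}`)
    return new Cls(json)
  }
}

const oldItem = WordListItemFactory.fromJSON({ version: '1.0', target: 'lupus' })
const newItem = WordListItemFactory.fromJSON({ version: '2.0', target: 'lupus', important: true })
```

This also gives a newer application a clean way to read data saved by an older version: the V1 class simply remains registered in the factory.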
For comparing data objects, there is an object-hash library (https://github.com/puleos/object-hash) that computes hashes for JS objects. I have not used it personally, but maybe this approach could have some benefits in some situations.
I have some thoughts here too.
We have different data to arrange locally and remotely:
1) homonym data (obtained from different remote requests; the user can't change it)
2) context and user data applied to the word (created locally - when the user assigns/removes the important flag, adds some notes, or selects the word from some context; this should be saved)
3) usage examples of the words (like the previous item, some context for the word, but it is loaded from remote and can't be changed)
4) user data for authentication - userID (loaded from remote)
We have several storages that should be synchronized somehow:
1) Remote User Data Storage (it should store the whole data for the user's wordlist across the webextension, embedded lib, PWA, and anything else) - if I understood Bridget correctly here
2) Local IndexedDB (it stores part of the data or the full data, according to the user's preferences)
3) Vuex storage (it stores only the data that should be placed in the UI)
4) UI data (the data that is actually visible in the UI)
5) Remote services (morph, lexical, usage examples, translations, authentication, and maybe something else)
All 5 items could change data in the first 4 items (per the previous list)
I think we need a central data Controller here (maybe DBSync or maybe simply DataSync) that will have some rules to sync data using the following conditions:
1) user preferences
2) online/offline mode
3) data part status (correct or outdated)
4) mobile/desktop mode
And it should have access to the Remote UserDatabase API, to IndexedDB methods, and to Vuex data update events - in both directions - write/read
And it should be able to be imported into the content part or the background part.
And we can't define an obvious priority between remote and local data, because some data has its source remotely and some has its source locally.
I agree - such a controller could become a really long, heavily coded file/environment.
And I think such a DataController should be used inside LexicalQuery
Similar to Bridget's proposal:
Each updated part of the data creates/updates a wordItem instance inside the wordList instance (with the current context data for the current lexical request); it is loaded into Vuex (and from Vuex to the UI components) and re-saved with the updated data both locally and remotely. Or maybe it returns data to the lexical request, which sends it to Vuex/UIController and WordListController.
And on first page load we need to load the current wordlist (after authorization). It could be the following workflow:
1) Send a request to DataSync
2) It checks the Remote User DB (for example, by a last-update datetime) and compares it to the local IndexedDB
3) Load the data from local if they are the same, or from remote if not
4) If loaded from remote, then request a merge with local
And when a user changes the data (places the important flag, for example, or adds a new context usage) or deletes it:
1) Send a request to DataSync
2) It applies the changes to Remote and to Local
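The change workflow above amounts to a write-through policy: every user edit goes through DataSync, which writes it to both stores. A rough sketch, with in-memory Maps standing in for the remote DB and IndexedDB, and an invented DataSync API:

```javascript
// Write-through sketch: user edits always hit both stores via DataSync
class DataSync {
  constructor (remoteDb, localDb) {
    this.remoteDb = remoteDb // authoritative copy
    this.localDb = localDb   // local cache
  }
  // e.g. the user toggles the "important" flag on a word item
  updateItem (word, changes) {
    const updated = Object.assign({}, this.localDb.get(word), changes)
    this.remoteDb.set(word, updated) // remote first, as the source of truth
    this.localDb.set(word, updated)  // keep the cache in step
    return updated
  }
  // deletions propagate the same way
  deleteItem (word) {
    this.remoteDb.delete(word)
    this.localDb.delete(word)
  }
}

const remoteDb = new Map([
  ['lupus', { target: 'lupus', important: false }],
  ['canis', { target: 'canis', important: false }]
])
const localDb = new Map(remoteDb)

const sync = new DataSync(remoteDb, localDb)
sync.updateItem('lupus', { important: true })
sync.deleteItem('canis')
```

A real implementation would need to queue the remote write when offline, but the invariant is the same: the UI never writes to either store directly.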
A user could also remove some wordItem, similar to the previous scenario
I think there could be very different scenarios that could be implemented one by one.
I think we should divide all sync procedures by the type of the data (similar to a lexical request):
1) user data
2) morph
3) lexical
4) translation
5) usage example
6) context data
7) additional data - important flag, session flag, some notes
And create sync rules for each one, inside some Controller, according to:
- user preferences
- online/offline mode
- data part status (correct or outdated)
- mobile/desktop mode
For versioning of JS objects such as WordListItem, it would probably be cleaner to integrate version info into their class names (i.e. have a separate class for each new version) rather than having a version field inside a class and some conditional logic in methods that rely on the version field's value. The latter can easily become convoluted. What do you think?
I am not sure how I feel about that. I guess another option here is to use Protocol Buffers (https://codeclimate.com/blog/choose-protocol-buffers/). It sounds like the solution they provide to data versioning issues in service interactions is similar to that of the GraphQL approach - i.e. by relying on only adding, not removing or changing, fields. As this came up for me while thinking about the exchange of data to/from the CRUD microservice for the remoteDb for the wordlists, it seems that maybe that is the problem Protocol Buffers were designed to address. Do either of you have experience with them?
I have some thoughts here too.
We have different data to arrange locally and remotely:
- homonym data (obtained from different remote requests; the user can't change it)
- context and user data applied to the word (created locally - when the user assigns/removes the important flag, adds some notes, or selects the word from some context; this should be saved)
- usage examples of the words (like the previous item, some context for the word, but it is loaded from remote and can't be changed)
- user data for authentication - userID (loaded from remote)
We have several storages that should be synchronized somehow:
- Remote User Data Storage (it should store the whole data for the user's wordlist across the webextension, embedded lib, PWA, and anything else) - if I understood Bridget correctly here
- Local IndexedDB (it stores part of the data or the full data, according to the user's preferences)
- Vuex storage (it stores only the data that should be placed in the UI)
- UI data (the data that is actually visible in the UI)
- Remote services (morph, lexical, usage examples, translations, authentication, and maybe something else)
All 5 items could change data in the first 4 items (per the previous list)
I think we need a central data Controller here (maybe DBSync or maybe simply DataSync) that will have some rules to sync data using the following conditions:
- user preferences
- online/offline mode
- data part status (correct or outdated)
- mobile/desktop mode
And it should have access to the Remote UserDatabase API, to IndexedDB methods, and to Vuex data update events - in both directions - write/read
And it should be able to be imported into the content part or the background part.
And we can't define an obvious priority between remote and local data, because some data has its source remotely and some has its source locally.
I agree - such a controller could become a really long, heavily coded file/environment.
There are some very good points here, and I think we need to be careful about the scope of this data controller: limit it to persistent data access and don't involve it in application state data. If we assume that for all persistent storage (including the local IndexedDB solution under "persistent", even if that is debatable) we will require user authentication, then we could call it UserDataSyncController or something like that.
On the point about being able to be imported into content or background, I will copy what I just put in the Slack discussion here:
For the different interfaces to IndexedDB in the webextension and the embedded lib, ideally I think this should work similarly to what we have already discussed needing for the Auth object. I.e., we need an abstraction that allows the rest of the application not to care whether this is happening in the background or the content side, and then an implementation of that abstraction that gets handed to the UIController's constructor.
I am not sure how I feel about that. I guess another option here is to use Protocol Buffers (https://codeclimate.com/blog/choose-protocol-buffers/). It sounds like the solution they provide to data versioning issues in service interactions is similar to that of the GraphQL approach - i.e. by relying on only adding, not removing or changing, fields. As this came up for me while thinking about the exchange of data to/from the CRUD microservice for the remoteDb for the wordlists, it seems that maybe that is the problem Protocol Buffers were designed to address. Do either of you have experience with them?
I have not worked with Protobuf, but heard good things about them. I think they should be nearly ideal for inter-service communications.
I probably misunderstood your point about WordListItem versioning. I was thinking you were talking about versioning it for use within an application (i.e. that we might have some modules/components that were using both V1 and V2 of it at the same time), not about transferring it over the network. 🙂
I think protobuf might be beneficial for storing data too, in some situations.
> There are some very good points here, and I think we need to be careful about the scope of this data controller, and limit it to persistent data accesses and not involve it in application state data. If we assume that for all persistent storage (including the local indexed db solution in "persistent" even if that is debatable) we will require user authentication, then we could call it UserDataSyncController or something like that.
It is important, in my opinion, that we don't end up with a huge do-it-all data controller, as it might grow into something that is hard to maintain. To avoid this we probably should:
- Clearly define responsibilities of data controller(s) and be vigilant not to expand beyond those boundaries.
- If we end up with those responsibilities having a wide span, we should separate the controller into several modules. For example, we could have a persistence module that will be responsible for storing data, and it could have IndexedDB and remote storage sub-modules. Merging data can have some non-trivial logic and can probably be separated into its own module too. So the whole thing would be a combination of small and specialized modules. Such modules are easier to upgrade and test.
> I probably misunderstood your point about WordListItem versioning. I was thinking you was talking about versioning it for using within an application (i.e. that we might have some modules/components that were using both V1 and V2 of it at the same time), not about transferring it over the network.
Ah yes, sorry I wasn't clear about that. I don't think that a single version of the application should be actively trying to save multiple versions at the same time, but it might need to be able to read older versions. That is, a newer version of the application shouldn't break if it encounters data that was saved by an older version.
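For example, a newer application version could upgrade older stored records on read, so encountering old data never breaks it. This is a hedged sketch only; the field names and version numbers are hypothetical, not the actual WordListItem schema:

```javascript
// Per-version upgrade functions; each one lifts a record to the next version.
// Hypothetical example: v1 stored the word as `text`, v2 renamed it to `targetWord`.
const migrations = {
  1: (item) => ({ ...item, version: 2, targetWord: item.text })
  // future migrations (2 -> 3, etc.) would be added here without
  // removing the older ones
}

// Apply migrations repeatedly until the record is at the current version
function upgradeWordItem (item) {
  let current = item
  while (migrations[current.version]) {
    current = migrations[current.version](current)
  }
  return current
}

const legacy = { version: 1, text: 'cupidinibus', languageCode: 'lat' }
const upgraded = upgradeWordItem(legacy)
// upgraded now has version 2 and a targetWord field
```

Because migrations are only ever appended, a single code path can read data saved by any older release.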
> It is important, on my opinion, that we won't end up with a huge do-it-all data controller as it might grow into something that is hard to maintain. To avoid this we probably should:
>
> - Clearly define responsibilities of data controller(s) and be vigilant not to expand beyond those boundaries.
> - If we end up with those responsibilities having a wide span, we should separate the controller into several modules. For example, we could have a persistence module that will be responsible for storing data, and it could have IndexDB and remote storage sub-modules. Merging data can have some non-trivial logic and can probably be separated into its own module too. So the whole thing would be a combination of small and specialized modules. Such modules are easier to upgrade and test.
Agree with these points.
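To make the idea concrete, the modular split described in those points might look something like this. All class and method names are hypothetical, and the merge rule shown (remote wins on conflict) is just one possible policy:

```javascript
// Small, specialized modules composed by a thin sync controller.

class LocalStorageModule {
  // In a real implementation this would wrap IndexedDB
  async getItems () { return this.items || [] }
}

class RemoteStorageModule {
  // In a real implementation this would call the remote CRUD service
  async getItems () { return this.items || [] }
}

class MergeModule {
  // Merge logic isolated in its own module; here we de-duplicate by
  // `targetWord`, letting remote entries override local ones
  static merge (localItems, remoteItems) {
    const byWord = new Map()
    for (const item of [...localItems, ...remoteItems]) {
      byWord.set(item.targetWord, item)
    }
    return Array.from(byWord.values())
  }
}

class UserDataSyncController {
  constructor (local, remote) {
    this.local = local
    this.remote = remote
  }

  async getWordList () {
    const [localItems, remoteItems] = await Promise.all([
      this.local.getItems(),
      this.remote.getItems()
    ])
    return MergeModule.merge(localItems, remoteItems)
  }
}
```

Each module can then be tested and upgraded on its own, and swapping the merge policy or a storage backend doesn't touch the controller.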
This was implemented in the 3.0 release. Future work on user data management will be discussed separately.
The following is a proposal for the application architecture design for managing user data. The need is to have a way to work with data sources efficiently locally, while keeping data in sync across multiple application instances.
The requirements for the user word-in-context lists are used as the example use case here, but the idea is to develop an architecture which is flexible enough to handle various data types and data sources, and which works across applications (Webextension, Embedded Library, etc.).
For example, a user might do lookups on both a mobile device and on the desktop and each should be updating the user's wordlist to add the words as they are looked up. Similar requirements will be in place for user preferences and other sorts of user data.
I'm proposing a design which uses:
In the above diagram, some of the steps are represented as synchronous when they will need to be asynchronous but the basic flow is this:
- [001] - [002] Upon application initialization, controllers subscribe to events which interact with data
- [003] User requests a word list display by clicking a button on the word list tab
- [004] Wordlist Vue component requests Wordlist data from the UIController
- [004] UIController delegates the request to the WordListController
- [005] WordListController requests data from a UserDataQuery object
- [006] UserDataQuery object requests data from the DBSyncController
- [008] - [020] DBSyncController interacts with remote and local data sources to retrieve and merge data (here, the assumption is that we might have a ProtectedClientAdapter which knows how to interact with data sources which require authentication. Exact details of that still need to be worked out, but the idea is to isolate the business logic around authentication/authorization from that of managing and merging data sources -- in other scenarios the DBSyncController could use the regular ClientAdapter to retrieve data from non-protected sources)
- [021] DBSyncController returns the fully merged data set to the WordListController
- [022] - [023] WordListController instantiates the WordList data model objects and supplies them to the UIController
- [024] UIController updates the data sent to the WordList view
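The delegation chain in steps [004] - [022] might look roughly like this. Class names follow the diagram, but the method names and signatures are assumptions:

```javascript
// Hypothetical sketch of the retrieval chain:
// WordListController -> UserDataQuery -> DBSyncController

class UserDataQuery {
  constructor (dbSyncController) {
    this.dbSync = dbSyncController
  }

  // [006]: delegate retrieval to the DBSyncController, which handles
  // retrieving and merging local and remote sources ([008] - [020])
  async getData (dataType, params) {
    return this.dbSync.retrieve(dataType, params)
  }
}

class WordListController {
  constructor (dbSyncController) {
    this.query = new UserDataQuery(dbSyncController)
  }

  // [005]: request data through a UserDataQuery object
  async getWordList (languageCode) {
    const records = await this.query.getData('WordList', { languageCode })
    // [022]: build the WordList data model from the merged records
    return { languageCode, items: records }
  }
}
```

The WordListController never touches storage directly, so the same code works whether the DBSyncController ends up in the content or background context.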
The WordList can also be updated by events which are not specific requests to the WordList view component. For example, the requirements call for every word being looked up to be added to the user's word list. In [001] the WordListController subscribes to the MORPH_DATA_READY event which happens upon word lookup. The UIController might also subscribe to a WORDLIST_DATA_READY event which happens whenever WordList data is updated.
- [026] User initiates a word lookup
- [027] UIController requests data from the LexicalQuery
- [028] LexicalQuery publishes its MORPH_DATA_READY event
- [029] WordListController receives the MORPH_DATA_READY event, updates the WordList data model object and then initiates a request to the DBSyncController to store the updated data
- [030] - [043] The DBSyncController interacts with the remote and local data stores to update the data (in reality the update events would probably be asynchronous but they are shown synchronously in the diagram)
- [044] WordListController publishes a [WORDLIST_DATA_UPDATE] event
- [045] UIController receives the [WORDLIST_DATA_UPDATE] event and updates the WordList view accordingly so that when the user accesses it next it is up to date
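The event wiring for this flow could be set up along these lines. The event bus and all names here are stand-ins to illustrate the subscribe/publish cycle, not the actual Alpheios implementation:

```javascript
// Minimal publish/subscribe bus standing in for the real event mechanism
class EventBus {
  constructor () { this.handlers = {} }
  subscribe (event, fn) {
    (this.handlers[event] = this.handlers[event] || []).push(fn)
  }
  publish (event, data) {
    (this.handlers[event] || []).forEach(fn => fn(data))
  }
}

class WordListController {
  constructor (bus, dbSync) {
    this.bus = bus
    this.dbSync = dbSync
    this.wordList = []
    // [001]: subscribe at application initialization
    bus.subscribe('MORPH_DATA_READY', (homonym) => this.onMorphDataReady(homonym))
  }

  async onMorphDataReady (homonym) {
    // [029]: update the WordList model, then persist via the DBSyncController
    this.wordList.push({ targetWord: homonym.targetWord })
    await this.dbSync.update('WordList', this.wordList)
    // [044]: notify subscribers (e.g. the UIController) of the change
    this.bus.publish('WORDLIST_DATA_UPDATE', this.wordList)
  }
}
```

With this shape, the lookup path ([026] - [028]) never needs to know the word list exists; the WordListController reacts to MORPH_DATA_READY on its own.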
The DBSyncController could implement different approaches to synchronizing with the remote data store depending upon where the code is running. If in a PWA, for example, it could use ServiceWorkers and BackgroundSync to queue up requests when the user is offline, options which are not currently available to the Webextension.
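An environment-neutral sketch of the queuing idea is below. In a PWA, the flush would be triggered from a service worker `sync` event registered via Background Sync; the Webextension would need a different trigger (e.g. an `online` listener). The class and parameter names are hypothetical:

```javascript
// A request queue the DBSyncController could maintain while offline.
class OfflineQueue {
  constructor () {
    this.pending = []
  }

  // Called when a remote update fails because the user is offline
  enqueue (request) {
    this.pending.push(request)
  }

  // Called when connectivity returns (from a Background Sync handler in a
  // PWA, or an 'online' listener in the Webextension); replays in order
  async flush (send) {
    while (this.pending.length > 0) {
      const request = this.pending[0]
      await send(request) // only dequeue after a successful send
      this.pending.shift()
    }
  }
}
```

Keeping the queue itself environment-neutral means only the small trigger layer differs between the PWA and Webextension builds.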