alpheios-project / documentation

Alpheios Developer Documentation

CEDICT integration with ClientAdapters and Components #18

Closed kirlat closed 4 years ago

kirlat commented 4 years ago

I was thinking about the best architecture for the CEDICT service and its integration into the application. Below are some ideas that came to mind.

ClientAdapters communicate with remote lexical services. CEDICT is a service of the same kind, just run locally, so I think it would be natural for all communication with it to go through ClientAdapters.

ClientAdapters have an excellent, flexible architecture that makes it easy to add new adapters. So, in order to support Chinese queries, all we need to do is add a CEDICT adapter. We already have one, in fact.

The CEDICT service runs locally, within an iframe. We need to be able to control whether it runs (some app configurations may not need Chinese data), so we should be able to start and stop it, preferably dynamically.

I think the best approach will be to associate a service with a data module in components. The module will know how to check whether the service is running and how to start or stop it. If an app needs CEDICT functionality, it registers the module; if not, it omits that step. Starting the CEDICT service means creating an iframe with the CEDICT content behind the scenes. To stop the service, the iframe needs to be destroyed.
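A minimal sketch of such a module, assuming the iframe creation is hidden behind an injected factory. The names `LocalServiceModule`, `ServiceHandle`, and `createService` are hypothetical and are not part of the existing components code.

```typescript
// Hypothetical handle for a running local service.
// In a browser it would wrap the iframe element that hosts the service.
interface ServiceHandle {
  destroy(): void;
}

class LocalServiceModule {
  private handle: ServiceHandle | null = null;
  private readonly createService: () => ServiceHandle;

  constructor(createService: () => ServiceHandle) {
    this.createService = createService;
  }

  // True while the service (the iframe) exists.
  get isRunning(): boolean {
    return this.handle !== null;
  }

  // Starting is idempotent: a second call does not create a second service.
  start(): void {
    if (this.handle === null) {
      this.handle = this.createService();
    }
  }

  // Stopping destroys the service and forgets the handle.
  stop(): void {
    if (this.handle !== null) {
      this.handle.destroy();
      this.handle = null;
    }
  }
}

// Usage with a fake factory that only counts live services.
let liveServices = 0;
const cedictModule = new LocalServiceModule(() => {
  liveServices += 1;
  return { destroy: () => { liveServices -= 1; } };
});

cedictModule.start();
cedictModule.start(); // no-op, the service is already running
cedictModule.stop();
```

In a browser, the factory could create a hidden iframe with `document.createElement("iframe")`, append it to `document.body`, and return a handle whose `destroy()` calls `iframe.remove()`. An app that does not need Chinese data would simply never register the module.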

To communicate with the CEDICT service one has to post a cross-domain message. Messages are event-based. We can, however, wrap them in a promise-based interface by adding a messaging service between the client and the CEDICT service, similar to the one we use for messages to background scripts. A CEDICT client adapter will call the messaging service and receive a promise in return that will be either resolved with lexical data or rejected.
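One way to wrap event-based messages in promises is to tag every request with an ID and keep a map of pending promises. This is only a sketch: the message shapes and the `PromiseMessagingService` name are assumptions, not the format used by the existing background-script messaging.

```typescript
// Assumed message shapes; the real CEDICT protocol is not defined in this thread.
interface RequestMessage { requestId: number; body: unknown; }
interface ResponseMessage { requestId: number; ok: boolean; body: unknown; }

// Whatever delivers a message to the service,
// e.g. a wrapper around iframe.contentWindow.postMessage in a browser.
type SendFn = (message: RequestMessage) => void;

interface PendingRequest {
  resolve: (value: unknown) => void;
  reject: (reason: Error) => void;
}

class PromiseMessagingService {
  private nextId = 1;
  private readonly pending = new Map<number, PendingRequest>();
  private readonly send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  // Sends a request; the promise settles when the matching response arrives.
  request(body: unknown): Promise<unknown> {
    const requestId = this.nextId++;
    return new Promise<unknown>((resolve, reject) => {
      this.pending.set(requestId, { resolve, reject });
      this.send({ requestId, body });
    });
  }

  // Call this from the "message" event listener for every incoming response.
  handleResponse(response: ResponseMessage): void {
    const entry = this.pending.get(response.requestId);
    if (entry === undefined) return; // unknown or already settled request
    this.pending.delete(response.requestId);
    if (response.ok) entry.resolve(response.body);
    else entry.reject(new Error(String(response.body)));
  }
}

// Usage with a stand-in for the iframe that replies asynchronously.
const messaging: PromiseMessagingService = new PromiseMessagingService((message) => {
  Promise.resolve().then(() => {
    messaging.handleResponse({
      requestId: message.requestId,
      ok: true,
      body: `entry for ${String(message.body)}`
    });
  });
});

messaging.request("好").then((data) => console.log(data));
```

In a browser, `handleResponse` would be called from a `window.addEventListener("message", ...)` handler after checking `event.origin`. A real implementation would probably also want a timeout so that a request to a stopped service does not stay pending forever.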

The whole architecture could look like this:

[image: architecture diagram]

Upon activation, a webextension or an embed-lib UI controller creates an instance of the Local Lexical Services Data Module. This triggers the initialization of CEDICT client services. If Chinese lexical data is requested, the application creates a lexical query in the usual way. It then calls client adapters, and the request goes to the CEDICT client service via the CEDICT client adapter and the messaging service.

If this architecture proves worthy, we can move all functionality of client adapters into the aforementioned data module. Requesters will then be able to call methods of the data module directly instead of going through lexical queries and the UI controller. The data module will use ClientAdapters behind the scenes, acting as a wrapper around the services that ClientAdapters provide. The big advantage of this approach is that it can simplify the architecture significantly: the modules can be called directly, without going through a UI controller as they do now. Also, because the module has its own Vuex store, it can publish results directly into the store, where they can be monitored and accessed by all interested parties.

@balmas, @irina060981: how does all this look to you? Do you have any comments or suggestions?

balmas commented 4 years ago

I think this is very interesting. I don't fully understand what the architecture would look like once the client adapters are moved into the data module.

I was thinking today about another service that might benefit from the iframe/IndexedDB message passing architecture: the way we manage tree diagrams. Currently we use a normal ajax service call to get the morphology data, and then separately load a viewer, which has access to the same data (but from a different source), in the diagram panel in an iframe. With some changes to that JavaScript-based viewer app (Arethusa) we could get the data for the lexical query from the same source that populates the app. In this case, the tbAdapter module of client-adapters could be changed to use the message passing service rather than the HTTP call.

So, I think an architecture which allows client adapter data sources to be either remote ajax calls or iframe message passing could be very useful, particularly if we can decide based on user preferences whether to use one or the other approach for the same source. For example, where there is not enough local memory, such as on a mobile device, we could get data via ajax calls, and on the desktop use the iframe/IndexedDB approach.
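A rough sketch of how such a switch might look. The `ClientEnvironment` fields and the memory threshold are invented for illustration; the thread only mentions "user preferences" and "not enough local memory" without defining either.

```typescript
type Transport = "remote" | "local";

// Hypothetical description of the client environment.
interface ClientEnvironment {
  userPreference?: Transport; // explicit user choice, if any
  deviceMemoryGb?: number;    // approximate device memory, if known
}

// Assumed threshold, for illustration only.
const MIN_MEMORY_FOR_LOCAL_GB = 4;

// An explicit user preference wins; otherwise low-memory devices go remote.
function chooseTransport(env: ClientEnvironment): Transport {
  if (env.userPreference !== undefined) return env.userPreference;
  if (env.deviceMemoryGb !== undefined && env.deviceMemoryGb < MIN_MEMORY_FOR_LOCAL_GB) {
    return "remote";
  }
  return "local";
}

// Both transports expose the same call signature, so the adapter
// does not need to know which one it was given.
type Fetcher = (word: string) => Promise<string>;

function makeDataSource(env: ClientEnvironment, sources: Record<Transport, Fetcher>): Fetcher {
  return sources[chooseTransport(env)];
}

// Usage with two stand-in data sources.
const exampleSources: Record<Transport, Fetcher> = {
  remote: async (word) => `remote data for ${word}`, // would be an ajax call
  local: async (word) => `local data for ${word}`    // would be iframe message passing
};

const mobileSource = makeDataSource({ deviceMemoryGb: 2 }, exampleSources);
mobileSource("word").then((data) => console.log(data));
```

Because both sources share one signature, the same adapter code can serve either transport and the decision stays in a single place.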

irina060981 commented 4 years ago

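I didn't fully understand how client-adapters would work via data modules and Vuex. But am I right that, this way:

- it would be tightly connected with components and Vue.js, and wouldn't be available without them?
- you suggest not separating the Chinese service from ClientAdapters? So a remote service would be remote, but a local service would be integrated with the adapter?
- we would completely lose the ability to use ClientAdapters inside unit tests? (we are not using data modules directly in tests)
- lexicalQuery also has to check the order, some settings, and the ability to execute ClientAdapters methods. Do you suggest moving that to ClientAdapters too?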

From my point of view (maybe it is a legacy view, of course), different parts should be as independent of each other as possible and as independent of the platform as possible. ClientAdapters currently has the following interface:

- input: adapter name, adapter method, adapter params
- output: formatted result

The task was to create it this way. It can be used without the components repo at all (that's why it can easily be used inside any tests). It destroys its instance after execution (this way I was trying to avoid memory leaks).

I think that we really have to create an architecture for the local services: Chinese, inflections, and storing data to IndexedDB. Am I right that we are not able to create it as a fully independent service, with input and output defined by some protocol, and connect it to ClientAdapters or any other requester? And then create an additional data module that could receive any additional events from the services (similar to definitions) and publish them to components if they need them?

kirlat commented 4 years ago

I will provide some answers first and then try to come up with a diagram of what the whole architecture will look like if ClientAdapters are used by the data module.

> it would be tightly connected with components and Vue.js? and wouldn't be available without them?

In my opinion, that would be bad, and we should avoid tight coupling of the inner parts of ClientAdapters with modules, Vuex, or any other architecture-specific solutions. I think ClientAdapters should stay as abstract as they are now.

> you suggest not to separate chinese service from the ClientAdapters? So remote service would be remote, but local service would be integrated with the adapter?

I think the Chinese service itself should be separate from ClientAdapters, the same way other services are. Let's take any remote service as an example. An adapter for this service knows the URL of the remote server that provides the service, or it receives that URL as a parameter. It also knows in what format to send a message to the service and what to expect back in response.

I think we can use a similar approach with local services. I suggest having a promise-based messaging service in front of the CEDICT local service. The messaging service object will be provided by the CEDICT service itself. So, in order to communicate with the CEDICT service, a CEDICT adapter in ClientAdapters must have a reference to its messaging service (it will probably receive one during instantiation as a config parameter, so there will be no tight coupling there) and know in what format to send data to the CEDICT service and what will be sent back from it.

> we will completely lose the ability to use ClientAdapters inside unit tests? (we are not using data modules directly in tests)

I think that we will be able to test ClientAdapters independently, the way we do now. Nothing will change for them. The only new thing will be a CEDICT adapter that talks to the CEDICT messaging service. For testing we can use a mock of the messaging service or the real one.
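To illustrate, here is a sketch of an adapter that receives the messaging service as a config parameter, together with a mock that could replace it in a unit test. All names and the request/response formats (`CedictAdapterSketch`, `MessagingLike`, `CedictEntry`, the `lookup` request type) are assumptions made for this example and do not describe the existing CEDICT adapter.

```typescript
// The only thing the adapter needs from the messaging service (assumed shape).
interface MessagingLike {
  request(body: unknown): Promise<unknown>;
}

// Hypothetical response format of the CEDICT service.
interface CedictEntry {
  headword: string;
  pinyin: string;
  definitions: string[];
}

class CedictAdapterSketch {
  private readonly messaging: MessagingLike;

  // The messaging service is injected, much like a URL is for a remote adapter.
  constructor(config: { messaging: MessagingLike }) {
    this.messaging = config.messaging;
  }

  async lookup(word: string): Promise<CedictEntry[]> {
    const response = await this.messaging.request({ type: "lookup", word });
    if (!Array.isArray(response)) {
      throw new Error("Unexpected response from the CEDICT service");
    }
    return response as CedictEntry[];
  }
}

// Mock messaging service for tests: records requests and returns sample data.
const recordedRequests: unknown[] = [];
const mockMessaging: MessagingLike = {
  request: async (body: unknown) => {
    recordedRequests.push(body);
    return [{ headword: "你好", pinyin: "ni3 hao3", definitions: ["hello"] }];
  }
};

const adapterUnderTest = new CedictAdapterSketch({ messaging: mockMessaging });
adapterUnderTest.lookup("你好").then((entries) => console.log(entries.length));
```

Since the adapter depends only on the `request` method, the test needs neither an iframe nor a data module.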

> lexicalQuery has also to check the order, some settings and ability to execute of ClientAdapters methods - do you suggest to move it to ClientAdapters too?

I'm not sure about that. Maybe we can move that to the data module. Maybe we can keep using a Lexical query the way we are now. The data module, upon receiving a lexical request, can spawn a lexical query on its own. What do you think?

> From my point of view (may be it is legacy of course) that different parts should be as much independent between each other as possible and should be as much independent from the platform as possible.

Totally agree with that. I think we'll be able to keep it this way.

> ClientAdapters for now has the following abilities: income - adapter name, adapter method, adapters params; outcome - formatted result

I think we can keep it almost unchanged. For CEDICT we'll need to provide a reference to the CEDICT messaging service (that shall be the only thing different from the remote adapters). We can do that during an initialization of ClientAdapters. This reference is, from my point of view, somewhat similar to the URL of a remote service: it is an address (just an address in memory in this case) of a place which provides the service.

> And create an additional data-module - that could get any additional events from services (similar to definitions) and publish them to components if they need it?

We can do that if we would like to.

I will follow up with the diagram shortly.

kirlat commented 4 years ago

This is how the final architecture might look, in my opinion:

[image: architecture diagram]

A lexical services data module serves as a wrapper around ClientAdapters. It can take over from the UI controller the part of its functionality that works with lexical data. In my opinion, this will be a good thing, because the UI controller is overloaded with functionality at the moment; splitting it up will make the architecture more modular.

So, instead of calling getSelectedText() of the UI controller, Vue components will call a similar method on the data module directly. The data module may also instantiate and interact with the word list controller (but that's subject to debate, I think).

I think all Vue components will benefit from interacting with the lexical services data module, as it publishes results right into the Vuex store. Other components may still interact with ClientAdapters directly, if they need to. So I would say that the lexical services data module is a way to conveniently expose the requests provided by ClientAdapters to the Vue.js world.
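The idea can be sketched roughly as follows. To keep the example self-contained, it uses a tiny hand-written store as a stand-in for Vuex; `TinyStore`, `LexicalServicesModule`, and the state shape are hypothetical and are not the actual components API.

```typescript
type LexicalStatus = "idle" | "pending" | "ready" | "failed";

interface LexicalState {
  status: LexicalStatus;
  result: unknown;
}

type Listener = (state: LexicalState) => void;

// Minimal stand-in for a Vuex store: holds state and notifies subscribers.
class TinyStore {
  private state: LexicalState = { status: "idle", result: null };
  private readonly listeners: Listener[] = [];

  getState(): LexicalState {
    return this.state;
  }

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  commit(next: LexicalState): void {
    this.state = next;
    for (const listener of this.listeners) listener(next);
  }
}

// Stand-in for a call into ClientAdapters.
type AdapterCall = (word: string) => Promise<unknown>;

class LexicalServicesModule {
  readonly store = new TinyStore();
  private readonly callAdapter: AdapterCall;

  constructor(callAdapter: AdapterCall) {
    this.callAdapter = callAdapter;
  }

  // Runs a lookup and publishes every state change to the store.
  async lookup(word: string): Promise<void> {
    this.store.commit({ status: "pending", result: null });
    try {
      const result = await this.callAdapter(word);
      this.store.commit({ status: "ready", result });
    } catch {
      this.store.commit({ status: "failed", result: null });
    }
  }
}

// Usage: a component subscribes to the store instead of awaiting the call.
const lexicalModule = new LexicalServicesModule(async (word) => `data for ${word}`);
lexicalModule.store.subscribe((state) => console.log(state.status));
lexicalModule.lookup("verbum");
```

Components that are not part of the Vue.js world could still call the adapter function directly and ignore the store.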

What would you say about an approach like that?

irina060981 commented 4 years ago

I think that it sounds interesting.

I believe that you are talking mostly about

I think it is a good way to develop the project.

The only problem I can see here is RAM usage. Today I was testing Chinese with the webextension: 3 tabs from different sources with the extension, plus localhost with embed-lib:

http://www.cbeta.org/node/5562
https://ctext.org/han-shi-wai-zhuan/juan-ba
https://en.wikibooks.org/wiki/Chinese_(Mandarin)/Lesson_7

Altogether (in Google Chrome) this added 600 MB of RAM to the existing usage, bringing it to 1.1 GB. Then I looked up a Chinese word and it rose to 1.5 GB. In parallel I was rebuilding components, and it took 2 GB of RAM (that was the most surprising part for me).

I finally noticed all this when it crashed my tracker with a message saying there were too few system resources :)

balmas commented 4 years ago

Thank you for the revised diagram, this helps a lot.

A few thoughts:

1) I think it would be good if we can implement the async messaging service (the one that wraps the post/receive messages from the local service as promises) as an abstract service, rather than making it specific to the CEDICT service.

2) On the question of where the business logic for interacting with the client adapters for a lexical query should lie (@irina060981 's question: "lexicalQuery has also to check the order, some settings and ability to execute of ClientAdapters methods - do you suggest to move it to ClientAdapters too?") -- ultimately I think this probably belongs somehow in the language model and able to be controlled by application and user preferences. Different languages and resources impose different requirements here. For example, right now, for Persian, we prefer the results from the lexicon over the morphology service, so we skip the first query to the morphology service and loop back to it if we don't get results from the lexicon service. For Latin and Greek, we may want to add in another service which is able to parse proper names, etc.

3) Regarding RAM usage -- I do think we need to keep an eye on this. As I mentioned in my previous comment, it might be helpful to be able to switch between local and remote data services according to the capabilities and preferences of the client environment.

On the whole, I think this architecture is going in the right direction, just would like to keep an eye on these points as we move forward. Thanks!

kirlat commented 4 years ago

> I think it would be good if we can implement the async messaging service (the one that wraps the post/receive messages from the local service as promises) as an abstract service, rather than making it specific to the CEDICT service.

We have such a messaging service implemented for communication with background scripts, and we can take it as a base. Agree that we should make it generic.

> On the question of where the business logic for interacting with the client adapters for a lexical query should lie (@irina060981 's question: "lexicalQuery has also to check the order, some settings and ability to execute of ClientAdapters methods - do you suggest to move it to ClientAdapters too?") -- ultimately I think this probably belongs somehow in the language model and able to be controlled by application and user preferences. Different languages and resources impose different requirements here. For example, right now, for Persian, we prefer the results from the lexicon over the morphology service, so we skip the first query to the morphology service and loop back to it if we don't get results from the lexicon service. For Latin and Greek, we may want to add in another service which is able to parse proper names, etc.

Agree that the best place for such logic is within the language model. I am just not sure whether the language model should call methods on client adapters directly: that would create tight coupling between them. But that might not necessarily be a bad thing, especially if we remove such coupling from the components. I need to think about it a little.

> Regarding RAM usage -- I do think we need to keep an eye on this. As I mentioned in my previous comment, it might be helpful to be able to switch between local and remote data services according to the capabilities and preferences of the client environment.

Agree that we should monitor it closely. Also agree that we should provide flexible options for where data is stored and where it is retrieved from. That way we can optimize for speed, RAM usage, or whatever else is required.

balmas commented 4 years ago

> I am just not sure if the language model shall call methods on client adapters directly:

I would prefer that we come up with something that allows the language model to create something like a set of instructions that could get passed to the client adapter via the lexical data module, rather than the language module interfacing with client adapters directly.
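One possible shape for such instructions, sketched under heavy assumptions: the language model returns plain data describing which adapters to query and in what order, and the data module interprets it. The `LookupStep` format, the `runPlan` function, and the plans themselves are invented for illustration. The Persian plan follows the lexicon-first behavior described earlier in this thread, while the Latin plan is only a placeholder.

```typescript
// One step of a lookup plan: which adapter to call, and under what condition.
interface LookupStep {
  adapter: string;
  onlyIfNoResults?: boolean; // run this step only if earlier steps found nothing
}

// Stand-in for a client adapter method.
type AdapterFn = (word: string) => Promise<string[]>;

// Plans produced by the language model: plain data, no references to adapters.
const lookupPlans: Record<string, LookupStep[]> = {
  // Lexicon first; fall back to morphology only if the lexicon has no results.
  persian: [{ adapter: "lexicon" }, { adapter: "morphology", onlyIfNoResults: true }],
  // Illustrative placeholder: morphology first, then lexicon.
  latin: [{ adapter: "morphology" }, { adapter: "lexicon" }]
};

// The data module interprets the plan and is the only part that touches adapters.
async function runPlan(
  plan: LookupStep[],
  adapters: Record<string, AdapterFn>,
  word: string
): Promise<string[]> {
  let results: string[] = [];
  for (const step of plan) {
    if (step.onlyIfNoResults === true && results.length > 0) continue;
    const adapter = adapters[step.adapter];
    if (adapter === undefined) {
      throw new Error(`Unknown adapter: ${step.adapter}`);
    }
    results = results.concat(await adapter(word));
  }
  return results;
}

// Usage with stand-in adapters.
const exampleAdapters: Record<string, AdapterFn> = {
  lexicon: async (word) => (word === "known" ? [`lexicon entry for ${word}`] : []),
  morphology: async (word) => [`morphology for ${word}`]
};

runPlan(lookupPlans.persian, exampleAdapters, "known").then((r) => console.log(r));
```

Because a plan is plain data, the language model never imports or calls an adapter, and application or user preferences could adjust the plan before it is run.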

kirlat commented 4 years ago

To reduce risks, I think we could take a staged approach to that. We can implement it in the following steps:

  1. Implement the CEDICT language service within an iframe and a messaging service for it.
  2. Add a lexical services data module that will just start and stop the CEDICT language service (for now).
  3. Update the CEDICT adapter to work with the local CEDICT language service. Switch to using the local CEDICT language service.
  4. Move all current lexical data related functionality from a UI controller to the data module.
  5. Refactor the business logic of the lexical data and move this logic into the language model.

Would a plan like that be good?

kirlat commented 4 years ago

> I would prefer that we come up with something that allows the language model to create something like a set of instructions that could get passed to the client adapter via the lexical data module, rather than the language module interfacing with client adapters directly.

That would be an ideal solution. I don't know if we'll be able to come up with something flexible and not crazy complex at the same time, but that's a good area to put our mental effort into, in my opinion.

balmas commented 4 years ago

> To reduce risks, I think we could take a staged approach to that. We can implement it in the following steps:
>
>   1. Implement the CEDICT language service within an iframe and a messaging service for it.
>   2. Add a lexical services data module that will just start and stop the CEDICT language service (for now).
>   3. Update the CEDICT adapter to work with the local CEDICT language service. Switch to using the local CEDICT language service.
>   4. Move all current lexical data related functionality from a UI controller to the data module.
>   5. Refactor the business logic of the lexical data and move this logic into the language model.
>
> Would a plan like that be good?

that sounds good to me

balmas commented 4 years ago

implemented in release 3.3.0