opensrp / opensrp-server

OpenSRP backend
https://smartregister.atlassian.net/wiki/display/Documentation/OpenSRP+Developer%27s+Guide

Syncing requirements #72

Open mberg opened 9 years ago

mberg commented 9 years ago

When evaluating potential new approaches for syncing, there are some requirements we will want to support.

mberg commented 9 years ago

Explore this:

http://developer.couchbase.com/mobile/develop/guides/sync-gateway/

mberg commented 9 years ago

https://cloudant.com/blog/introducing-cloudant-sync/#.VbcXceiqqko

mberg commented 9 years ago

https://github.com/cloudant/sync-android

ndisha commented 9 years ago

Some of the suggestions that Matt passed on today and the possible fixes discussed:

  1. @maimoonak suggested we use locationID to download data rather than using the anmID
  2. @mberg suggested we sync the database details (just what changed) rather than the whole form submission. This means Ziggy will remain on the client side to map form submissions to form definitions. In summary, we will:

-use ziggy for entity mapping relationship

-we can explore merging the mother, child and ec tables (they have similar details)

-we are exploring synching of databases

-this will lead to changing the ziggy architecture on client for the syncing part

Can we have a history of the activities that happened for this client? Would these be like a report of sorts? Where do we view them, on the client side or the server side? We should explore an approach where we use Couch syncing to update a master table and use filters to determine which data the user has access to. Check the email that @koros sent and keep the conversation going.

@sohelsarder @raihan-mpower @euclidian @dimasciput @cagulas Can we put our concerns here?

sohelsarder commented 9 years ago
  1. Data download should be possible by locationID, by anmID, or by both, which would add more download flexibility.
  2. Regarding the Ziggy architecture changes, my understanding so far is that Ziggy is doing:

    1. Entity mapping (according to the form definition.json)
    2. Maintaining entity relationships (according to the entity relationship json)

    In other words, Ziggy dictates which part of a submission goes into which field of which part of the database. On the client side, Ziggy parses Enketo collections, loads them into SQLite tables, and maintains form submission sync with the OpenSRP server. On the server side it does almost the same thing, but there it parses form submissions and loads them into CouchDB. So from a very generic view it is simply mapping values to entity class properties and parsing data, which we could easily replace with Java classes instead of relying on JavaScript that is sometimes difficult for developers to understand and change. It would be great if we could get rid of that mysterious part.

    Before starting that implementation, however, we need to ensure the following: there must be at least a stable version of the code in the repository, so that other projects can fork it and continue their ongoing implementation before the Ziggy replacement is done, and the replacement must be done in such a way that these forked versions can easily converge with the mainstream without a huge migration effort. We also need to keep in mind the other priorities of the platform that we need to implement, such as:

  3. Passing schedule data into OpenMRS (ensuring no data is lost in transmission between OpenSRP and OpenMRS).
  4. Robust error handling and error reporting for OpenSRP-to-OpenMRS transmission, so that OpenSRP developers who integrate OpenMRS with minimal understanding of OpenMRS internals can troubleshoot errors from within the OpenSRP application scope.
  5. Multimedia data transmission: the server part is not ready yet.
  6. Enriching the OpenSRP API module for easy integration with other external systems.
  7. Finding other smart alternatives for data syncing (e.g. checking the prospects of cloud sync).

These are the points that come to my mind so far; others ( @maimoonak @raihan-mpower @euclidian @dimasciput @cagulas @mberg ) can raise their points as well. I think we need a balanced decision here, so that it minimally impacts the development milestones of dependent projects and the other platform-specific priorities.
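To illustrate the kind of field mapping described above, here is a minimal Python sketch. It is not Ziggy's actual implementation: the layout of `FORM_DEFINITION`, the table and column names, and the function `map_submission` are all hypothetical, chosen only to show that the mapping step can be expressed as plain data plus a short loop.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical form definition: form field name -> (table, column).
# The real form definition.json has its own structure; this is an assumption.
FORM_DEFINITION: Dict[str, Tuple[str, str]] = {
    "mother_name": ("mother", "name"),
    "edd": ("mother", "expected_delivery_date"),
    "child_name": ("child", "name"),
}


def map_submission(fields: Dict[str, str],
                   definition: Dict[str, Tuple[str, str]] = FORM_DEFINITION
                   ) -> Dict[str, Dict[str, str]]:
    """Group submitted field values into one row per target table."""
    rows: Dict[str, Dict[str, str]] = defaultdict(dict)
    for field_name, value in fields.items():
        target = definition.get(field_name)
        if target is None:
            continue  # fields the definition does not bind are skipped
        table, column = target
        rows[table][column] = value
    return dict(rows)


rows = map_submission({
    "mother_name": "Mother A",
    "edd": "2015-12-01",
    "child_name": "Child B",
    "note": "not bound to any table",
})
```

Here `rows` holds one dictionary per table, which a persistence layer could then write to the matching SQLite table or CouchDB document.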

maimoonak commented 9 years ago

@mberg @sohelsarder @ndisha @alihabib @raihan-mpower @dimasciput @djazayeri

**Although the story below looks uselessly long and boring, please read it thoroughly and provide your valuable feedback, because it relates to the core and base of OpenSRP.**

The expected syncing requirements appear to call for massive refactoring and a complete change of process and data flow on the client side; client developers can comment on this better. Since the changes are massive, they cannot be done immediately and would need a smooth transition that does not affect the currently running app (or, in the worst case, has minimal impact). Below is my analysis (concerned with server-side work) and suggestions on how to make the transition.

1 - Current Flow:

All data is submitted and synced as FormSubmission on server/mobile-client that is parsed into custom register entities via Ziggy.

Problem: The current server register module does not enroll the beneficiary and its details into a central master entity (with a static, standard, predefined structure) that can be verified and queried from any internal or external app without detailed knowledge of the custom entities, the register architecture and the form submission field structure.

Solution:

2 - Current Flow:

Every device-to-server sync downloads from the server all FormSubmissions for that HealthWorker that do not yet exist on the device. If this is the first login or a new device, it leads to an unbounded download that includes data that may never be needed on the mobile-client.

Problem: There are three different potential problems in this approach:

1- Bulk data download from the start of the program to date, with every minute detail

2- On download, the data goes through form file loading and reading, JSON parsing, table creation and insertion into multiple tables, i.e. avoidable heavy I/O operations. The approach is inherent to the app, and trying to change the process would lead to drastic changes in mobile-client code

3- HealthWorker has access to his/her beneficiaries only

Solution: 1- As @mberg suggested, the approach would change so that in future the mobile-client syncs data directly with the server (SQLite to Couch). This would be supported by the change suggested in point 1, where all data is held as Client, Event and Obs (a standard, central model that is unlikely to change). This would allow the mobile-client to send a filtered request for data download. The filter could be

2- The change would lead to massive refactoring on the mobile-client, and @raihan-mpower and @dimasciput can better comment on it, but from the server side all the above filters would be available and the mobile-client can then go through this transition smoothly. New FormSubmissions on the mobile-client would submit data to the server as they do right now. The server would create Clients and Events out of these, and any other fresh device can download data as filtered Clients and Events or even as FormSubmissions.

3- The problem would be solved by using the approach above as HealthWorker would be able to download data of his/her own choice using different filters.

This would also allow external apps to sync and query data by following a standard, simple and unlikely to change Documentation.
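The filtered download discussed above can be sketched roughly as follows. This is an illustration in Python, not OpenSRP code: the `Event` fields `location_id` and `provider_id` and the function `filter_events` are assumed names, and a real server would run the filter as a database query rather than over an in-memory list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """Simplified stand-in for a server-side Event; field names are assumptions."""
    event_id: str
    client_id: str
    location_id: str
    provider_id: str  # the HealthWorker (anmID) who recorded the event


def filter_events(events: List[Event],
                  location_id: Optional[str] = None,
                  provider_id: Optional[str] = None) -> List[Event]:
    """Return the events that match every filter that was supplied.

    A filter left as None is ignored, so callers can filter by location,
    by provider, by both, or by neither (which returns everything).
    """
    result = []
    for event in events:
        if location_id is not None and event.location_id != location_id:
            continue
        if provider_id is not None and event.provider_id != provider_id:
            continue
        result.append(event)
    return result


EVENTS = [
    Event("e1", "c1", "village-a", "anm-1"),
    Event("e2", "c2", "village-a", "anm-2"),
    Event("e3", "c3", "village-b", "anm-1"),
]

by_location = filter_events(EVENTS, location_id="village-a")
by_both = filter_events(EVENTS, location_id="village-a", provider_id="anm-1")
```

Treating each filter as optional keeps a single entry point for download by locationID, by anmID, or by both.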

Suggested Change in point 2

The changes suggested in syncing are: direct syncing between databases (SQLite to Couch) with some master entities (Client, Event and Obs) in case of existing data sync or edited data sync. FormSubmissions would only be used to submit new forms.

Problem: 1- Massive change on mobile-client and server

2- Direct sync bypassing application logic

3- The sync still includes full syncing of Client and/or Event data

4- Inconsistency between FormSubmission and Client, Event and Obs in case of edits

Solution: 1- (@raihan-mpower and @dimasciput) any concerns or suggestions on mobile-client changes !?

2- Direct syncing of databases would lead to a lot of code wasted though. OpenMRS-connector would also need refactoring.

3- @mberg and team !!?

4- In case of data edits, these would be done and synced via Client, Event and Obs instead of FormSubmissions.

Concerns and answers for each point raised by the team:
    - @raihan-mpower and @dimasciput: the OpenSRP API has this flexibility, but the mobile-client team can suggest how to use it
    - It is part of the system right now and would remain as it is, i.e. repeated automatic data sync whenever the network is available. The client app can allow the user to filter and configure this from settings/preferences
    - Direct sync would also be configurable for sure (@mberg)
    - see points discussed above.
    - points discussed above solve the problem.
    - "similar to syncing a cohort" needs more explanation
    - Client and Event have audit data, and whenever a record is updated it could be sent to the mobile-clients dealing with those beneficiaries (by location, HealthWorker or anything else). Example: on a sync request from the mobile-client, the system finds the Clients/Events applicable to that HealthWorker that were updated between the last sync and now, and sends them to the calling device.
    - Important things to consider
        - How to edit data on the mobile-client (right now it is a FormSubmission)
        - Sometimes editing should be another FormSubmission, e.g. closing an ANC case is an event in the life of the beneficiary. It should not be handled via an edit.
        - A person's marital status should never be edited directly (unless the data entry was wrong). It is an important event in the life of a person that may require some other details and modifications, e.g. the woman would be moved out of a household.
    - see points above
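The audit-based example in the list above (sending only what changed since the last sync) could look roughly like this. It is a sketch under assumptions: `Record`, its `date_edited` field, and both function names are hypothetical, and the timestamp is modelled as a plain integer.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """Simplified Client/Event record; `date_edited` stands in for audit data."""
    record_id: str
    provider_id: str
    date_edited: int  # e.g. epoch milliseconds of the last update


def changes_since(records: List[Record], provider_id: str, last_sync: int) -> List[Record]:
    """Return this provider's records edited after `last_sync`, oldest first."""
    changed = [r for r in records
               if r.provider_id == provider_id and r.date_edited > last_sync]
    return sorted(changed, key=lambda r: r.date_edited)


def next_sync_marker(changed: List[Record], last_sync: int) -> int:
    """The device stores this value and sends it with its next sync request."""
    return max((r.date_edited for r in changed), default=last_sync)


RECORDS = [
    Record("c1", "anm-1", 100),
    Record("c2", "anm-1", 250),
    Record("c3", "anm-2", 300),
    Record("c4", "anm-1", 400),
]

delta = changes_since(RECORDS, "anm-1", last_sync=200)
marker = next_sync_marker(delta, last_sync=200)
```

A second request that sends `marker` back as `last_sync` returns nothing until another record is edited, which is what keeps repeated syncs small.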

-use ziggy for entity mapping relationship

    - data saving is embedded into the script, and changing it would need refactoring of the Ziggy JavaScript file (although it is a mix of Java and JavaScript code).

-we can explore merging the mother, child and ec tables (they have similar details)

    - Client and Event n Obs solves the problem.

-we are exploring synching of databases

    - see point 2 and its concerns to take into account

-this will lead to changing the ziggy architecture on client for the syncing part

    - if a team is working on syncing, please share the proposed plan with us; we might be able to suggest or share our concerns better
    - I would suggest using Client, Event and Obs and the available audit data to manage this. If there is anything else the team has in mind, please share it with us. I would also recommend sticking to a single standard model for everything rather than using different types of models for different purposes, since that leaves room for inconsistencies in the data.
    - Does Client, Event and Obs fit it?
    - addressed in points 1 and 2
    - Ziggy does not sync with the server. It parses the FormSubmissions received from the server into custom entities like ec, mother and child, and saves both the FormSubmissions and the custom entities into SQLite. This is done by Ziggy but controlled by Java classes (the Ziggy service is passive).
    - On the server it does not load and save FormSubmissions; rather, it gets the FormSubmissions to be processed (controlled by an active Java service), parses them into entities and then saves those in their corresponding tables.
    - If data sync is done SQLite-to-Couch then we would get Ziggy removed from the server, so right now the priority is to refactor and minimize its role on the server rather than replicating the functionality.
    - I do not feel that replicating scheduling data to OpenMRS is a good idea, because scheduling generates daily alerts for the HealthWorker and its state is maintained in two or three different entities (Action, scheduleTrackingApi internal domain model). We can think of pushing data to the server when the state becomes Completed/Defaulted; otherwise there would be back-and-forth updates on both sides. But it still depends on what reports we would ever need from the scheduling API. Please do not discuss this point in this thread. Copy and paste it into a different ticket and provide your comments/feedback there
    - Multimedia data transmission: Is anyone working on this from your side?
    - Easy integration of other external systems with OpenSRP: Please list in detail the current problems, your requirements and suggestions in this regard in a separate ticket
    - For data syncing, provide your feedback, suggestions and concerns on the approach in points 1 and 2

koros commented 9 years ago

@mberg @sohelsarder @ndisha @alihabib @raihan-mpower @dimasciput @djazayeri @maimoonak We can put this feature under R&D. As pointed out, there are some considerations and challenges to be addressed; nonetheless, I believe we should continue exploring this issue as well as the other suggested options, and we can always refer back to it in future if need be. Other obstacles are also expected to emerge down the line. One example is the initial bootstrapping of the data: if we are doing table-to-table syncing, the only table that exists on the server side at the moment is the FormSubmissions table. However, if in the end the pros outweigh the cons, we can see how to resolve the challenges, e.g. we could use a batch job to process and submit the synced data.
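The batch-job idea for bootstrapping could be sketched along these lines. This is only an illustration in Python: the submission keys (`entity_id`, `instance_id`, `form_name`, `fields`), the shape of the resulting Client and Event dictionaries, and the function names are assumptions, and the in-memory dictionaries stand in for whatever store the real job would write to.

```python
from typing import Dict, Iterator, List, Tuple


def batches(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_client_and_event(submission: dict) -> Tuple[dict, dict]:
    """Derive one Client and one Event from a single form submission."""
    client = {"client_id": submission["entity_id"]}
    event = {
        "event_id": submission["instance_id"],
        "client_id": submission["entity_id"],
        "event_type": submission["form_name"],
        "obs": dict(submission["fields"]),
    }
    return client, event


def bootstrap(submissions: List[dict], batch_size: int = 2
              ) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Process existing submissions batch by batch into Client/Event maps."""
    clients: Dict[str, dict] = {}
    events: Dict[str, dict] = {}
    for batch in batches(submissions, batch_size):
        for submission in batch:
            client, event = to_client_and_event(submission)
            clients.setdefault(client["client_id"], client)
            # Keyed by id, so processing the same submission twice is harmless.
            events[event["event_id"]] = event
    return clients, events


SUBMISSIONS = [
    {"entity_id": "c1", "instance_id": "i1", "form_name": "registration", "fields": {"name": "Mother A"}},
    {"entity_id": "c1", "instance_id": "i2", "form_name": "anc_visit", "fields": {"visit": "1"}},
    {"entity_id": "c2", "instance_id": "i3", "form_name": "registration", "fields": {"name": "Mother B"}},
]

clients, events = bootstrap(SUBMISSIONS)
```

Keying the results by id means the job can be re-run over the same FormSubmissions without creating duplicates, which matters if a batch fails halfway.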

ndisha commented 9 years ago

@maimoonak, thanks for this. @raihan-mpower @koros @dimasciput I have taken point 2 and added my comments on form submission data optimization on the OpenSRP client side: https://github.com/OpenSRP/opensrp-client/issues/8

julkarnain commented 9 years ago

@maimoonak @ndisha @sohelsarder I would like to add my understanding of Ziggy to Maimoona's explanation. Ziggy converts the raw submission data into a simpler, split format depending on the bind type. It iterates over the form submission data recursively and handles every bind type, for both forms and sub-forms.

The proposed structure, which is used for mapping data to OpenMRS, follows the OpenMRS integration concept of a relationally mapped database. But Maimoona, this database concept is suited to SQL-based database systems. We are using a NoSQL database, so it is not as helpful and the workload will not be reduced. Both posting and processing submission data into BaseEntity, Client and Event, and reading data back from these documents, will also enlarge the codebase.

The first time, Ziggy saves the data to the HH, ELCO, Mother and Child documents. When an HH, ELCO, Mother, Child etc. receives services through follow-up forms, the service data (encounters) is added to the related document (HH, ELCO, Mother etc.). For a Mother, for example, ANC1, ANC2, ANC3, ANC4, tt1, tt2 etc. can be updated. For an EC, fp strategies data can be updated. For a Child, immunization data can be updated. So when we need information from any particular registry, we can get the whole history of the beneficiary in a single query. This is the advantage of a NoSQL database system: you can get a large amount of data without relationship mappings.

With the proposed model we would have to execute at least three, four, five or more queries against BaseEntity, Client, Event etc., depending on the data expected. If it were a relational database system this model would be very helpful, because we could get the data using a join query.
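The two access patterns being compared can be sketched as follows. This is a toy illustration in Python using in-memory dictionaries as stand-ins for documents; all names and document shapes are hypothetical, and the number of queries needed in a real CouchDB deployment would depend on how its views are designed.

```python
# Embedded style: one Mother document carries its own encounter history.
MOTHER_DOCS = {
    "m1": {"name": "Mother A", "encounters": [{"type": "ANC1"}, {"type": "ANC2"}]},
}

# Separate style: Client and Event documents are stored independently.
CLIENT_DOCS = {"m1": {"name": "Mother A"}}
EVENT_DOCS = [
    {"client_id": "m1", "type": "ANC1"},
    {"client_id": "m1", "type": "ANC2"},
    {"client_id": "m2", "type": "ANC1"},
]


def history_embedded(client_id):
    """One lookup returns the beneficiary together with its history."""
    doc = MOTHER_DOCS[client_id]
    return doc["name"], [e["type"] for e in doc["encounters"]]


def history_separate(client_id):
    """Two lookups: the client, then the events that reference it."""
    client = CLIENT_DOCS[client_id]
    events = [e for e in EVENT_DOCS if e["client_id"] == client_id]
    return client["name"], [e["type"] for e in events]


embedded = history_embedded("m1")
separate = history_separate("m1")
```

Both functions return the same history; the difference is that the separate model needs an extra lookup per related document type, while the embedded model has to rewrite the whole document on every new encounter.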

So I am not saying you should drop your concept, but my suggestion is that you also do not remove Ziggy: you can apply your concept alongside it. As for where to start, I think you can start within the route method where the handlers are called.

If you think the Ziggy library cannot fulfil your requirements, then I can modify Ziggy, because I have experience with NodeJS and AngularJS based applications.

koros commented 9 years ago

@mberg @sohelsarder @ndisha @alihabib @raihan-mpower @dimasciput @djazayeri @maimoonak @cagulas I have attached the initial document on this issue. I haven't answered/addressed all the issues raised so far, but I intend to do that soon: https://docs.google.com/document/d/1bBx51sLO5vG7W8NKfDXvGvlqyMiXBXbNkxEMENIjues/edit?usp=sharing

koros commented 9 years ago

@mberg @sohelsarder @ndisha @alihabib @raihan-mpower @dimasciput @djazayeri @maimoonak @cagulas Sorry for taking so long to respond to the issues raised above. I have added a few more comments to the shared document, and I'll continue refining it as I complete the opensrp-cloudant sync proof of concept, which will demonstrate how we can incorporate this technology into OpenSRP.