amaster507 / ifbmt

Independent Fundamental Baptist Missionary Trails
GNU General Public License v3.0
5 stars 0 forks source link

Technical: Database Design #29

Closed amaster507 closed 6 years ago

amaster507 commented 7 years ago

This is more of a technical programming question than most of the others up to this point.

In #11 it was decided to support recurring events. This support for recurrence should follow standard protocol defined in RFC5545 3.8.5. What needs to be discussed here is the best method to store and retrieve recurrance information in regards to ifbmt and its database. This needs to be normalized to be quick for the user and not create excessive work for the server slowing down other users.

There are two parts to recurring events, these being 1) The saved data and 2) The logic. The logic is the application's code taking the data and actually plotting the recurrence events.

Now here is the problem: "How to save the data?" How the data is saved will determine the need for complexity of the logic. There are two main thoughts of how recurring data should be saved:

  1. EVERY OCCURANCE: Save every occurrence as a separate row in the database

    Pros: Easy to query and display on a calendar. Easy to modify or cancel a specific occurrence of an event. Cons: Large storage space is required. There has to be some form of limitation on never ending events (usually 2038). Creation of new events is really expensive on the server and updating the event in bulk is very limited. Exporting the recurring event to ICS format does not follow standard protocol.

  2. RULES: Save each event a single time and then save the rules for recurrence and exceptions to the rule.

    Pros: Easy to add exception rules to cancel or modify single or future occurrences. Creating a reoccurring event does not put any over bearing load on the server. Cons: Hard to query and display busy and available dates. Displaying events is really expensive load on the server. A calendar with many recurring events would take a very long time to load.

My Idea: Mix both thoughts into a unique integration.

Pros: Easy to update occurrence rules and delete all future occurrences that were needed and recreate occurences as needed. Calendar loads after first request for the date range will be speedily available. Expense on the server is balanced out between event creation and initial event display. Exported events follow iCalendar protocol. Cons: More logic is needed.

benkennedy13 commented 7 years ago

Anthony - real quick question on this application. From what I gather, this is a subscription based service. From a cursory view of the requirements definition and the various brainstorming sections, it appears that there will be shared data, available to all users throughout the environment and then there is data that will be utilized and private to each user. Is this basic assumption correct? If so, are you planning a singular database instance to manage all shared data and private data? The reason for these high level questions is that it appears the technical issue you are planning to overcome is one of performance. If that is the case, I need a high level view of the data structure you are planning.

amaster507 commented 7 years ago

@benkennedy13 welcome to ifbmt!

You should be able to find my thoughts for the data structure so far in the Database Design project.

Yes, Having all of the information in a singular database seems feasible to me. On data that can be shared globally public I think the best way to handle this is with a public boolean column. And every row of data will be linked to the user who created it whether public or private so in that regards then you can easily get all public and private data for an individual by making one call to a singular database. For instance, you could get all of the notes for a singular church visible to a user with: (exact table structure may differ)

SELECT * FROM notes WHERE (link_table='churches' AND link_id=%n) AND (public =1 OR creator_user_id=%n)

A MySQL PHP class will be created to make the actual calls to the database to help ease the load of writing some complex AND/OR statements.

If you have a better idea, I would be happy to listen and make changes.

benkennedy13 commented 7 years ago

Let me read through your database design document Anthony, but you could handle any of your performance issues that you think you will encounter if you change your model. I would personally separate public shared data from individual missionary data and this would solve one problem that you are aware of (future performance issues; as evident in your thought process in this thread), plus basic security issues. I cannot pretend to understand your vision, as yet, as I just started reading through your documentation; which is amazing BTW. It appears to me that the best data structure design would include 2 schemas. One for the shared data that is available to all of missionaries and a second schema dedicated to the data that each missionary creates and maintains. When a new user signs up for the application, the software creates a dedicated data structure to contain that missionary's data. This reduces the query CPU cycles and memory exponentially. Yes this would take more business logic in the account creation script but visualize the performance trade-off. My guess is that you are going to build the application on some VPSs; right? Databases are memory hungry, not as much so CPU. Adding memory to a VPS as needed is cake walk whereas adding CPU capability can open the door to server instability. This model would remove your concerns on performance and make scale-up easy cheesy.

benkennedy13 commented 7 years ago

Oh I almost forgot - you know me; I'm a security person. This also limits the exposure of damage that may or may not occur if a compromise occurs. Each missionary's data is separate from other missionary's data. If a missionary's device is compromised then only the shared data and that missionary's data is potentially vulnerable.

amaster507 commented 7 years ago

This makes sense. Would you put users in a database of their own and use it only for authentication and link to their personal database? Or would users be in the global database? I'm thinking option 1 would be better...

benkennedy13 commented 7 years ago

Either way is fine, but i think I agree with you on option #1. It would probably make it easier from a programming view point, plus you could then run that user database under a separate account and put a ton of security on that particular account. This increases the security context even more.

amaster507 commented 7 years ago

I will rework my database design idea this week to account for these changes.

amaster507 commented 7 years ago

Instead of opening additional issues about database design, I renamed this issue.

@benkennedy13

If databases are separate for private contacts and public contacts then do you have any idea how to share contacts without making them totally public and without duplicating contacts. This might not be a feasible task without breaking the second point and duplicating data.

Here is some information about application logic that plays into this question.

Wiki: #Benefits

personal contacts... can be saved privately and shared with others globally or individually

https://github.com/amaster507/ifbmt/issues/26#issue-256345193

  • Shared contacts will sync updated information
  • But will not share private notes
  • Public notes on personal contacts will be public only for the users that the contact was shared with
  • Only the owner of a shared private contact can switch the contact to a public contact. A received shared contact cannot be switched to a public contact.

https://github.com/amaster507/ifbmt/issues/26#issuecomment-328252085

when adding a new [contact]. [The application will prompt] suggested public contacts for... existing contacts [that may match].

Possible solution: Add another database for shared contacts with the contact tables and a relationship table. When a private contact is shared, it is then transferred to the shared contacts table. This table is less secure, but it is already shared with at least another user which could be less secure on their device already, so this openness may not be that terrible. And when opening a contact the public contacts database is cross referenced for matches by name and contact information and recommend matches. Then there would be an option to merge private and public contact for the user which they may dismiss forever (by storing dismissed duplicates as an array in contact_meta). When private contacts are merged with public contacts there would need to be a process to add additional contact information if not already public and discard duplicated data.

amaster507 commented 7 years ago

OK, so this is becoming more complicated as I think about it more. Events can be public events but still belong to a private calendar. Joining calls between databases is not something I have done before if it is even possible. The best way I know how to do that would be to perform a call to the private database for the private calendars and join the private events and then perform a second call to the public database and pull and merge the remaining event data into the data array.

@benkennedy13 Is this how you would recommend the structure to stay somewhat normalized and optimized for performance?

amaster507 commented 7 years ago

OK, so I just looked it up and it is possible to join between tables in separate databases as long as the MySQL user has proper access rights to both databases.

benkennedy13 commented 7 years ago

Sorry Anthony, I have had to travel a lot in these past few days. It sounds like you have a solution. I am still not clear on all of the data that you will be managing, but you may want to reread the MySQL manual on user roles. It may have some data that could help you. Do you have a basic schema done that I could look at? I could help more if I understood the dataset.

amaster507 commented 7 years ago

To update, I think I have support to drop sharing private contacts which will take out most of the complications. #30. I will work on the basic schema and upload that as soon as I have it ready.

amaster507 commented 6 years ago

@benkennedy13

Private Database(s)

ifbmt_private_

Public Database

ifbmt_public

User Database

ifbmt_users

Please find the database sql code at https://github.com/amaster507/ifbmt/blob/master/databases.sql

amaster507 commented 6 years ago

This has been slightly outdated with changes needed during development, but this remains the same as the core data structure. I will return and update this after more of service is completed.