atsign-foundation / at_server

The software implementation of Atsign's core technology
https://docs.atsign.com
BSD 3-Clause "New" or "Revised" License

Propose Solution for Efficient Sync with Many Delete Entries #2066

Open VJag opened 4 weeks ago

VJag commented 4 weeks ago

Is your feature request related to a problem? Please describe.

The current atServer synchronization process is inefficient, particularly when handling large numbers of deletions, which impacts the performance and scalability of the system. The objective of this ticket is to propose a scalable, Hive-independent solution that improves synchronization efficiency in scenarios involving numerous delete entries.

Current Design Overview:

CRUD Operations:

- Data Storage: atServer stores data as key-value pairs.
- Key Management: Keys can be created, deleted, or automatically expired via the ttl (time to live) parameter.
- Expired Key Cleanup: A cron job deletes expired keys.
- Key Storage: All keys are stored in a Hive box named KeyStore.
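The key-value/TTL behavior described above can be sketched as follows. This is an illustrative Python model, not the actual Dart implementation; the class and method names are stand-ins for the real Hive-backed KeyStore:

```python
import time

class KeyStore:
    """Toy key-value store with TTL-based expiry, mimicking the behavior
    described above (the real atServer stores keys in a Hive box)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def put(self, key, value, ttl_ms=None):
        expires_at = time.time() + ttl_ms / 1000 if ttl_ms else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            return None  # expired, but not yet removed by the cron job
        return value

    def delete_expired(self):
        """What the cleanup cron job does: remove keys whose TTL elapsed."""
        now = time.time()
        expired = [k for k, (_, exp) in self._data.items()
                   if exp is not None and now >= exp]
        for k in expired:
            del self._data[k]
        return expired
```

Note that every key removed by `delete_expired` would also produce a delete entry in the commit log, which is the root of the sync inefficiency this ticket addresses.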

Commit Log:

- Operation Logging: Key creations and updates are logged in a Hive box called CommitLog with an auto-generated sequence number.
- Recording Changes: Each operation is recorded under a new sequence number.
- Single Entry per Key: The CommitLog maintains one entry per unique key.

In-Memory Compact CommitLog:

- In-Memory Representation: atServer keeps an in-memory map of the CommitLog to optimize synchronization.
- Sync Efficiency: This map supports efficient synchronization operations.
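A minimal sketch of such a compact commit-log map, keeping one entry per unique key (illustrative Python; the real CommitLog is a Hive box in Dart, and these names are assumptions):

```python
import itertools

class CompactCommitLog:
    """Keeps only the latest entry per unique key, as described above."""

    def __init__(self):
        self._seq = itertools.count(1)   # auto-generated sequence numbers
        self._latest = {}                # key -> (commit_id, operation)

    def log(self, key, operation):
        commit_id = next(self._seq)
        self._latest[key] = (commit_id, operation)  # overwrites older entry
        return commit_id

    def entries_after(self, commit_id):
        """Entries a client at the given commit id still needs to sync."""
        return sorted(
            (cid, key, op)
            for key, (cid, op) in self._latest.items()
            if cid > commit_id
        )
```

Because only the latest entry per key survives, a client that is far behind still syncs at most one entry per key rather than the full operation history.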

Sync Process:

- Client Connections: Multiple clients can be connected to an atServer.
- Data Synchronization: Clients sync data with the atServer, which assigns a commit ID; clients record this ID locally.
- Sync Status: A data item with a server commit ID is considered synced.
- Managing Sync Differences: If a client's commit ID is lower than the server's latest ID, it must catch up before pushing new data.
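The pull-then-push round described above can be sketched as follows (illustrative Python, not the actual Dart implementation; the dict shapes are assumptions made for the sketch):

```python
def sync(local, server):
    """One round of the sync protocol described above.

    `local` and `server` are plain dicts standing in for real state:
      local  = {"commit_id": int, "data": {}, "outbox": [(key, value), ...]}
      server = {"commit_id": int, "log": [(commit_id, key, value), ...]}
    """
    # 1. Pull: apply server entries newer than the client's commit ID.
    #    The client must catch up before it is allowed to push.
    for commit_id, key, value in server["log"]:
        if commit_id > local["commit_id"]:
            local["data"][key] = value
            local["commit_id"] = commit_id
    # 2. Push: the server assigns a fresh commit ID to each pushed entry;
    #    the client records it, marking the item as synced.
    for key, value in local["outbox"]:
        server["commit_id"] += 1
        server["log"].append((server["commit_id"], key, value))
        local["data"][key] = value
        local["commit_id"] = server["commit_id"]
    local["outbox"].clear()
```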

Current Design Issues:

Inefficient Sync with Many Deletions: Each deletion (including deletions of expired keys by the cleanup cron job) produces a CommitLog entry that every client must sync. With large numbers of deletions, clients spend time and storage syncing entries for keys that no longer exist, which slows synchronization significantly.

Describe the solution you'd like

Propose a Solution for Efficient Sync with Many Delete Entries: Scalable and Hive-Agnostic

Describe alternatives you've considered

No response

Additional context

No response

VJag commented 3 weeks ago

The ultimate goal of any optimizations to the sync process is to enable a client to create data and make it available to the server in the shortest possible time, while also reducing both time and storage requirements by syncing fewer entries. Clients may have varying synchronization requirements, which can be addressed through the following strategies:

1. Clients Concerned Only with New State

Example: An SSHNP client, which is only concerned with new, future SSH sessions and the keys associated with them.

Behavior: Such a client doesn’t need to be aware of any previous state. It should have the flexibility to either operate independently of the server's current state or sync with the server without retrieving any existing keys.

Benefit: This approach allows the client to create data that is immediately available to the server without the overhead of syncing old or irrelevant data.

2. Initial Sync vs. Delta Sync

Initial Sync: During an initial sync, the server can skip sending deleted entries to the client, allowing the client to synchronize with the server quickly.

Delta Sync: After a client has performed an initial sync, it can use delta syncs to receive only changes (additions, updates, deletions) since the last sync.

Sync Type Flag: Introducing a syncType: initial|delta flag in the sync operation would enable the server to optimize syncs by not sending deleted entries during the initial sync. This would help the client get in sync faster, allowing it to start creating data that is immediately available to the server.

Edge Case: If the last entry in the sync process is a deletion, the client might fall out of sync by one entry. This scenario should be managed gracefully to ensure synchronization integrity.
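One way the server could honor a hypothetical syncType flag while avoiding the trailing-delete edge case is to report its latest commit ID alongside the filtered entries, so the client can advance past skipped deletes. A sketch under those assumptions:

```python
def entries_for_sync(log, since_commit_id, sync_type):
    """Return (entries, latest_commit_id) for a sync request.

    log: ascending list of (commit_id, key, op), op in {"update", "delete"}.
    During an initial sync, delete entries are skipped: a fresh client has
    nothing to delete. Returning latest_commit_id separately lets the
    client advance its commit ID even when the final entry was a delete
    (the edge case above), so it does not fall one entry behind.
    """
    newer = [e for e in log if e[0] > since_commit_id]
    if sync_type == "initial":
        newer = [e for e in newer if e[2] != "delete"]
    latest = log[-1][0] if log else since_commit_id
    return newer, latest
```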

3. Skipping Expired Keys During the Sync

Flag: A skipSyncEntriesOfExpiredKeys: true|false flag would instruct the server not to send sync entries for expired keys during the synchronization process. This is useful when the client can handle the deletion of expired keys locally.

Edge Case: As with deletions, if the last entry to sync is a deletion of an expired key, the client may become out of sync by one entry. Handling this scenario is crucial to maintaining consistent synchronization.
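The same latest-commit-ID technique could cover this flag as well. A sketch, assuming each entry carries a marker distinguishing TTL-driven deletes from client-initiated ones (the marker and the flag name are assumptions):

```python
def filter_expired_key_entries(entries, latest_commit_id, skip_expired):
    """Apply a hypothetical skipSyncEntriesOfExpiredKeys flag.

    entries: ascending list of (commit_id, key, op, expired), where
    `expired` marks deletes produced by TTL cleanup rather than by a
    client. Returning latest_commit_id alongside the filtered list lets
    the client advance its commit ID even when the final entry was an
    expired-key delete (the edge case above).
    """
    if not skip_expired:
        return entries, latest_commit_id
    kept = [e for e in entries if not (e[2] == "delete" and e[3])]
    return kept, latest_commit_id
```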

Technical Implementation

To implement these optimizations, the following capabilities are needed:

VJag commented 3 weeks ago

A client can have four types of synchronization requirements:

Always-Online Clients: These clients need to be continuously connected to the remote secondary for their operations. Their primary focus is on reading from and writing to the server directly, without relying on cached data. These clients do not require synchronization to access cached data as they do not depend on it.

Example: The SSHNoPorts code completely disables sync; put and get talk to the remote secondary directly.

1. Snippet from SSHNoPorts code where we are creating an atClient with NoOp sync service

```dart
atClientGenerator: (SshnpdParams p) => createAtClientCli(
  atsign: p.deviceAtsign,
  atKeysFilePath: p.atKeysFilePath,
  rootDomain: p.rootDomain,
  storagePath: p.storagePath,
  namespace: DefaultArgs.namespace,
  atServiceFactory: ServiceFactoryWithNoOpSyncService(),
),
```

2. Get and Put request options to bypass local secondary and write to remote secondary directly

```dart
/// Parameters that application code can optionally provide when calling
/// AtClient.get
class GetRequestOptions extends RequestOptions {
  /// Whether the get request should bypass this atSign's cache of data owned
  /// by another atSign
  bool bypassCache = false;
}
```

```dart
/// Parameters that application code can optionally provide when calling
/// AtClient.put
class PutRequestOptions extends RequestOptions {
  /// Whether to set the sharedKeyEnc and pubKeyCS properties on the
  /// Metadata for this put request
  bool storeSharedKeyEncryptedMetadata = true;

  /// Whether to send this update request directly to the remote atServer
  bool useRemoteAtServer = false;
}
```

3. put with request options

```dart
await atClient.put(key, params.toJson(), putRequestOptions: options);
```

Push-Only Clients: These clients are only concerned with sending new data to the server and do not care about any previous state. If they go offline, they can queue requests and push them to the server when they reconnect.
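The queue-while-offline pattern for push-only clients can be sketched as follows (illustrative Python; the class name is hypothetical and a plain dict stands in for the remote atServer):

```python
class PushOnlyClient:
    """Sketch of the push-only pattern above: queue writes while offline
    and flush them in order on reconnect."""

    def __init__(self, server):
        self.server = server  # stand-in for the remote atServer
        self.queue = []       # writes made while offline
        self.online = True

    def put(self, key, value):
        if self.online:
            self.server[key] = value   # write straight to the server
        else:
            self.queue.append((key, value))  # hold until reconnect

    def reconnect(self):
        """Push queued writes in order, then resume direct writes."""
        self.online = True
        for key, value in self.queue:
            self.server[key] = value
        self.queue.clear()
```

Because such a client never reads previous state, it needs no pull phase at all; ordering within its own queue is the only consistency it must preserve.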

Full Sync Clients: These clients need to fetch data from the server before performing any operations. Applications like Buzz and Wavi typically fall into this category. The default sync behavior in the atClient SDK is designed to cater to these clients.

Sync Requirement: These clients must complete synchronization before they can push any new data to the server.

Selective Sync Clients: These clients require only a specific subset of data from the server before creating new data and syncing it back. For example, if a client starts with no data and only needs keys key1, key2, and key5 from the server, it will sync those keys and then proceed to interact with the server using that limited dataset.
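Selective sync amounts to filtering the commit log by the subset of keys the client cares about. A minimal sketch under that assumption:

```python
def selective_sync_entries(log, wanted_keys):
    """Sketch of selective sync: from the full commit log, return only the
    latest entry for each key in the client's requested subset.

    log: ascending list of (commit_id, key, op).
    """
    latest = {}
    for commit_id, key, op in log:
        if key in wanted_keys:
            latest[key] = (commit_id, op)  # later entries win
    return sorted((cid, key, op) for key, (cid, op) in latest.items())
```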

When developing an application, it’s crucial to understand your client's sync requirements, as each type can significantly impact performance.

VJag commented 3 weeks ago

Actionable Next Steps Based on the Analysis:

1. Enable the feature to exclude commit log entries for expired key deletions on the server.
2. Introduce the sync type flag.
3. Analyze sync issues with atcolin.
4. Conduct an architectural discussion to evaluate the need for direct support in the SDK for:

VJag commented 3 weeks ago

Analyze Sync Issues with atcolin. @purnimavenkatasubbu can you start on this, please?