Implement Basic CommitManager

As part of the Staging process, the CommitManager:

[x] Keeps a set of shared references to Holons pending commit.
[x] Keeps a separate index based on holon-specific key strings
[x] Offers a “stage” function that allows holons to be added to this set
[x] Offers a "get_holon_by_key" function that searches the set for holons that match a supplied key
[x] Offers a clear_staged_objects() method that clears both the index and the staged_holons
[x] Offers a "commit" function that iterates over the set of staged holons, invoking commit() on each
[x] Passes its sweetest tests
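A minimal sketch of the checklist above, for illustration only. The `Holon` struct here is a hypothetical stand-in (a key plus a `committed` flag), its `commit()` is a placeholder for the real persistence logic, and the choice of `Rc<RefCell<_>>` for the shared references and a `HashMap` from key to position for the index are assumptions, not decisions recorded in this issue.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for the real Holon type.
#[derive(Debug, Default)]
struct Holon {
    key: Option<String>,
    committed: bool,
}

impl Holon {
    // Placeholder for the real persistence logic.
    fn commit(&mut self) -> Result<(), String> {
        self.committed = true;
        Ok(())
    }
}

#[derive(Default)]
struct CommitManager {
    // Shared references to holons pending commit.
    staged_holons: Vec<Rc<RefCell<Holon>>>,
    // Separate index: holon key -> position in staged_holons.
    index: HashMap<String, usize>,
}

impl CommitManager {
    // Adds a holon to the staged set and returns a shared reference to it.
    fn stage(&mut self, holon: Holon) -> Rc<RefCell<Holon>> {
        let key = holon.key.clone();
        let shared = Rc::new(RefCell::new(holon));
        self.staged_holons.push(Rc::clone(&shared));
        if let Some(key) = key {
            self.index.insert(key, self.staged_holons.len() - 1);
        }
        shared
    }

    // Looks up a staged holon through the key index.
    fn get_holon_by_key(&self, key: &str) -> Option<Rc<RefCell<Holon>>> {
        self.index.get(key).map(|&i| Rc::clone(&self.staged_holons[i]))
    }

    // Clears both the index and the staged holons.
    fn clear_staged_objects(&mut self) {
        self.staged_holons.clear();
        self.index.clear();
    }

    // Invokes commit() on each staged holon and collects the results.
    fn commit(&self) -> Vec<Result<(), String>> {
        self.staged_holons
            .iter()
            .map(|holon| holon.borrow_mut().commit())
            .collect()
    }
}

fn main() {
    let mut manager = CommitManager::default();
    let staged = manager.stage(Holon { key: Some("book-1".into()), committed: false });

    let found = manager.get_holon_by_key("book-1").expect("holon was staged");
    assert!(Rc::ptr_eq(&staged, &found));

    assert!(manager.commit().iter().all(|result| result.is_ok()));
    assert!(staged.borrow().committed);

    manager.clear_staged_objects();
    assert!(manager.get_holon_by_key("book-1").is_none());
}
```

Because `stage` hands back a clone of the same `Rc`, a caller can keep working with the holon while the manager holds it for commit.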

@dauphin3 -- Here is the issue we discussed. When you are ready to start working on this, go ahead and create a branch off of branch #36.

Hey @dauphin3 ... I decided to ask ChatGPT for suggestions on the dangling references on write issue we were discussing yesterday. Here is my question and the ChatGPT response:

Question:

Suppose in rust I have an Arc plus 3 weak references to an object X. I decide I need to modify X (i.e., create a new version of X). I can use a make_mut call to get a clone of X (call it X') and return a mutable reference to X'. But that leaves me with dangling references. What are some approaches for "moving" the references from X to X'?

ChatGPT Response:

In Rust, when you need to modify a shared object (referenced by an Arc and weak references), you typically can't directly modify it due to Rust's ownership rules. However, you can clone the object and then modify the clone. This leaves the original object intact but creates the need to update the references to point to the new object.

To handle this scenario, you can follow these general approaches:

- Atomic Swap: If you need to ensure atomicity and thread safety, you can atomically swap the old object for the new one. Note that the standard library's Arc has no swap method; this requires a wrapper such as the ArcSwap type from the third-party arc-swap crate, or a lock around the shared slot.
- Use Interior Mutability: If you control the object's internals, consider wrapping its fields in Mutex, RwLock, or RefCell, depending on your concurrency requirements. This allows you to mutate the object's state without changing the reference itself.
- Notify Weak References: Have a mechanism to notify weak references that the object has been replaced. One way is to implement a custom mechanism where weak references are notified when a new version of the object is created. This could involve a callback or some kind of observer pattern.
- Return a New Arc: Instead of modifying the existing Arc, create a new Arc for the modified object and update all references to point to the new Arc. This ensures that all references now point to the updated version of the object.

Here's a simple example illustrating the last approach:

use std::cell::RefCell;
use std::sync::{Arc, Weak};

#[derive(Debug)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Arc::new(RefCell::new(Object::new("original".to_string())));
    // The holders of these weak references are known, so they can be re-pointed later
    let mut weak_ref1: Weak<RefCell<Object>> = Arc::downgrade(&obj);
    let mut weak_ref2 = Arc::downgrade(&obj);
    let mut weak_ref3 = Arc::downgrade(&obj);

    // All three weak references initially resolve to the original object
    for weak in [&weak_ref1, &weak_ref2, &weak_ref3] {
        println!("{:?}", weak.upgrade().unwrap().borrow().data); // "original"
    }

    // Modify the object by creating a new version
    let new_obj = Arc::new(RefCell::new(Object::new("modified".to_string())));

    // Update each known reference to point to the new object
    weak_ref1 = Arc::downgrade(&new_obj);
    weak_ref2 = Arc::downgrade(&new_obj);
    weak_ref3 = Arc::downgrade(&new_obj);

    // Upgrading now yields the new object; obj itself still holds "original"
    let updated_ref1 = weak_ref1.upgrade().unwrap();
    let updated_ref2 = weak_ref2.upgrade().unwrap();
    let updated_ref3 = weak_ref3.upgrade().unwrap();

    println!("{:?}", updated_ref1.borrow().data); // "modified"
    println!("{:?}", updated_ref2.borrow().data); // "modified"
    println!("{:?}", updated_ref3.borrow().data); // "modified"
}

My Follow On Question

The example code provided works fine when the holders of the weak references are known. But what if those weak_references are established opaquely in other functions? Will Arc::swap work without having to know anything about current references?

ChatGPT Response

The standard library's Arc has no swap method, so a call to Arc::swap will not compile. (The third-party arc-swap crate provides an ArcSwap type with a swap operation, but every holder must then share the ArcSwap itself.) What does work without knowing anything about the current references is to keep the same Arc and replace the value inside it through interior mutability. Strong and weak references all point at the same allocation, so they all observe the new value. When multiple threads may hold references, use a RwLock (or Mutex) rather than a RefCell.

Here's how that looks with Arc and RwLock:

use std::sync::{Arc, RwLock};

#[derive(Debug)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Arc::new(RwLock::new(Object::new("original".to_string())));
    let weak_ref1 = Arc::downgrade(&obj);
    let weak_ref2 = Arc::downgrade(&obj);
    let weak_ref3 = Arc::downgrade(&obj);

    // Create the new version of the object
    let new_version = Object::new("modified".to_string());

    // Replace the value inside the existing allocation; no reference needs to be touched
    *obj.write().unwrap() = new_version;

    // The weak references still point to the same allocation and now see the new value
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();

    println!("{:?}", updated_weak_ref1.read().unwrap().data); // "modified"
    println!("{:?}", updated_weak_ref2.read().unwrap().data); // "modified"
    println!("{:?}", updated_weak_ref3.read().unwrap().data); // "modified"
}

@dauphin3 Also note... see the newly created Issue #38. You will need to implement that issue along with Issue #37.

Error Handling

The commit call on the CommitManager should return a Response struct containing an overall status of either Success or Error (Error if any individual commit fails). If there are any errors, the Response should also contain a vector of HolonError objects.
Each commit call for a staged holon can either succeed or return a HolonError:
- on Success
  - remove the committed entry from the CommitManager's staged list
- on Error
  - set the overall response status to Error
  - create a HolonError object (see below)
  - add the error object to the error vector
  - we need to decide what to do with the staged holon -- more on that later

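struct Response {
    status: RequestStatusCode, // overall status; the enum is defined below
    description: MapString,
    errors: Option<Vec<HolonError>>,
}

pub enum RequestStatusCode {
    Success,
    Error,
}

struct HolonError {
    holon_key: MapString,
    // The original draft typed this field as HolonError, but a struct cannot contain itself by value.
    // HolonErrorCode is a placeholder name for whatever error-code type is chosen.
    error_code: HolonErrorCode,
    description: MapString,
}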
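The success and error handling described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the agreed design: `StagedHolon`, its `valid` flag, `CommitError` (a simplified stand-in for HolonError), and `commit_all` are hypothetical names, plain `String` replaces MapString, and failed holons are simply left in the staged list because that question is still open.

```rust
#[derive(Debug, PartialEq)]
enum RequestStatusCode {
    Success,
    Error,
}

// Simplified stand-in for HolonError (hypothetical).
#[derive(Debug)]
struct CommitError {
    holon_key: String,
    description: String,
}

#[derive(Debug)]
struct Response {
    status: RequestStatusCode,
    description: String,
    errors: Option<Vec<CommitError>>,
}

// Hypothetical staged holon; `valid` simulates whether its commit succeeds.
struct StagedHolon {
    key: String,
    valid: bool,
}

impl StagedHolon {
    fn commit(&self) -> Result<(), String> {
        if self.valid {
            Ok(())
        } else {
            Err(format!("holon {} failed validation", self.key))
        }
    }
}

// Commits every staged holon, removing the ones that succeed and
// collecting an error entry for each one that fails.
fn commit_all(staged: &mut Vec<StagedHolon>) -> Response {
    let mut errors = Vec::new();
    // retain() keeps only the holons whose commit failed (assumption: they stay staged).
    staged.retain(|holon| match holon.commit() {
        Ok(()) => false,
        Err(description) => {
            errors.push(CommitError { holon_key: holon.key.clone(), description });
            true
        }
    });

    if errors.is_empty() {
        Response {
            status: RequestStatusCode::Success,
            description: "all staged holons committed".to_string(),
            errors: None,
        }
    } else {
        Response {
            status: RequestStatusCode::Error,
            description: format!("{} holon(s) failed to commit", errors.len()),
            errors: Some(errors),
        }
    }
}

fn main() {
    let mut staged = vec![
        StagedHolon { key: "a".to_string(), valid: true },
        StagedHolon { key: "b".to_string(), valid: false },
    ];
    let response = commit_all(&mut staged);
    assert_eq!(response.status, RequestStatusCode::Error);
    assert_eq!(staged.len(), 1); // only the failed holon remains staged
    println!("{response:?}");
}
```

Collecting every error before returning, rather than stopping at the first failure, matches the requirement that the Response carry a vector of errors.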

Simple Use of Arc and Rc

I continued the dialog with ChatGPT... see below:

Me: Can a similar effect be achieved in a single-threaded environment in rust?

In a single-threaded environment in Rust, you don't have to worry about the complexity of atomic operations because there's no concurrent access to shared data. Therefore, you can directly manipulate the references without the need for atomic swaps.

Here's a modification of the previous example to achieve the desired effect in a single-threaded environment:

use std::cell::RefCell;
use std::sync::Arc;

#[derive(Debug, Clone)] // Clone is needed to copy the new contents into the original below
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Arc::new(RefCell::new(Object::new("original".to_string())));
    let weak_ref1 = Arc::downgrade(&obj);
    let weak_ref2 = Arc::downgrade(&obj);
    let weak_ref3 = Arc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Arc::new(RefCell::new(Object::new("modified".to_string())));

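    // Replace the contents of the old object with those of the new one
    let old_obj = obj.clone(); // Clone the old Arc (a second handle to the same allocation)
    *obj.borrow_mut() = new_obj.borrow().clone(); // Update the contents of the original Arc
    assert!(Arc::ptr_eq(&old_obj, &obj)); // Same allocation, new contents

    // The weak references still point to the same allocation, which now holds the updated contents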
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();

    println!("{:?}", updated_weak_ref1.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref2.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref3.borrow().data); // "modified"
}

In this single-threaded environment, we simply clone the old Arc, update the contents of the original Arc with the new object's contents, and then all weak references continue to point to the updated object. Since there's no concurrent access, this approach is perfectly safe.

Me: Is Arc even needed in the single-threaded case?

You're correct. In a single-threaded environment, Arc isn't strictly necessary since there's no need for atomic reference counting. Regular Rc (reference counting) can be used instead, which is simpler and more lightweight.

Here's the modified example using Rc instead of Arc in the single-threaded case:

use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Clone)] // Clone is needed to copy the new contents into the original below
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Rc::new(RefCell::new(Object::new("original".to_string())));
    let weak_ref1 = Rc::downgrade(&obj);
    let weak_ref2 = Rc::downgrade(&obj);
    let weak_ref3 = Rc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Rc::new(RefCell::new(Object::new("modified".to_string())));

    // Replace the contents of the old object with those of the new one
    *obj.borrow_mut() = new_obj.borrow().clone();

    // The weak references still point to the same allocation, which now holds the updated contents
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();

    println!("{:?}", updated_weak_ref1.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref2.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref3.borrow().data); // "modified"
}

In this single-threaded scenario, Rc suffices for reference counting, and you can directly manipulate the references without worrying about thread safety.
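For comparison, here is a small sketch of the make_mut behavior that prompted the original question, using Rc (Arc::make_mut is documented to behave the same way). The helper name `modify_with_make_mut` and the use of a plain String as the shared value are illustrative assumptions. It shows the two cases: with another strong reference alive, make_mut clones and the weak reference keeps seeing the old value; with only weak references, the value is moved and the weak references are disassociated.

```rust
use std::rc::Rc;

// Modifies a shared String through Rc::make_mut and reports
// (value seen through the modified Rc, value seen through a pre-existing Weak, if any).
fn modify_with_make_mut(keep_other_owner: bool) -> (String, Option<String>) {
    let mut current = Rc::new(String::from("original"));
    let weak = Rc::downgrade(&current);
    // Optionally hold a second strong reference, as another owner would.
    let other_owner = if keep_other_owner { Some(Rc::clone(&current)) } else { None };

    // make_mut gives mutable access, cloning first if the value is shared.
    Rc::make_mut(&mut current).push_str(" v2");

    // Check what the weak reference resolves to after the modification.
    let seen_by_weak = weak.upgrade().map(|rc| (*rc).clone());
    drop(other_owner);
    ((*current).clone(), seen_by_weak)
}

fn main() {
    // Another strong reference exists: the weak reference still sees the old version.
    let (new_value, via_weak) = modify_with_make_mut(true);
    assert_eq!(new_value, "original v2");
    assert_eq!(via_weak.as_deref(), Some("original"));

    // Only weak references exist: they are disassociated and can no longer be upgraded.
    let (new_value, via_weak) = modify_with_make_mut(false);
    assert_eq!(new_value, "original v2");
    assert_eq!(via_weak, None);
}
```

In neither case does the weak reference follow the new version, which is why the in-place replacement through RefCell shown above is the simpler fit when holders of the references are not known.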

Test Strategy

Until we can leverage the Ephemeral Store, the CommitManager only lives for the lifetime of a single extern request -- so exposing its methods as an extern API doesn't make sense. This makes testing it through sweetest somewhat problematic.

The CommitManager is actually overkill for most common scenarios. The need for it arose in the context of loading the L0 Core Schema where multiple holons and relationships are being staged all at the same time. So I'm thinking that is the best context in which to test it.

I recommend the following strategy.

1) In the descriptors coordinator zome, create a descriptors_api.rs file that defines an externally visible load_core_schema_api function:

#[hdk_extern]
pub fn load_core_schema_api(_:()) -> ExternResult<Holon> {
    Ok(load_core_schema().0)
}

Notice that this function just delegates the call to the load_core_schema() function in loader.rs. We can progressively enrich the implementation of that function to create more and more of the schema until it loads the full schema.

2) We just need a thin test driver. Within crates/coordinator/descriptors/tests, create a schema_load_test.rs file to serve as the test driver. Basically, it just needs to make a zome call via the conductor to the load_core_schema_api function and verify that it returns a Holon result and not an error.

It is within the load_core_schema function that the Context and CommitManager need to be created.
