Closed evomimic closed 8 months ago
@dauphin3 -- Here is the issue we discussed. When you are ready to start working on this, go ahead and create a branch off of branch #36.
Hey @dauphin3 ... I decided to ask ChatGPT for suggestions on the dangling references on write issue we were discussing yesterday. Here is my question and the ChatGPT response:
Suppose in Rust I have an Arc plus 3 weak references to an object X. I decide I need to modify X (i.e., create a new version of X). I can use a make_mut call to get a clone of X (call it X') and return a mutable reference to X'. But that leaves me with dangling references. What are some approaches for "moving" the references from X to X'?
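For reference, here is a minimal sketch of what Arc::make_mut does to existing handles, using only the standard library. The helper names (weak_is_dead_after_make_mut, other_handles_keep_old_value) are made up for illustration. With one strong handle and only weak handles, make_mut moves the value instead of cloning it and disassociates the weak handles; with a second strong handle, it clones and the other handles keep seeing the old value.

```rust
use std::sync::Arc;

// Case 1: one strong handle plus a weak handle.
// make_mut does not clone here; the weak handle is disassociated.
fn weak_is_dead_after_make_mut() -> bool {
    let mut x = Arc::new(String::from("original"));
    let weak = Arc::downgrade(&x);
    Arc::make_mut(&mut x).push_str(" (edited)");
    weak.upgrade().is_none()
}

// Case 2: a second strong handle exists.
// make_mut clones; the other handles still see the old value.
fn other_handles_keep_old_value() -> (String, String, String) {
    let mut a = Arc::new(String::from("original"));
    let b = Arc::clone(&a);
    let weak = Arc::downgrade(&a);
    Arc::make_mut(&mut a).push_str(" (edited)");
    let via_weak = weak.upgrade().map(|s| (*s).clone()).unwrap_or_default();
    ((*a).clone(), (*b).clone(), via_weak)
}

fn main() {
    println!("weak dead: {}", weak_is_dead_after_make_mut());
    println!("{:?}", other_handles_keep_old_value());
}
```

Either way, no existing handle is moved to the new version automatically, which is the problem the question is about.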
In Rust, when you need to modify a shared object (referenced by an Arc and weak references), you typically can't directly modify it due to Rust's ownership rules. However, you can clone the object and then modify the clone. This leaves the original object intact but creates the need to update the references to point to the new object.
To handle this scenario, you can follow these general approaches:
1) Atomic Swap: If you need atomicity and thread safety, swap the old object for the new one atomically. Note that std::sync::Arc has no swap method of its own; this needs a swappable container, such as ArcSwap from the third-party arc-swap crate, or a lock wrapped around the value.
2) Use Interior Mutability: If you control the object's internals, consider wrapping its fields in Mutex, RwLock, or RefCell, depending on your concurrency requirements. This allows you to mutate the object's state without changing the reference itself.
3) Notify Weak References: Have a mechanism to notify holders of weak references that the object has been replaced, for example a callback or some kind of observer pattern that fires when a new version of the object is created.
4) Return a New Arc: Instead of modifying the existing Arc, create a new Arc for the modified object and update all references to point to the new Arc. This only works when every holder is known and can be re-pointed.
Here's a simple example illustrating the last approach:
use std::cell::RefCell;
use std::sync::{Arc, Weak};

#[derive(Debug)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Arc::new(RefCell::new(Object::new("original".to_string())));
    // The weak handles are `mut` so they can be re-pointed later
    let mut weak_ref1: Weak<RefCell<Object>> = Arc::downgrade(&obj);
    let mut weak_ref2: Weak<RefCell<Object>> = Arc::downgrade(&obj);
    let mut weak_ref3: Weak<RefCell<Object>> = Arc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Arc::new(RefCell::new(Object::new("modified".to_string())));

    // Update each known reference to point to the new object.
    // (Upgrading the old weak references would still yield "original".)
    weak_ref1 = Arc::downgrade(&new_obj);
    weak_ref2 = Arc::downgrade(&new_obj);
    weak_ref3 = Arc::downgrade(&new_obj);

    // Now all weak references point to the updated object
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();
    println!("{:?}", updated_weak_ref1.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref2.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref3.borrow().data); // "modified"
}
The example code provided works fine when the holders of the weak references are known. But what if those weak references are established opaquely in other functions? Will Arc::swap work without having to know anything about current references?
ChatGPT
std::sync::Arc does not actually provide a swap method, so Arc::swap cannot be called as such. The effect you want, updating the shared value without knowing who holds references to it, is achieved by keeping the Arc itself fixed and replacing the value inside it behind a lock such as RwLock (the third-party arc-swap crate offers ArcSwap as a lock-free alternative). Every existing Arc and Weak keeps pointing at the same allocation and sees the new value on its next read. This is useful in scenarios where multiple threads might be holding references simultaneously.
Here's how that looks using only the standard library:
use std::sync::{Arc, RwLock};

#[derive(Debug)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    // RwLock (not RefCell) so the value can be shared across threads
    let obj = Arc::new(RwLock::new(Object::new("original".to_string())));
    let weak_ref1 = Arc::downgrade(&obj);
    let weak_ref2 = Arc::downgrade(&obj);
    let weak_ref3 = Arc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Object::new("modified".to_string());

    // Replace the old version with the new one under the write lock;
    // the old version is handed back to the caller
    let old_obj = std::mem::replace(&mut *obj.write().unwrap(), new_obj);
    println!("{:?}", old_obj.data); // "original"

    // The weak references were never touched, yet they all see the update
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();
    println!("{:?}", updated_weak_ref1.read().unwrap().data); // "modified"
    println!("{:?}", updated_weak_ref2.read().unwrap().data); // "modified"
    println!("{:?}", updated_weak_ref3.read().unwrap().data); // "modified"
}
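The same idea with holders that live on other threads can be sketched with only the standard library. This is an illustration, not the project's code: it assumes an RwLock inside the Arc, and read_via is a hypothetical helper standing in for a holder that knows nothing except its own weak handle.

```rust
use std::sync::{Arc, RwLock, Weak};
use std::thread;

#[derive(Debug)]
struct Object {
    data: String,
}

// A holder that only has a weak handle; it neither knows nor cares
// whether the value behind it has been replaced.
fn read_via(weak: &Weak<RwLock<Object>>) -> Option<String> {
    let strong = weak.upgrade()?;
    let data = strong.read().unwrap().data.clone();
    Some(data)
}

fn main() {
    let obj = Arc::new(RwLock::new(Object { data: "original".to_string() }));
    // Weak handles that will be handed off to other threads
    let weaks: Vec<_> = (0..3).map(|_| Arc::downgrade(&obj)).collect();

    // The writer installs the new version without consulting any holder
    *obj.write().unwrap() = Object { data: "modified".to_string() };

    let handles: Vec<_> = weaks
        .into_iter()
        .map(|w| thread::spawn(move || read_via(&w)))
        .collect();
    for h in handles {
        println!("{:?}", h.join().unwrap()); // Some("modified")
    }
}
```

The trade-off is that readers take a lock on every access; whether that matters depends on how hot the read path is.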
@dauphin3 Also note... see the newly created Issue #38. You will need to implement that issue along with Issue #37.
The commit call on the CommitManager should return a Response struct containing an overall status of either Success or Error (if there were any errors). If there are any errors, the Response should also return a vector of HolonError objects.
Each commit call for a staged holon could be successful or return a HolonError.
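struct Response {
    status: RequestStatusCode,
    description: MapString,
    errors: Option<Vec<HolonError>>,
}

pub enum RequestStatusCode {
    Success,
    Error,
}

struct HolonError {
    holon_key: MapString,
    // NOTE: as written this field has the same type as the enclosing struct,
    // which will not compile; a separate error-code type is presumably intended.
    errorCode: HolonError,
    description: MapString,
}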
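A rough sketch of how the per-holon results could be folded into a single Response, under stated assumptions: MapString is replaced by a stand-in newtype over String, the error-code field is omitted because its type is not settled above, and summarize is a hypothetical helper name, not part of the CommitManager.

```rust
// Stand-in for the project's MapString type (assumption for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct MapString(String);

#[derive(Debug, PartialEq)]
enum RequestStatusCode {
    Success,
    Error,
}

#[derive(Debug)]
struct HolonError {
    holon_key: MapString,
    description: MapString,
}

#[derive(Debug)]
struct Response {
    status: RequestStatusCode,
    description: MapString,
    errors: Option<Vec<HolonError>>,
}

// Fold the result of each staged holon's commit into one Response:
// Success only if every commit succeeded, otherwise Error plus the failures.
fn summarize(results: Vec<Result<MapString, HolonError>>) -> Response {
    let errors: Vec<HolonError> = results.into_iter().filter_map(Result::err).collect();
    if errors.is_empty() {
        Response {
            status: RequestStatusCode::Success,
            description: MapString("all staged holons committed".to_string()),
            errors: None,
        }
    } else {
        Response {
            status: RequestStatusCode::Error,
            description: MapString(format!("{} commit(s) failed", errors.len())),
            errors: Some(errors),
        }
    }
}

fn main() {
    let results = vec![
        Ok(MapString("holon-a".to_string())),
        Err(HolonError {
            holon_key: MapString("holon-b".to_string()),
            description: MapString("validation failed".to_string()),
        }),
    ];
    println!("{:?}", summarize(results));
}
```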
I continued the dialog with ChatGPT... see below:
In a single-threaded environment in Rust, you don't have to worry about the complexity of atomic operations because there's no concurrent access to shared data. Therefore, you can directly manipulate the references without the need for atomic swaps.
Here's a modification of the previous example to achieve the desired effect in a single-threaded environment:
use std::cell::RefCell;
use std::sync::Arc;

// Clone is needed to copy the new object's contents into the old cell
#[derive(Debug, Clone)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Arc::new(RefCell::new(Object::new("original".to_string())));
    let weak_ref1 = Arc::downgrade(&obj);
    let weak_ref2 = Arc::downgrade(&obj);
    let weak_ref3 = Arc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Arc::new(RefCell::new(Object::new("modified".to_string())));

    // Overwrite the contents of the original cell with the new version
    *obj.borrow_mut() = new_obj.borrow().clone();

    // The weak references still point at the same allocation,
    // so they all observe the updated contents
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();
    println!("{:?}", updated_weak_ref1.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref2.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref3.borrow().data); // "modified"
}
In this single-threaded environment, we simply overwrite the contents of the original Arc with the new object's contents (which is why Object derives Clone). The weak references still point at the same allocation, so they all observe the updated value. Since there's no concurrent access, this approach is perfectly safe.
You're correct. In a single-threaded environment, Arc isn't strictly necessary since there's no need for atomic reference counting. Regular Rc (reference counting) can be used instead, which is simpler and more lightweight.
Here's the modified example using Rc instead of Arc in the single-threaded case:
use std::cell::RefCell;
use std::rc::Rc;

// Clone is needed to copy the new object's contents into the old cell
#[derive(Debug, Clone)]
struct Object {
    data: String,
}

impl Object {
    fn new(data: String) -> Self {
        Object { data }
    }
}

fn main() {
    let obj = Rc::new(RefCell::new(Object::new("original".to_string())));
    let weak_ref1 = Rc::downgrade(&obj);
    let weak_ref2 = Rc::downgrade(&obj);
    let weak_ref3 = Rc::downgrade(&obj);

    // Modify the object by creating a new version
    let new_obj = Rc::new(RefCell::new(Object::new("modified".to_string())));

    // Overwrite the contents of the original cell with the new version
    *obj.borrow_mut() = new_obj.borrow().clone();

    // The weak references still point at the same allocation,
    // so they all observe the updated contents
    let updated_weak_ref1 = weak_ref1.upgrade().unwrap();
    let updated_weak_ref2 = weak_ref2.upgrade().unwrap();
    let updated_weak_ref3 = weak_ref3.upgrade().unwrap();
    println!("{:?}", updated_weak_ref1.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref2.borrow().data); // "modified"
    println!("{:?}", updated_weak_ref3.borrow().data); // "modified"
}
In this single-threaded scenario, Rc suffices for reference counting, and you can directly manipulate the references without worrying about thread safety.
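If the old version X must stay intact for anyone still reading it, one possible variation is an extra level of indirection: holders keep a weak handle to a shared slot, and the slot holds an Rc to the current version. This is only a sketch under that assumption; Slot, WeakSlot, current, and publish are hypothetical names, not anything from the codebase.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Debug, Clone)]
struct Object {
    data: String,
}

// The slot always holds the current version; holders keep a Weak to the slot.
type Slot = Rc<RefCell<Rc<Object>>>;
type WeakSlot = Weak<RefCell<Rc<Object>>>;

// Look up whatever version is current right now.
fn current(weak: &WeakSlot) -> Option<Rc<Object>> {
    let slot = weak.upgrade()?;
    let latest = Rc::clone(&*slot.borrow());
    Some(latest)
}

// Install a new version and hand back the previous one.
fn publish(slot: &Slot, next: Object) -> Rc<Object> {
    slot.replace(Rc::new(next))
}

fn main() {
    let slot: Slot = Rc::new(RefCell::new(Rc::new(Object {
        data: "original".to_string(),
    })));
    let weak = Rc::downgrade(&slot);

    // A reader that pinned version X before the write
    let pinned = current(&weak).unwrap();

    // Create X' from X and publish it
    let mut next = (*pinned).clone();
    next.data = "modified".to_string();
    publish(&slot, next);

    println!("{}", pinned.data); // original
    println!("{}", current(&weak).unwrap().data); // modified
}
```

Readers that pinned X keep a consistent snapshot, while every later lookup through the slot yields X', without the writer knowing who the holders are.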
Until we can leverage the Ephemeral Store, the CommitManager only lives for the lifetime of a single extern request -- so exposing its methods as an extern API doesn't make sense. This makes testing it through sweettest somewhat problematic.
The CommitManager is actually overkill for most common scenarios. The need for it arose in the context of loading the L0 Core Schema where multiple holons and relationships are being staged all at the same time. So I'm thinking that is the best context in which to test it.
I recommend the following strategy.
1) In the descriptors coordinator zome, create a descriptors_api.rs file that defines an externally visible load_core_schema_api function:
#[hdk_extern]
pub fn load_core_schema_api(_: ()) -> ExternResult<Holon> {
    Ok(load_core_schema().0)
}
Notice this function just delegates the call to the load_core_schema() function in loader.rs. We can progressively enrich the implementation of that function to create more and more of the schema until it is fully loading the schema.
2) We just need a thin test driver. Within crates/coordinator/descriptors/tests, create a schema_load_test.rs file to serve as the test driver. Basically it just needs to do a zome call via the conductor on the load_core_schema_api function and verify it returns a Holon result and not an error.
It is within the load_core_schema function that the Context and CommitManager need to be created.
As part of the Staging process, the CommitManager:
index
and the staged_holons