Open theduke opened 4 years ago
@theduke , the File
struct should be able support read-write lock concurrently, could you share some example code?
Ah, I realized that I can open multiple file handles and then use them concurrently.
Since zbox does support internal concurrency/locking, it would be quite nice to hoist all of this into the Repo
struct and have all methods on Repo
take &self
rather than &mut self
.
In my opinion, &mut self
has semantic meaning, that is, if you're going to change it you need to borrow it as mutable. Of course we can make it all &self
but by doing that we lost the borrow semantics. For concurrent, you can always wrap a repo in Arc
or other synchronous types, and that might suit your need.
In my opinion, &mut self has semantic meaning, that is, if you're going to change it you need to borrow it as mutable.
mut
should not be interpreted as "mutable", but as "exclusive".
It seems to fail to support working on multiple files even without threads.
I get zbox::Error::InTrans
if I try to work on multiple files simultaneously.
zbox::File
behaves as if it wants exclusive access to Repo.
It seems to fail to support working on multiple files even without threads.
I get
zbox::Error::InTrans
if I try to work on multiple files simultaneously.
zbox::File
behaves as if it wants exclusive access to Repo.
Can you share your code about this issue? One File
cannot be mutated simultaneously, but multiple files should able to do that.
Can you share your code about this issue? One File cannot be mutated simultaneously, but multiple files should able to do that.
fn main() -> Result<(), Box<dyn std::error::Error>> {
env_logger::init();
zbox::init_env();
use std::io::Write;
let mut r = zbox::RepoOpener::new().create(true).force(true).open("file:///tmp/limit/q", "qwe123")?;
eprintln!("Opening file1");
let mut f1 = r.create_file("/test1")?;
eprintln!("Opening file2");
let mut f2 = r.create_file("/test2")?;
eprintln!("Writing file1");
f1.write_all(b"12345")?;
eprintln!("Writing file2");
f2.write_all(b"qwerty")?;
eprintln!("Committing file1");
f1.finish()?;
eprintln!("Committing file2");
f2.finish()?;
Ok(())
}
Running `target/debug/zboxtest`
Opening file1
Opening file2
Writing file1
Writing file2
Error: Custom { kind: Other, error: "Already in transaction" }
Thanks @vi for your code. The reason for this because content manager needs to be exclusively accessed. In writing file1, its transaction will exclusively include the content manager, thus when writing file2 the transaction cannot get access to it any more.
One File cannot be mutated simultaneously, but multiple files should able to do that. content manager needs to be exclusively accessed
Is it a bug or by design?
One File cannot be mutated simultaneously, but multiple files should able to do that. content manager needs to be exclusively accessed
Is it a bug or by design?
Well, it is by design currently but I think a tx should not hold a shared object too long until finish
is called. Some optimization could be done to deal with this situation.
Can two zbox::File
s be sent to two threads that will write and finish multiple transactions on their files independently, without coordination?
If no then it should be documented.
Ok, after digging through this issue deeper I found it is actually not the exclusive access to content manager caused this issue, sorry about my misleading explanation before.
The real reason is that each thread can only run one transaction at a time. In the code below,
let mut r = zbox::RepoOpener::new().create(true).force(true).open("file:///tmp/limit/q", "qwe123")?;
let mut f1 = r.create_file("/test1")?;
let mut f2 = r.create_file("/test2")?;
f1.write_all(b"12345")?; // this will create one ongoing tx
f2.write_all(b"qwerty")?; // this will create another ongoing tx
f1.finish()?;
f2.finish()?;
There are 2 files are using multi-part writing, 2 write_all()
created 2 parallel txs in one thread, thus it reports an error. There are 2 solutions:
finish()
sequentially f1.write_all(b"12345")?;
f1.finish()?;
f2.write_all(b"qwerty")?;
f2.finish()?;
let child1 = std::thread::spawn(move || {
f1.write_all(b"12345").unwrap();
f1.finish().unwrap();
});
let child2 = std::thread::spawn(move || {
f2.write_all(b"qwerty").unwrap();
f2.finish().unwrap();
});
child1.join().unwrap();
child2.join().unwrap();
Is ZboxFS supposed to be used in async/Tokio context? (where thread may be unpredictable).
Does it rely on thread-local storage?
Can transactions be untied from threads? There can be explicit object zbox::Transaction
that can open files, which would be bound to that transaction instead of to thread-default transaction.
Or is there some definite way to check if it is OK to write a file? (when running from thread pool where tasks can migrate between threads on its own)
Is only way to reliably use ZboxFS without fully controlling threading in application is to consult a map from std::thread::ThreadId
before every write?
Is ZboxFS supposed to be used in async/Tokio context? (where thread may be unpredictable).
It hasn't used async yet because that is a big change and may change the internal structure as well.
Does it rely on thread-local storage?
Yes, it stores tx id in thread local storage.
Can transactions be untied from threads? There can be explicit object
zbox::Transaction
that can open files, which would be bound to that transaction instead of to thread-default transaction.Or is there some definite way to check if it is OK to write a file? (when running from thread pool where tasks can migrate between threads on its own) Is only way to reliably use ZboxFS without fully controlling threading in application is to consult a map from
std::thread::ThreadId
before every write?
Currently, threading is tightly coupled with tx control internally, so it is hard to expose it as API and get it right. Also, this will add complexity for developing using ZboxFS.
As for the thread pool use case, I think app has to stick each file to different thread at this moment. If app cannot control that, it might be a limitation of ZboxFS.
It seems like zbox does not currently allow for any multi-threaded concurrency at all.
Sadly this limits usage to very simple use cases.
Are there plans for introducing concurrency?
A "simple" first step could be to allow concurrent reads, with a global write lock. Although I'd really love to have concurrent reads and writes as well.