Open wojciech-graj opened 1 week ago
I think that Send
can be implemented for the Scanner
type without issues, allowing it's use from async
functions. It's safe to pass the ownership of the scanner from one thread to another one. However, it's not safe to use the same scanner from different threads concurrently, therefore Sync
can not be implemented. It's not only a matter of changing Rc
to Arc
and solve the other issues, it's that the scanner has data structure that tracks the progress of the current scan, and therefore two scan operations can't run concurrently.
I was trying to replicate the issue, but this works for me:
use std::io;
#[tokio::main]
async fn main() -> io::Result<()> {
let handle = tokio::spawn(async {
let rules = yara_x::compile("rule test { condition: true }").unwrap();
let mut scanner = yara_x::Scanner::new(&rules);
let scan_results = scanner.scan(b"").unwrap();
println!("Matching {} rules", scan_results.matching_rules().len())
});
handle.await?;
println!("Finished");
Ok(())
}
Can you provide some example that fails to compile?
Answering myself... this reproduces the issue:
use std::io;
use tokio::task::yield_now;
#[tokio::main]
async fn main() -> io::Result<()> {
let handle = tokio::spawn(async {
let rules = yara_x::compile("rule test { condition: true }").unwrap();
let mut scanner = yara_x::Scanner::new(&rules);
yield_now().await;
let scan_results = scanner.scan(b"").unwrap();
yield_now().await;
println!("Matching {} rules", scan_results.matching_rules().len())
});
handle.await?;
println!("Finished");
Ok(())
}
@wojciech-graj can you try this branch https://github.com/VirusTotal/yara-x/tree/make-scanner-send?
I took a look at the branch, and I think the approach of manually implementing Send is probably fine iff at least points 1 and 2 are addressed. Rc
doesn't guarantee that the value within it is in the same state on different threads, so if the Scanner
is used on thread 1, then gets moved to thread 2, any internal state stored in an Rc
will be in an undefined state.
What do you mean with this?
Rc doesn't guarantee that the value within it is in the same state on different threads, so if the Scanner is used on thread 1, then gets moved to thread 2, any internal state stored in an Rc will be in an undefined state.
Can you elaborate more?
Rc
stands for "Reference Counted", so a value that lives somewhere, and will be freed once it ceases to be referenced. Internally an Rc
stores a counter, which is incremented whenever a new reference is created, and which is decremented whenever a reference is dropped. This counter is not atomic and is not guaranteed to be the same across all threads (there is no memory barrier), so you could have the following situation
Thread 1: create Rc with counter = 1
Move Scanner to thread 2
Thread 2: increment Rc's counter
Move Scanner to thread 1
Thread 1: decrement Rc's counter from 1 to 0 (Rc's counter is 1 because thread 2's counter hasn't been propagated from cache to main memory, or thread 1 didn't get a cache miss when fetching value of counter)
Thread 1: free value stored in Rc, despite there being an existing reference to it that was created by thread 2
Now I see what you mean, I didn't take into account cache issues. I think we must find some alternative, like creating some other type that is both Send
+ Sync
by protecting the Scanner
with a mutex or something like that.
I don't see the change from Rc
to Arc
as a good solution because we would be paying the performance penalty of using Arc
in a case where multiple threads can't be used anyways. In other words, we would be using thread-safe references in a context where the scanner can't be shared by multiple threads, that's an overkill.
Also, as far as I know the async code will fail to compile only if you have some await
after creating the Scanner
and before you use the scanning results. For instance, this won't compile:
tokio::spawn(async {
let rules = yara_x::compile("rule test { condition: true }").unwrap();
let mut scanner = yara_x::Scanner::new(&rules);
yield_now().await;
let scan_results = scanner.scan(b"").unwrap();
println!("Matching {} rules", scan_results.matching_rules().len())
});
But this other code will compile:
tokio::spawn(async {
let rules = yara_x::compile("rule test { condition: true }").unwrap();
{
let mut scanner = yara_x::Scanner::new(&rules);
let scan_results = scanner.scan(b"").unwrap();
println!("Matching {} rules", scan_results.matching_rules().len())
}
yield_now().await;
});
So, you can use the scanner inside an async block, as long as the async code doesn't yield from the moment you build the scanner until the moment you process the results. This is not ideal when you want to re-use the scanner for scanning multiple files, but it in certain cases it would work without needing Send
at all.
In resume, I think it's a good idea to provide some type that is both Sync
and Send
that can be easily used in async code, but the solution shouldn't be using Arc
everywhere.
It turns out that my idea of creating some other type that wraps an Scanner
inside a Mutex
won't work. By using a mutex you can convert Send + !Sync
into Send + Sync
. But in this case we want to turn !Send + !Sync
into Send + Sync
. That's not possible without explicitly implementing Send
and Sync
for the wrapper type.
I created a branch on which I have implemented changes 1 and 2, you can find it here: https://github.com/wojciech-graj/yara-x/tree/send-scanner
I scanned a few GB of files filled with random data, with the https://github.com/Yara-Rules/rules ruleset, and below are my results
Rc:
cargo r --release scan ../rules/**/*.yar ../sample -r 776.96s user 4.74s system 647% cpu 2:00.75 total
cargo r --release scan ../rules/**/*.yar ../sample -r 768.74s user 5.05s system 629% cpu 2:02.83 total
Arc:
cargo r --release scan ../rules/**/*.yar ../sample -r 763.22s user 4.74s system 642% cpu 1:59.55 total
cargo r --release scan ../rules/**/*.yar ../sample -r 772.34s user 5.15s system 633% cpu 2:02.82 total
There appears to be no performance difference when switching to Arc
from Rc
. Please feel free to run more benchmarks, but at least based on these numbers, I see no reason to not switch to Arc
.
I agree that the performance penalty will be probably negligible, but I still believe that changing every Rc
to Arc
inside a type that is intended to be used in single-thread environment is not a good design decision. In one hand I'm tempted to make this change because it facilitates the use of the yara-x
crate in async code, but in the other hand I think that the async requirements are forcing me into a change that doesn't feel natural at all, this is more like a workaround than a change that feels logical to me.
I don't have too much experience with async Rust, but it looks like there are solutions for using !Send
types in async code:
https://procmarco.com/blog/2021-05-04-a-story-about-async-rust-and-using-send-types/ https://ryhl.io/blog/actors-with-tokio/
I haven't delved into this, and I'm aware that it puts the burden on the users, but for the time being I think that's the right approach.
There's an additional reason for not allowing the scanner to be moved freely from one thread to another. Some yara-x
modules use thread local storage, for instance, the magic
module maintains a copy of its signature database per-thread, because libmagic
is not a thread-safe library. In an environment like tokio
, where the runtime can decide to move the scanner from one thread to another freely, this imply that you can have a lot of copies of this database even if you have a single scanner, just because you don't control how the scanner is assigned to a thread So, there are subtle implications that may surprise the users of the crate. This an additional reason for keeping the scanner as !Send
, it explicitly tells the crate users that moving the scanner from one thread to another is not a good idea, and that other solutions that involve assigning a scanner to a specific thread are preferable.
Your last paragraph is probably the best argument for keeping is single-threaded and !Send
. I still think that it would be a good idea to eventually make the scanner Send
to make it more usable in the async world, but I agree that for now it's probably best to keep it as-is.
Adding the
Send
trait to theScanner
was briefly discussed in #186, but I found the answer unconvincing. The explanation was that aScanner
cannot safely be used by many threads simultaneously, however that's what theSync
trait is responsible for, while theSend
trait simply means that theScanner
could be moved between threads.The main use-case for this is in asynchronous code. It is currently impossible to spawn an async task and create a scanner within it, even though the scanner will only be used by one thread at any given time, because the task can run on any thread. The following code will fail to compile.
Below are the roadblocks to implementing
Send
:Rc
is used instead of its thread-safe counterpartArc
. (easily changeable)ScanContext::console_log
needs to be changed fromOption<Box<dyn FnMut(String) + 'r>>
toOption<Box<dyn FnMut(String) + Send + 'r>>
(easily changeable)ScanContext::scanned_data
andScanContext::wasm_store
are pointers.scanned_data
appears to be a pointer as a workaround to the borrow-checker's complaints, and there doesn't appear to be a trivial way to change it to anOption<&[u8]>
.