Is your feature request related to a problem? Please describe.
Scanning has been too inaccurate for far too long. We have little control over it because we simply collect signatures from various sources and merge them into one big database. We do have a small system to remove duplicates and flag false positives, but it's manual and takes a while to set up.
Furthermore, it is very difficult to find false positives because our user base is too small and the database is too big.
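For illustration, the merge-and-deduplicate step described above could look roughly like the sketch below. This is not our actual tooling: the function name `merge_signatures`, the feed contents, and the idea of normalizing signatures to lowercase are all assumptions made for the example.

```rust
use std::collections::{BTreeSet, HashSet};

/// Merge several signature lists into one sorted, duplicate-free database,
/// skipping every entry that is listed as a known false positive.
/// (Hypothetical helper, assuming signatures are plain hex strings.)
fn merge_signatures(sources: &[Vec<&str>], false_positives: &HashSet<String>) -> BTreeSet<String> {
    let mut db = BTreeSet::new();
    for source in sources {
        for sig in source {
            // Normalize so "AAAA1111" and "aaaa1111 " count as the same signature.
            let normalized = sig.trim().to_ascii_lowercase();
            if normalized.is_empty() || false_positives.contains(&normalized) {
                continue;
            }
            db.insert(normalized);
        }
    }
    db
}

fn main() {
    // Made-up feeds standing in for the places we collect signatures from.
    let feed_a = vec!["AAAA1111", "bbbb2222"];
    let feed_b = vec!["aaaa1111 ", "cccc3333"];
    let false_positives = HashSet::from(["cccc3333".to_string()]);

    let db = merge_signatures(&[feed_a, feed_b], &false_positives);
    for sig in &db {
        println!("{}", sig);
    }
}
```

The sketch also shows the limitation: the false-positive list has to be maintained by hand, and a bare signature carries no hint about why it is in the database.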
Describe the solution you'd like
We tested YARA in the past, but it used far too many resources and was too slow for our project. With the new release of YaraX, this has changed drastically. It is also implemented in Rust, making it a very good fit for our project.
Advantages:
We can have a central repo with Yara rules everyone can help improve and contribute to
In the repo, we can automatically build a YaraX binary that is later used for the scans
Using a GitHub release, it is easier to check for database updates and the updates can be performed automatically
YaraX rules can be much more accurate than plain signatures and can cover a much wider range of malware
Yara rules can give us information on why a file was flagged as malware and therefore help users understand the threat much more easily
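As a rough sketch of the release-based update check mentioned in the list above: the client only needs to compare the tag of the installed rules with the tag of the latest release. The tag format `vMAJOR.MINOR.PATCH` and the function names `parse_tag` and `update_available` are assumptions for this example, and fetching the latest tag from GitHub is left out.

```rust
/// Parse a release tag such as "v1.4.0" into (major, minor, patch).
/// Returns None if the tag does not follow that (assumed) format.
fn parse_tag(tag: &str) -> Option<(u32, u32, u32)> {
    let mut parts = tag.trim().trim_start_matches('v').split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True if `latest` is a strictly newer release than `installed`.
/// Unparseable tags never trigger an update.
fn update_available(installed: &str, latest: &str) -> bool {
    match (parse_tag(installed), parse_tag(latest)) {
        // Tuples compare lexicographically: major first, then minor, then patch.
        (Some(current), Some(new)) => new > current,
        _ => false,
    }
}

fn main() {
    let installed = "v1.2.9"; // tag stored next to the local rules (made up)
    let latest = "v1.10.0"; // tag reported by the latest release (made up)
    if update_available(installed, latest) {
        println!("update available: {} -> {}", installed, latest);
    } else {
        println!("rules are up to date");
    }
}
```

Comparing the parsed numbers rather than the raw strings matters here, since a plain string comparison would rank "v1.2.9" above "v1.10.0".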
Disadvantages:
It is much slower than simply using signatures
It might use far more resources and slow down the device during scans
Describe alternatives you've considered
We also looked at fuzzy hashing and at the SIMBIoTA project and its approach based on TLSH from Trend Micro. The problem with those approaches is that we would need a collection of malware files to build the signatures from. Furthermore, we would again have false positives popping up with no idea why they occur or how to fix them. Working with signatures is nice because it is simple and fast, but once you need to understand where a signature comes from, it becomes much more difficult to manage.
Additional context