VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

Point release 4.3 changes an important API interface #226

Closed mrichard91 closed 7 months ago

mrichard91 commented 1 year ago

The 4.3 release changes the behavior of yara.Match in a way that is not backward compatible with any previous version back to 3.4. The functionality this enables is great, but it will break existing code that uses the match strings metadata and worked up until 4.3. This functionality could have been exposed as an additional field on yara.Match, or through a new interface such as yara.ExtendedMatch, instead of breaking backward compatibility.

This does not seem to be mentioned in the release notes. I suspect anyone using the .strings of yara.Match will discover the change only when their code breaks.

Would it be possible in 4.3.1 to restore the previous behavior and instead create a new or backward compatible implementation?

plusvic commented 1 year ago

In retrospect, I agree that we could have done this in a backward-compatible way. However, I think that reverting the change in a 4.3.1 release would make things even messier: people who have already adapted their code to this change would experience another break in a future release.

I put a brief warning in the release notes, and will make changes to the documentation to cover this change more explicitly.

SonOfLilit commented 1 year ago

Given that most serious users stayed with 4.2.3 because of the build failures, and not everyone has good tests or pays attention to release notes of minor releases, I think it's not too late to act according to @mrichard91's suggestion and reduce the amount of breakage in the world :-)

seanthegeek commented 1 year ago

I agree with @mrichard91 and @SonOfLilit. This is a huge change that needs to be reverted. Previously, each entry in the list of strings was a single match, and the list was sorted by offset. In 4.3.0, the strings are grouped by identifier, so the developer needs to iterate through each identifier to reproduce the old sorting. For now, I've pinned yara-python to 4.2.3 in my project, and I'm working on logic that detects whether a string is not a tuple, recreates a tuple, adds it to my own strings list, and sorts the list by offset.
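
For readers hitting the same break, here is a minimal sketch of the kind of shim described above. It is not the code used in any project in this thread. It assumes the pre-4.3 entries are `(offset, identifier, data)` tuples and that the 4.3 entries expose `identifier` and `instances`, with each instance carrying `offset` and `matched_data`. The helper name `legacy_strings` is hypothetical, and the `SimpleNamespace` objects are stand-ins so the example runs without yara installed.

```python
from types import SimpleNamespace


def legacy_strings(strings):
    """Flatten match.strings into (offset, identifier, data) tuples sorted by offset.

    Hypothetical helper: accepts either the pre-4.3 list of tuples or the
    4.3 list of per-identifier objects (assumed attribute names below).
    """
    flat = []
    for item in strings:
        if isinstance(item, tuple):
            # Pre-4.3 shape: already (offset, identifier, data).
            flat.append(item)
        else:
            # 4.3 shape: one object per identifier, holding its instances.
            for inst in item.instances:
                flat.append((inst.offset, item.identifier, inst.matched_data))
    # Restore the old ordering by offset.
    return sorted(flat, key=lambda entry: entry[0])


# Stand-ins for the 4.3 objects (not real yara types).
new_style = [
    SimpleNamespace(identifier="$a", instances=[
        SimpleNamespace(offset=40, matched_data=b"foo"),
        SimpleNamespace(offset=5, matched_data=b"foo"),
    ]),
    SimpleNamespace(identifier="$b", instances=[
        SimpleNamespace(offset=12, matched_data=b"bar"),
    ]),
]
old_style = [(5, "$a", b"foo"), (12, "$b", b"bar"), (40, "$a", b"foo")]

flattened_new = legacy_strings(new_style)
flattened_old = legacy_strings(old_style)
```

In real code the argument would be `match.strings` for each match returned by `rules.match(...)`. Checking the entry type rather than the installed version keeps the shim independent of how the package reports its version.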

seanthegeek commented 1 year ago

Here is the approach I took to support both 4.2.3 and 4.3.0 in my application.

plusvic commented 7 months ago

Now that even more time has passed, a revert no longer makes sense. Sorry for the API break.