Closed jvoisin closed 2 years ago
CCing @hdm in case you have any concerns with these changes when used in your projects.
@jvoisin - Thanks for the contribution. In general I'm on board with this. I'm going to let a few others weigh in this to ensure that I'm not overlooking something. Massive PRs like this are the reason why we don't allow new fingerprints that don't provide examples to test against. It makes me feel so much better when every fingerprint is tested against changes.
Also, as a note. There are changes to the newline at the end of these files. This may be due to how you have line ending set (windows vs unix) and/or you may need to run ruby bin/recog_cleanup
.
I am onboard with this and am really glad the test coverage is great =D
I believe this will cause many regressions for anything using recog-java (https://github.com/rapid7/recog-java), since it uses java.util.regex.Pattern
, and Java's Pattern
requires anchoring...
$ cat Test.java
import java.util.regex.Pattern;
public class Test {
public static void main(String[] args) {
System.out.println(Pattern.matches("(iSeries)", "IBM i5/OS iSeries (OS/400)"));
System.out.println(Pattern.matches(".*(iSeries).*", "IBM i5/OS iSeries (OS/400)"));
}
}
$ java -cp . Test
false
true
Then it has been broken for a long time, because a lot of patterns don't have anchoring :/
@tsellers-r7 can we run tests for recog-java on these updated FPs? I'm slammed at the moment, can get around to it in a bit if you can't.
It's a pain to test recog-java against branches/PRs, but the tests are only run against fingerprints with examples anyway. So as-is the tests "pass" against this change, but adding an example to one of the modified fingerprints does result in a failure.
If this is a desirable change for recog then there needs to be coordination with recog-java and nexpose to handle the change by adding anchors at runtime or something. So I would hold off on merging this PR until that is sorted out.
I would guess that the existing patterns without anchors are not used by recog-java/nexpose today, or they don't work as expected when used. :disappointed:
Thanks for the feedback folks, this is what I was concerned about. There is nothing driving this PR other than just cleaning up the regex a bit. If it's going to break recog-java
and/or product then we can cancel it or put it on hold.
I would guess that the existing patterns without anchors are not used by recog-java/nexpose today, or they don't work as expected when used.
This part concerns me. Is there is anything that can be done to improved the testing situation with recog-java
?
Is there is anything that can be done to improved the testing situation with recog-java?
One thing I've wanted to do is automate recog-java's integration tests against every release of recog, but I don't know how that could be easily triggered or how a failure would notify out to maintainers here. It would probably be easy to have this repo checkout recog-java to run tests. At that point it might as well run on PRs as well. That feels like crossing a weird boundary, but I haven't thought of anything better yet. We might want some meta-recog project to test the entire ecosystem or something. :snail:
That doesn't help for the lack of examples for many fingerprints, though, which is how the tests are currently driven.
How do other consumers/implementations of recog handle it?
Ruby is covered by the main project. For Go, we have a similar implementation of recog_verify that helps sanity check our implementation vs the main project. We also do some sanity checks on the Rumble side with the Recog databases at build time. Recog-GO doesn't do CI yet for this, but that should be easy enough.
We should start requiring <example>
tags on new PRs for validation of fingerprints. In terms of testing, my knee-jerk thought is to just use submodules to perform testing of the other language support when this project is built or new submissions are made. However, not sure how "fun" that will be -- getting the CI platform configured to run ruby/go/java etc.
@gschneider-r7 @dabdine-r7 One simple (though I don't know how precise) technique to deal with this would be using find()
vs matches()
in the pattern API for recog-java. Though again we need to rely on coverage to test if this has unintended consequences.
@dabdine - We don't have a technical requirement for examples
for new fingerprints but I won't knowingly land a new fingerprint w/o an example.
This has been merged into github.com/RumbleDiscovery/recog (main).
Status update: We're currently looking at implementing tests for the Ruby, Java, and Go libs. Once we've sorted that out we can revisit this PR.
CC @mkienow-r7
@jvoisin With the recently added GitHub Actions you shouldn't see any future PR review delays due to cross language concerns. We appreciate the contribution and it is unfortunate this PR was left open for so long. The merge conflicts are a bit messy since subsequent PRs removed unnecessary capture groups (e.g. #351). I think we have two options to simplify moving forward with these changes:
Please let me know how you would like to proceed.
One option may be to PR the Rumble branch, since we merged this PR a while back and managed the conflicts since. The upside is it is a clean merge and maintains the Git history. The wrinkle is that is includes our other work too: https://github.com/rapid7/recog/compare/master...RumbleDiscovery:main
Feel free to go with 1.
:)
@hdm I appreciate the suggestion. I'm going to recreate these changes on our latest, but I'm interested in taking a closer look at merging your updates. That will be next!
@jvoisin Thank you for the confirmation! :)