wordfence / wordfence-cli

Wordfence malware and vulnerability scanner command line utility.
https://www.wordfence.com/products/wordfence-cli/
GNU General Public License v3.0
100 stars 20 forks source link

Automated remediation #110

Closed akenion closed 8 months ago

akenion commented 10 months ago

Implement automated remediation of infected files

akenion commented 8 months ago

@davidnuzik reported the following error when attempting to remediate a relative path:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
  File "wordfence/cli/cli.py", line 173, in main
  File "wordfence/cli/cli.py", line 43, in process_exception
  File "wordfence/cli/cli.py", line 171, in main
  File "wordfence/cli/cli.py", line 164, in invoke
  File "wordfence/cli/remediate/remediate.py", line 25, in invoke
  File "wordfence/cli/remediate/remediate.py", line 16, in process_path
  File "wordfence/wordpress/remediator.py", line 82, in remediate
  File "wordfence/wordpress/identifier.py", line 167, in identify
  File "wordfence/wordpress/identifier.py", line 171, in identify
  File "pathlib.py", line 884, in relative_to
ValueError: 'demohackedsite_remediate/kali/Desktop/demosite/wp-content/themes/twentytwentytwo/functions.php' does not start with '/home/david/malware-sets-for-testing-cli/demohackedsite_remediate/kali/Desktop/demosite/wp-content/themes/twentytwentytwo'
davidnuzik commented 8 months ago

Reopening after testing remediation with rc2 and rc3 and our discussion:

  1. wordfence remediate (no path or args passed) behaves differently compared to vuln-scan and malware-scan those two subcommands mention errors when a path is not passed. Remediate does not and mentions "0 file(s) were successfully remediated". We should probably tweak this to behave like vuln-scan and malware-scan by providing a single line "Error" (in quotes here because it's just standard output not necessarily a "log" or log level output) message perhaps something like "Error: At least one path must be specified".
  2. I am able to cause infinite loops again ("infinite" until errors out). If I have a Wordpress directory that has a symlink pointing to itself (meaning this dir or .) or parent above (..), etc the remediate function keeps looping through repeating remediation. Please let me know if you'd like an example or anything.

Thanks!

ewodrich commented 8 months ago

While attempting remediation on an entire wordpress file system about half of the files were remediated. It appears to be a bug when specifying entire directories, per chat with Alex.

Copy of files output to error log attached. errors_rmd2024-01-10-15-24-39.txt

davidnuzik commented 8 months ago

Regarding some changes made that may impact malware-scan performance, I posted internally on our company slack my findings, but all in all the differences were minimal and while the change may use slightly more memory (perhaps about 2-4% more based on my tests) it seems to actually result in slightly faster total runtimes for scans (at least after the first malware scan, which is normal).

Compared to my tests against RC3, RC5 is approximately 1-3% faster as far as total runtime with malware-scans at the expense of about the same percentage memory (more utilization).

My testing was not perfect and was synthetic and not completely controlled but I did encounter very close consistency from runs (after 1st run / caching). I only tested on amd64 arch, with an nvme ssd, in a Linux environment, and there are many more gotchas but I think the important thing is my runs were quite consistent from one to another (i.e. running a test with rc3 with same command more than a couple times). Hope this all makes sense.

If time permits I would like to test with a much larger sample on bare metal which I can do myself perhaps Monday evening. A much larger sample may have different results as I saw memory utilization increased a bit more than I'd expect from my large sample to one about 4x in size (on disk and file count) -- thus it's possible a very very large sample (a million files or more) might use more memory than expected with the change -- probably not that much more but it's somewhat possible it could be a lot.

EDIT: over the weekend scanned nearly 1 million files, ~120GB on disk and saw no significant change here with memory utilization or runtime, perhaps as much as 150-200MB more memory was used for this run, but based on prior tests runs, and how rc3 behaved as well (control) this doesn't seem bad at all. All in all, not concerned over resource utilization and runtime with the change(s) in RC5.

akenion commented 8 months ago

I noticed formatting issues with the help output introduced by the extended description the remediate subcommand, so I went ahead and corrected those in the PR above.

davidnuzik commented 8 months ago

v3.0.1rc5

Reference: https://github.com/wordfence/wordfence-cli/issues/110#issuecomment-1887534764

I can confirm (tested on Linux) that all expected files get remediated now (at least with a basic wp install / core and default included plugins and themes). A few inconsequential files, do not get remediated because they are NOT in the themes repo. I would generally expect this behavior and regardless the report will mention any files not remediated -- typically they would have a status of "unrecognized" or some possibly "unidentified". These are also not php,js or other files that are more prone to have malware.

davidnuzik commented 8 months ago

v3.0.1rc6 1/15/24

SUMMARY: QA validation PASSED. I can confirm that the new remediate subcommand works as expected including any issues mentioned prior. I was successfully able to remediate just about any Wordpress core file, theme, plugin I could find to test -- remediate now works across the board.

VALIDATION STEPS

  1. wordfence remediate {path/file} - remediate a specific known file, such as index.php in the root of a WP install. This works without issue.
  2. Test again but with a relative path provided, no issue.
  3. Test again but point to a symlink that points to a known file, no issue.
  4. Test again but pass an entire directory, such as ./plugins/akismet/ -- all known files are remediated some expected files do not remediate like .htaccess -- this is not in the Wordpress SVN repo and so it is expected to not be restored.
  5. Test again but pass an entire Wordpress install (NOT RECOMMENDED, only infected files should be remediated and after you make a backup). This works -- all known files are remediated -- I got the contents of the WP SVN repo (file list) and ensured that all that were NOT remediated were NOT in this list. That is to say I confirmed that all expected/known files actually did get remediated. A few files such as inconsequential dotfiles in themes did NOT get remediated. This is known/expected currently and these files are at little risk of malware -- and output still indicates what does / does not get remediated. (This including some testing of files that I created and I tried remediating them - as expected the CLI mentions the file is not known and it is NOT touched.
  6. Test again but test all output options (including all --output-path and related --output* options) -- no issues. The status column is accurate.
  7. Various additional testing including but not limited to: ensure CLI help content is good and makes sense, ensure works in various supported envrionements both amd64 and arm64, test with various install methods including the new RPM (tracked separately in another issue)., and more.

NOTES:

akenion commented 8 months ago

Per discussion with @barmat the above PR adds a --output-unremediated / -u option that causes the output to only include paths that were not remediated. This can be used, for example, to delete files that cannot be restored.

davidnuzik commented 8 months ago

v3.0.1rc7 1/16/24

I can confirm the new --output-unremediated / -u option works as expected. Human readable and --output-path (file output) both behave as expected only including any files that are not remediated.

I also executed some smoke tests related to my prior comment to ensure remediate still behaves as expected and found no issues.

ewodrich commented 8 months ago

Reopening for --output-unremediated counts are only showing the number of unsuccessful files are being counted rather than the complete remediated and unremediated counts. ex. wordfence remediate -u akismet results in 0 of 1 file(s) were successfully remediated, 1 file(s) could not be remediated

akenion commented 8 months ago

The remediate command needs an --allow-io-errors flag that will behave in the same manner as vuln-scan and count-sites.

akenion commented 8 months ago

Thinking through this further, there doesn't appear to be a use case for having the flag for the remediate command. We just need it to always behave the way vuln-scan and count-sites do with --allow-io-errors enabled.