Closed: vegardvaage closed this issue 7 years ago.
Interestingly, the AnalyzeClientMemory flow works when running individual plugins, for instance pslist.
Hm, so what's happening here is that GRR is "inactive" for too long and gets killed by the watchdog before it can finish the memory collection. Acquiring memory can take a long time (how much memory does that machine have?), but the time spent there should not count as being unresponsive.
It could be that the Rekall plugin is not reporting progress properly, but it could also be an actual issue where Rekall gets stuck somewhere.
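To illustrate the mechanism (a rough sketch with made-up names, not GRR's or Rekall's actual API): a long-running acquisition has to heartbeat from inside its read loop, otherwise the watchdog sees an idle client and kills it.

import time

# Rough sketch of the watchdog/heartbeat interplay. All names here are
# made up for illustration; GRR's real nanny and Rekall's progress hooks
# differ in detail.

class Watchdog(object):
    """Flags the worker as dead if no heartbeat arrives within the timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_heartbeat = time.time()
        self.killed = False

    def heartbeat(self):
        self.last_heartbeat = time.time()

    def check(self):
        if time.time() - self.last_heartbeat > self.timeout:
            self.killed = True  # The real nanny would terminate the process.


def acquire_memory(address_ranges, watchdog):
    """Simulated acquisition loop that heartbeats once per chunk read."""
    for _start, _length in address_ranges:
        time.sleep(0.1)       # Stand-in for a slow physical memory read.
        watchdog.heartbeat()  # Skipping this call is the failure mode here.
        watchdog.check()
        if watchdog.killed:
            raise RuntimeError("Client killed during transaction")


acquire_memory([(0x1000, 4096), (0x2000, 4096)], Watchdog(timeout_seconds=60))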
Can you maybe run Rekall directly on that machine? All we do is run the aff4acquire plugin, so you could just test that manually.
I installed a separate Rekall instance (in a virtualenv) and ran aff4acquire from there, and there is some weirdness:
[1] Live(/proc/kcore) 08:37:11>
[1] Live(/proc/kcore) 08:37:40> plugins.aff4acquire(destination="/tmp/test/foo.aff4")
Will use compression: https://github.com/google/snappy
Will load physical address space from live plugin.
Imaging Physical Memory:: Merging Address Ranges 0x1000 \
Reading 8169MiB / 8319MiB 9.79719664611e-05 MiB/s
It slows down a lot at the end of the dump (I have not seen it finish yet). This is an Amazon AWS EC2 m4.large machine; could this be AWS related?
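(For reference, at the rate shown above the remaining 8319 - 8169 = 150 MiB would take about 150 / 9.8e-05 ≈ 1.5 million seconds, roughly 17 days, so in practice it never completes.)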
Hm, this seems like a Rekall issue and could well be related to running on AWS. @scudette might know more about running Rekall in this environment?
There are known issues with memory acquisition on AWS. See this thread for a thorough analysis:
https://lists.sans.org/mailman/private/dfir/2016-August/037716.html
@scudette, thanks. This may be a better fit for a Rekall issue, but I see that LiME had similar problems and apparently found a way to fix them; see https://github.com/504ensicsLabs/LiME/issues/16 . I managed to get a successful memory capture using the most recent LiME kernel module. Maybe there's a way to adapt or learn from the LiME fix?
This is a pure Rekall issue, so I'll close this on the GRR side. @vegardvaage, if this is still not working for you, please file an issue with Rekall instead, thanks!
Hi! I've probably missed something important, but I'm trying to figure out why I can't make memory capture work. I've set up a brand new Ubuntu 16.04 install of GRR using the Ubuntu install script, and deployed a client to the same machine. Other flows work well, but whenever I try capturing memory I get the following error in the web console:
Traceback (most recent call last):
  File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flow_runner.py", line 613, in RunStateMethod
    responses=responses)
  File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flow.py", line 353, in Decorated
    res = f(*args[:f.func_code.co_argcount])
  File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flows/general/memory.py", line 93, in CheckAnalyzeClientMemory
    raise flow.FlowError("Unable to image memory: %s." % responses.status)
FlowError: Unable to image memory: message GrrStatus {
  child_session_id : SessionID: aff4:/C.064b13b39819533e/flows/F:E720B765/F:7CABCA88
  cpu_time_used : message CpuSeconds {
    system_cpu_time : 0.0599999986589
    user_cpu_time : 1.16999995708
  }
  error_message : u'Traceback (most recent call last):
    File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flow_runner.py", line 613, in RunStateMethod
      responses=responses)
    File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flow.py", line 353, in Decorated
      res = f(*args[:f.func_code.co_argcount])
    File "/usr/share/grr-server/local/lib/python2.7/site-packages/grr/lib/flows/general/memory.py", line 270, in End
      raise flow.FlowError("Error running plugins: %s" % all_errors)
    FlowError: Error running plugins: Client killed during transaction'
  network_bytes_sent : 522
  status : GENERIC_ERROR
}.
In the client log I see the following:
When running grrd manually I've sometimes seen actual memory capture progress logged as well, but the process is always killed by the nanny thread.
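(I assume the nanny timeout is configurable on the client side, something like Nanny.unresponsive_kill_period in the client config, but I haven't found the exact option name documented, so I may be wrong about that.)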
My understanding is that memory drivers are transferred to the client automatically; is that where I'm going wrong?
My thanks for any help!