ufrisk / pcileech

Direct Memory Access (DMA) Attack Software
GNU Affero General Public License v3.0

[Question] Dll library cpu usage #53

Closed false closed 6 years ago

false commented 6 years ago

Hello,

I am seeing high CPU usage (about 80% of CPU time according to profiling) in the Mem:Read function. I call it very often because I need to refresh my data almost constantly. The reads are usually quite small: 4 bytes, sometimes 12, rarely more. Is there anything I could do to reduce this CPU usage? I have tried to unset the nocache flag as often as I could, but some of my data needs to be refreshed very often. How exactly does the nocache flag work, by the way? Does cached data sometimes get updated? Does it detect changes, or something else?

Do you think I can do something about it? Is there an optimization I could apply, or somewhere in the sources I should look for my use case?

hrt commented 6 years ago

Are you sure it is the mem read function? I haven't used the DLL myself, but a loop without any kind of sleep will use a large percentage of CPU regardless of what you do.

false commented 6 years ago

Well, yes, my CPU usage is concentrated in these read calls. I managed to optimize them a bit by reading big chunks at a time and then accessing the elements directly in local memory, but it's still not perfect. I think I can do something with the cache options though. I am currently testing whether I can get good results out of the option "PCILEECH_VMM_CONFIG_READCACHE_TICKS".
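For anyone reading along, here is a rough sketch of the "read one big chunk, then pick the elements out locally" idea. It is not taken from PCILeech: `read_remote`, `fake_target`, and the `Snapshot` type are hypothetical stand-ins so the example is self-contained, and in real use `read_remote` would wrap whatever read call you actually use (for example PCILeech_VmmReadEx).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the target's memory so the sketch can run without a device. */
static uint8_t fake_target[256];
static unsigned read_calls; /* counts simulated round trips to the device */

/* Hypothetical reader: copies cb bytes starting at addr into dst.
 * In real use this would wrap the actual read call. Returns 1 on success. */
static int read_remote(uint64_t addr, void *dst, size_t cb)
{
    if (addr > sizeof fake_target || cb > sizeof fake_target - addr)
        return 0;
    memcpy(dst, fake_target + addr, cb);
    read_calls++;
    return 1;
}

/* Local copy of one contiguous region of target memory. */
typedef struct {
    uint64_t base;     /* target address of data[0] */
    size_t   size;     /* number of valid bytes in data */
    uint8_t  data[64];
} Snapshot;

/* Refresh the whole region with a single read. */
static int snapshot_refresh(Snapshot *s, uint64_t base, size_t size)
{
    assert(size <= sizeof s->data);
    s->base = base;
    s->size = size;
    return read_remote(base, s->data, size);
}

/* Extract a 32-bit field from the local copy; no device access happens here. */
static uint32_t snapshot_u32(const Snapshot *s, uint64_t addr)
{
    uint32_t v;
    assert(addr >= s->base && addr - s->base + sizeof v <= s->size);
    memcpy(&v, s->data + (addr - s->base), sizeof v);
    return v;
}
```

The trade-off is that every field in the snapshot is only as fresh as the last `snapshot_refresh`, so this fits values that live close together and can be refreshed at the same rate.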

hrt commented 6 years ago

I'm pretty sure there's a good reason for the delays (hoping someone will tell me why), but when I changed DELAY_READ to 0 in the config in (for me) devicefpga.c, my code ran about 3 times faster. I've also tried caching virtual-to-physical mappings myself (I'm not using the DLL), but surprisingly it was more of a performance hit than a boost. I might retry the caching though.

hrt commented 6 years ago

Update: caching efficiently cut my read time by a further factor of at least 6. I'm doing quite a lot of reads in 20 ms using PCIeScreamer.

But there are fewer error checks (the cache refreshes the mapping for a virtual address every 2000 times it is accessed), so there is room for error.

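hrt commented 6 years ago

// Translate a virtual address via a direct-mapped cache of earlier page-table walks.
// Note: addresses that collide on the same slot are not told apart.
QWORD VirtualToPhysical(_Inout_ PPCILEECH_CONTEXT ctx, QWORD virtualAddress)
{
  QWORD pageBase, pageSize;
  DWORD offset = virtualAddress % CACHE_SIZE;  // cache slot index
  // Walk the page tables when the slot is empty, stale, or the last walk failed.
  if (!cache[offset].physicalAddress || cache[offset].hits > RETRY_COUNT || !cache[offset].result)
  {
    cache[offset].result = Util_PageTable_Virtual2Physical(ctx, ctx->cfg->qwCR3, virtualAddress, &(cache[offset].physicalAddress), &pageBase, &pageSize);
    cache[offset].physicalAddress *= cache[offset].result;  // zero the address if the walk failed
    cache[offset].hits = 0;  // restart the count so the next refresh is RETRY_COUNT hits away
  }
  cache[offset].hits += 1;
  return cache[offset].physicalAddress;
}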

I'm not sure how this all comes into play with the DLL because I haven't touched or looked at it.
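One thing to watch with a cache indexed only by `virtualAddress % CACHE_SIZE`: two different addresses can land in the same slot and get each other's result. Below is a small sketch of a variant that caches per page and stores the virtual page number as a tag. It is only an illustration under stated assumptions: 4 KiB pages, a made-up slot count, and `walk_page_table` as a hypothetical stub standing in for the real page-table walk.

```c
#include <stdint.h>

#define CACHE_SLOTS   1024u   /* assumed slot count */
#define REFRESH_AFTER 2000u   /* re-walk after this many hits */
#define PAGE_SHIFT    12      /* assumes 4 KiB pages only */
#define PAGE_MASK     0xFFFull

typedef struct {
    uint64_t vpn;    /* virtual page number, stored as the tag */
    uint64_t ppage;  /* physical base address of that page */
    uint32_t hits;   /* hits since the last page-table walk */
    int      valid;  /* nonzero if the last walk succeeded */
} TlbEntry;

static TlbEntry tlb[CACHE_SLOTS];
static unsigned walk_count;   /* how many (expensive) walks were needed */

/* Hypothetical page-table walk. This stub maps every page to a fixed
 * offset so the sketch runs without hardware. Returns 1 on success. */
static int walk_page_table(uint64_t va, uint64_t *pa_page)
{
    walk_count++;
    *pa_page = ((va >> PAGE_SHIFT) + 0x100) << PAGE_SHIFT;
    return 1;
}

/* Translate va, walking the page tables only on a miss, a tag mismatch,
 * or after REFRESH_AFTER hits. Returns 0 if the walk fails. */
static uint64_t translate(uint64_t va)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    TlbEntry *e = &tlb[vpn % CACHE_SLOTS];

    if (!e->valid || e->vpn != vpn || e->hits >= REFRESH_AFTER) {
        e->valid = walk_page_table(va, &e->ppage);
        e->vpn = vpn;
        e->hits = 0;
        if (!e->valid)
            return 0;
    }
    e->hits++;
    return e->ppage | (va & PAGE_MASK);
}
```

Caching per page rather than per address also means every address inside an already-translated page is a hit. Large pages would need the page size stored per entry, which this sketch leaves out.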

ufrisk commented 6 years ago

If you read as frequently as possible the CPU usage is going to be high. One thread will be doing PCILeech work all the time, and this is not a problem if your computer has more than one CPU core.

If you are reading a very small value as frequently as possible, I would recommend using the VMM_FLAG_NOCACHE flag. The last thing you want is to have your value cached for a couple of seconds before it's refreshed.

PCILeech_VmmReadEx(dwPID, vaAddressToRead, pb, cb, &cbRead, VMM_FLAG_NOCACHE);
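To illustrate why a cached value can be stale and what a no-cache flag buys you, here is a generic time-based cache sketch. It is not PCILeech's implementation and makes no claim about how its cache behaves internally; `CachedValue`, `cached_read`, and `source_value` are all hypothetical names, and the caller passes the current tick so the behavior is easy to follow.

```c
#include <stdint.h>

/* One cached 32-bit value with the tick at which it was fetched. */
typedef struct {
    uint32_t value;
    uint64_t fetched_at;
    int      valid;
} CachedValue;

/* Stands in for the value living in target memory. */
static uint32_t source_value;

/* Return the value, re-fetching it only if the cache is empty, older than
 * max_age ticks, or the caller asks to bypass the cache (no_cache != 0).
 * A change in source_value is NOT detected; it is only seen on a re-fetch. */
static uint32_t cached_read(CachedValue *c, uint64_t now,
                            uint64_t max_age, int no_cache)
{
    if (no_cache || !c->valid || now - c->fetched_at >= max_age) {
        c->value = source_value;   /* the expensive read would happen here */
        c->fetched_at = now;
        c->valid = 1;
    }
    return c->value;
}
```

With a scheme like this, a value that changes between refreshes is simply returned stale until it ages out, which is why bypassing the cache makes sense for small values that must always be current.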

Also, there is a small delay of approximately 500 µs when reading from the FPGA, which is waited out in a loop (this consumes a lot of CPU). The reason is that there is no Sleep function in Windows that will sleep the thread for such a short amount of time.
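As a rough illustration of what such a wait loop looks like and why it keeps a core busy, here is a minimal portable sketch. It is not the PCILeech code: `now_us` and `spin_wait_us` are hypothetical helpers, and the C11 `timespec_get` wall clock is used only to keep the example self-contained (on Windows a high-resolution counter would be the more usual choice).

```c
#include <stdint.h>
#include <time.h>

/* Current wall-clock time in microseconds (C11 timespec_get). */
static uint64_t now_us(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Busy-wait for at least delay_us microseconds.
 * The thread never yields, so one core stays fully loaded for the whole
 * delay. Returns the number of loop iterations spent polling the clock. */
static uint64_t spin_wait_us(uint64_t delay_us)
{
    uint64_t start = now_us();
    uint64_t spins = 0;

    while (now_us() - start < delay_us)
        spins++;
    return spins;
}
```

Calling `spin_wait_us(500)` before every read means the CPU cost scales with the number of reads rather than with the amount of data, which matches the high usage seen with many tiny reads.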

If you wish to tune that wait period you can do so by setting the values as per below

PCILeech_DeviceConfigSet(PCILEECH_DEVICE_OPT_FPGA_DELAY_READ, <yourvalue>);
PCILeech_DeviceConfigSet(PCILEECH_DEVICE_OPT_FPGA_DELAY_WRITE, <yourvalue>);

Please note that setting very low values may result in instability and errors, but the defaults are set very conservatively. If you are reading very small amounts of memory it may even be possible to set the delay to zero, but you have to try it out on your system.

false commented 6 years ago

Yes, PCILeech_DeviceConfigSet(PCILEECH_DEVICE_OPT_FPGA_DELAY_READ, 1); fixed my problem, to be honest. Or at least I can't notice any performance problem anymore.

On a related subject, I have noticed that pcileech uses a lot of memory. Is that because you cache everything? I have not had the time to check the sources yet. I have about 150 MB attributed to pcileech.dll in my application's memory usage. Is that normal? Is there any way to reduce it, given that I don't use the pcileech cache at all?

ufrisk commented 6 years ago

If you initialize the VMM the cache will always be created. This takes up quite some memory. Even if you do not use the read cache you'll use the page table cache and process cache.

But is this super important? 150MB may be big but it's not huge; and computers usually have gigabytes of memory these days ...

false commented 6 years ago

No, it is indeed not an issue at all. I just wanted to make sure it was normal behavior and not a leak or anything. Thanks.