ufrisk / MemProcFS

MemProcFS
GNU Affero General Public License v3.0
3.16k stars 383 forks source link

VMMDLL_MemReadScatter returns incorrect result (0) #30

Closed bruhov closed 4 years ago

bruhov commented 4 years ago

I may be wrong, but I will try to explain.

Step 1: Call VMMDLL_MemReadScatter with two MEMsVirt items. Each MEMsVirt has cbMax = 0x8.

Step 2: It sets cbMax to 0x1000 wheen reads physical memory: https://github.com/ufrisk/MemProcFS/blob/master/vmm/vmm.c#L1569

pIoPA->cbMax = 0x1000;

Step 3: After reading it sets cb to 0x1000 https://github.com/ufrisk/MemProcFS/blob/master/vmm/vmm.c#L1579

((PMEM_IO_SCATTER_HEADER)ppMEMsPhys[iPA]->pvReserved1)->cb = ppMEMsPhys[iPA]->cb;

Step 4: It check equality cb and cbMax to increase counter, but cb > cbMax https://github.com/ufrisk/MemProcFS/blob/master/vmm/vmmdll.c#L695

for(i = 0, cMEMs = 0; i < cpMEMs; i++) {
    if(ppMEMs[i]->cb == ppMEMs[i]->cbMax) {
        cMEMs++;
    }
}
return cMEMs;

Step 5: VMMDLL_MemReadScatter returns 0 insted of 2. pb of MEMsVirt filled with 0x1000 readed bytes instead of 0x8 (maybe it is ok).

ufrisk commented 4 years ago

The pIoPA->cbMax = 0x1000; is quite worriesome; this may lead to the pb buffer being overwritten past bounds. I have to look into this more and fix it somehow.

Generally I'd recommend using the VMMDLL_MemReadScatter with 0x1000-byte sized 0x1000-byte aligned scatter mem items only; it's the most well tested (or even the only properly tested internal paths).

Reads are latency bound anyway so unless doing 10+ reads or so in parallel having 0x1000 byte reads instead of an 8-byte read should not affect performance in any noticeable way.

I'll put this up as something I need to fix; meanwhile please use 0x1000 byte sized and aligned scatter read items.

ufrisk commented 4 years ago

I have changed how the ReadScatter works (re-designed the struct) so this should not be happening anymore.

Also, if reading on a page boundary when reading from an FPGA device it should now fail since it's not supported by PCI Express.

In general it still applies that the best way of going with things is to read page-aligned page-sized chunks of memory. There is very little performance improvement of reading smaller memory chunks.