OpenXiangShan / difftest

Modern co-simulation framework for RISC-V CPUs
Mulan Permissive Software License, Version 2

Support for Non-Contiguous Memory Addresses in Image Files (--image=FILE) #501

Closed by reoLantern 22 hours ago

reoLantern commented 1 day ago

Related component: simulation framework

Hello,

I am using difftest and have some questions regarding the support for input image files provided via the --image=FILE option.

From my understanding of the source code, difftest currently loads the image file into RAM starting at address 0x80000000, assuming a contiguous block of memory. The image is expected to be in a binary (.bin) format, which corresponds to a flat binary file without any address metadata.

In scenarios where we have code and data located at non-contiguous addresses—for example, code starting at 0x80000000 and data starting at 0x90000000—the binary file would need to include zero-filled padding to cover the gap between these addresses. This results in a large file size due to the padding, which is not efficient.
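To make the cost of the padding concrete, here is a minimal sketch that compares the size of a flat image with the number of bytes that actually carry content. `Region`, `flat_image_size`, and `payload_size` are hypothetical names used only for illustration; they are not part of difftest.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one loadable region (not a difftest type).
struct Region {
  uint64_t addr;  // physical load address
  uint64_t size;  // number of initialized bytes
};

// Size of a flat .bin that starts at `base` and covers every region,
// including the zero padding between regions.
uint64_t flat_image_size(uint64_t base, const std::vector<Region>& regions) {
  uint64_t end = base;
  for (const Region& r : regions) {
    assert(r.addr >= base);  // a flat image cannot describe data below its base
    if (r.addr + r.size > end) end = r.addr + r.size;
  }
  return end - base;
}

// Number of bytes that actually carry content.
uint64_t payload_size(const std::vector<Region>& regions) {
  uint64_t total = 0;
  for (const Region& r : regions) total += r.size;
  return total;
}
```

With one 4 KiB region at 0x80000000 and another at 0x90000000, the flat image is 0x10001000 bytes (256 MiB plus 4 KiB) for only 8 KiB of payload.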

Could you please confirm if my understanding is correct? Does difftest currently only support this type of input format where the image must represent a contiguous block of memory starting at 0x80000000? If so, are there plans to support more flexible image formats, such as ELF files, that can handle non-contiguous memory regions without unnecessary padding?

Thank you for your time and assistance.

poemonsense commented 1 day ago

> In scenarios where we have code and data located at non-contiguous addresses, for example code starting at 0x80000000 and data starting at 0x90000000, the binary file would need to include zero-filled padding to cover the gap between these addresses. This results in a large file size due to the padding, which is not efficient. Could you please confirm if my understanding is correct?

Correct.

> Does difftest currently only support this type of input format where the image must represent a contiguous block of memory starting at 0x80000000?

No. We also support the ELF format. I believe it is not exactly what you want, but it is the only public standard we have for describing the data sections you want. Preparing ELFs manually is not easy, but in most scenarios users create their workloads with compilers, where ELFs naturally exist.
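For illustration, this is roughly how an ELF file expresses non-contiguous regions without padding: each loadable program header carries its own physical address, file offset, and size. The sketch below lists the `PT_LOAD` segments of an in-memory ELF64 image using the types from the Linux `<elf.h>` header. `LoadSegment` and `list_load_segments` are hypothetical names, this is not difftest's actual loader, and it assumes the file has the same byte order as the host.

```cpp
#include <elf.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical summary of one loadable segment (not a difftest type).
struct LoadSegment {
  uint64_t paddr;        // physical address the bytes should be placed at
  uint64_t file_offset;  // where the bytes live inside the ELF file
  uint64_t file_size;    // how many bytes are stored in the file
};

// Lists PT_LOAD segments of an ELF64 image held in memory.
// Assumes host byte order; returns an empty list for anything unexpected.
std::vector<LoadSegment> list_load_segments(const uint8_t* image, size_t len) {
  assert(image != nullptr);
  std::vector<LoadSegment> out;
  if (len < sizeof(Elf64_Ehdr)) return out;

  Elf64_Ehdr eh;
  std::memcpy(&eh, image, sizeof(eh));
  if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return out;
  if (eh.e_ident[EI_CLASS] != ELFCLASS64) return out;
  if (eh.e_phentsize != sizeof(Elf64_Phdr)) return out;

  for (size_t i = 0; i < eh.e_phnum; i++) {
    uint64_t off = eh.e_phoff + i * sizeof(Elf64_Phdr);
    if (off + sizeof(Elf64_Phdr) > len) break;  // truncated header table
    Elf64_Phdr ph;
    std::memcpy(&ph, image + off, sizeof(ph));
    if (ph.p_type == PT_LOAD && ph.p_filesz > 0) {
      out.push_back({ph.p_paddr, ph.p_offset, ph.p_filesz});
    }
  }
  return out;
}
```

A loader built on this idea would copy only `file_size` bytes per segment to `paddr`, so the gap between 0x80000000 and 0x90000000 never appears in the file.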

> If so, are there plans to support more flexible image formats, such as ELF files, that can handle non-contiguous memory regions without unnecessary padding?

I think the real problem is that ELF support is not documented anywhere, so users have no way of knowing that we support it. I'll update the README now.

reoLantern commented 19 hours ago

Thank you for the clarification on difftest's support for ELF files. After further reviewing the source code, I found that difftest indeed handles ELF files by loading each section into a RAM region created with mmap. Memory created by mmap has the advantage of being backed by physical memory only when it is accessed.
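The lazy-allocation behaviour described above can be sketched as follows. This is only an illustration of an anonymous mmap region on Linux, not the MmapMemory code itself; `reserve_ram`, `load_section`, and `release_ram` are hypothetical helpers, and the mmap flags are an assumption.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reserve `size` bytes of zero-initialized anonymous memory.
// The kernel hands out virtual address space only; a physical page is
// allocated when that page is first written.
uint8_t* reserve_ram(size_t size) {
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? nullptr : static_cast<uint8_t*>(p);
}

// Copy one initialized section to its offset from the RAM base.
// Only the pages covered by [offset, offset + len) get touched.
void load_section(uint8_t* ram, size_t offset, const void* data, size_t len) {
  assert(ram != nullptr);
  std::memcpy(ram + offset, data, len);
}

// Return the whole region to the OS.
void release_ram(uint8_t* ram, size_t size) {
  munmap(ram, size);
}
```

Reserving 0x10001000 bytes and loading a few bytes at offsets 0x0 and 0x10000000 writes to two pages only; the roughly 256 MiB in between stays untouched.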

On the other hand, the REF model (e.g., Spike) also defines a mem_t structure to manage memory in a similar way, creating new pages only when an address is accessed for the first time.
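As a rough illustration of that allocate-on-first-access idea, here is a minimal page map. It is not Spike's mem_t; `SparsePages`, `kPageSize`, and `allocated_pages` are hypothetical names, and only the method name `addr_to_mem` is borrowed from the function quoted in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sparse memory: a page exists only after its first access.
class SparsePages {
 public:
  static constexpr uint64_t kPageSize = 4096;

  // Returns a host pointer for a guest address, creating a zeroed page
  // the first time any address inside that page is requested.
  uint8_t* addr_to_mem(uint64_t addr) {
    uint64_t ppn = addr / kPageSize;
    auto it = pages_.find(ppn);
    if (it == pages_.end()) {
      it = pages_.emplace(ppn, std::vector<uint8_t>(kPageSize, 0)).first;
    }
    assert(it != pages_.end());
    return it->second.data() + addr % kPageSize;
  }

  // Number of pages that have been materialized so far.
  size_t allocated_pages() const { return pages_.size(); }

 private:
  std::map<uint64_t, std::vector<uint8_t>> pages_;
};
```

Note that in this sketch any access creates the page, reads included, which is why a copy loop that asks for every destination page will materialize all of them.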

However, when difftest first executes the REF, it uses proxy->mem_init(0x80000000, ram, img_size, DUT_TO_REF) to copy the entire contents of ram to the REF’s memory. This effectively calls the following function (defined in OpenXiangShan's riscv-isa-sim):

void DifftestRef::memcpy_from_dut(reg_t dest, void* src, size_t n) {
  while (n) {
    char *base = sim->addr_to_mem(dest);
    size_t n_bytes = (n > PGSIZE) ? PGSIZE : n;
    memcpy(base, src, n_bytes);
    dest += PGSIZE;
    src = (char *)src + PGSIZE;
    n -= n_bytes;
  }
}

Here, src is the RAM initialized by mmap in the MmapMemory constructor. During this copy process, uninitialized sections of the ELF that were previously sparse in RAM are now copied as zero-filled pages into the REF’s memory, effectively occupying physical memory in both ram and the REF model.

Is my understanding correct that this behavior results in zero-initialized pages being fully allocated in both ram and the REF, even if those pages weren’t originally part of the initialized sections in the ELF file?

Additionally, in the NEMU repository, it seems that difftest_memcpy has an option to use SparseRam, which appears intended to avoid copying uninitialized sections. However, I am unable to locate where the mmap-created pointer object is correctly converted to SparseRam (in sparse_mem_copy, this is attempted with auto s = (SparseRam*)src; s->copy(d);, but this casting appears problematic).

Could you clarify if SparseRam integration is fully supported in difftest or if further configuration is required to ensure that only initialized sections are copied and allocated in both ram and REF memory?

Thank you!

poemonsense commented 19 hours ago

> Here, src is the RAM initialized by mmap in the MmapMemory constructor. During this copy process, uninitialized sections of the ELF that were previously sparse in RAM are now copied as zero-filled pages into the REF’s memory, effectively occupying physical memory in both ram and the REF model.
>
> Is my understanding correct that this behavior results in zero-initialized pages being fully allocated in both ram and the REF, even if those pages weren’t originally part of the initialized sections in the ELF file?

The ram buffer in difftest does not get any additional pages allocated, because these zero pages are only read and never written. Reading an untouched zero page does not cause the OS to allocate a physical page for it.

Spike's memory does get allocated, because the copy does not skip the zero pages. This can be optimized by skipping them (see the nemu_large_memcpy function at https://github.com/OpenXiangShan/NEMU/blob/master/src/cpu/difftest/ref.c#L28).
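The zero-skipping idea can be sketched like this. It is not the actual nemu_large_memcpy implementation, only an illustration of the technique: `copy_skipping_zeros`, `is_all_zero`, and `kPage` are hypothetical names, and the callback stands in for whatever address translation the REF provides. It assumes the destination memory starts out zeroed, so skipping a zero page leaves the same contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

constexpr size_t kPage = 4096;  // assumed page size for this sketch

// True if all n bytes at p are zero.
static bool is_all_zero(const uint8_t* p, size_t n) {
  for (size_t i = 0; i < n; i++) {
    if (p[i] != 0) return false;
  }
  return true;
}

// Copies src[0..n) to guest address `dest` one page-sized chunk at a time,
// skipping chunks that contain only zeros so the destination never has to
// materialize them. Returns the number of chunks actually copied.
size_t copy_skipping_zeros(
    uint64_t dest, const void* src, size_t n,
    const std::function<uint8_t*(uint64_t)>& addr_to_mem) {
  assert(src != nullptr || n == 0);
  const uint8_t* s = static_cast<const uint8_t*>(src);
  size_t copied = 0;
  while (n) {
    size_t n_bytes = (n > kPage) ? kPage : n;
    if (!is_all_zero(s, n_bytes)) {
      // Translation is requested only for chunks that carry data.
      std::memcpy(addr_to_mem(dest), s, n_bytes);
      copied++;
    }
    dest += n_bytes;
    s += n_bytes;
    n -= n_bytes;
  }
  return copied;
}
```

Scanning the source only reads it, so on the difftest side the untouched mmap pages stay unallocated, while the REF side materializes only the pages that contain data.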

> Additionally, in the NEMU repository, it seems that difftest_memcpy has an option to use SparseRam, which appears intended to avoid copying uninitialized sections. However, I am unable to locate where the mmap-created pointer object is correctly converted to SparseRam (in sparse_mem_copy, this is attempted with auto s = (SparseRam*)src; s->copy(d);, but this casting appears problematic).
>
> Could you clarify if SparseRam integration is fully supported in difftest or if further configuration is required to ensure that only initialized sections are copied and allocated in both ram and REF memory?

We are using the nemu_large_memcpy function to avoid copying uninitialized sections.

The SparseMem implementation is messy and we do not plan to use it. It is legacy support for the ELF format and we don't see a need for it. Could you please explain more about the advantages of SparseMem over mmap memory?