microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License
17.54k stars 823 forks source link

Use pin_user_pages_fast instead of get_user_pages_fast in create_existing_sysmem(drivers/hv/dxgkrnl/dxgvmbus.c) #11095

Open langyuxf opened 10 months ago

langyuxf commented 10 months ago

Windows Version

10.0.22631.3085

WSL Version

2.0.9.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.133.1-1

Distro Version

Ubuntu 22.04

Other Software

No response

Repro Steps

1, use mmap to allocate a PRIVATE + ANONYMOUS CPU memory 2, use dxgkio_create_allocation and the CPU memory to create a GPU allocation 3, fork a child process, then write to this memory from CPU side in parent process

Expected Behavior

The backing store physical pages of CPU VA in parent process should be still mapped in GPU page table.

Actual Behavior

Due to fork and write from parent process, copy-on-write is triggered, new physical pages are allocated for parent process which are not mapped in GPU page table.

Diagnostic Logs

No response

ghost commented 10 months ago

Word from the source is that this is unsupported by design. (Forking a process using a vGPU that is)

langyuxf commented 10 months ago

Can you elaborate on this? What is unsupported?

langyuxf commented 10 months ago

Hi @pmartincic ,

I'm not forking a process(child process) using a vGPU, I mean the parent process should still be able to use the vGPU. But now the parent process can't use vGPU when using get_userpages*.

Regards, Lang

ghost commented 10 months ago

Ahhh, I'm sorry I misread what you wrote. That's my fault. I'll start the conversation internally again.

langyuxf commented 10 months ago

Please see https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#page-maybe-dma-pinned-the-whole-point-of-pinning https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#another-way-of-thinking-about-foll-get-foll-pin-and-foll-longterm https://github.com/torvalds/linux/blob/master/include/linux/mm.h#L1967

langyuxf commented 10 months ago

Hi @pmartincic,

Any updates?

Thanks, Lang

ghost commented 9 months ago

Do you have a sample/repro/minimal app/code sample you can post? I don't work with/on these features normally, is a little out of my wheelhouse and I want to perform a sanity check before I pass it along.

(Edit: they agree the parent process should still work)

langyuxf commented 9 months ago

It's hard to put all sample codes here to reproduce the issue.

What do you want to check?

langyuxf commented 9 months ago

Hi @pmartincic,

Have you guys seen these docs?

https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#page-maybe-dma-pinned-the-whole-point-of-pinning https://github.com/torvalds/linux/blob/master/Documentation/core-api/pin_user_pages.rst#another-way-of-thinking-about-foll-get-foll-pin-and-foll-longterm https://github.com/torvalds/linux/blob/master/include/linux/mm.h#L1967

Especially, the following

/*
 * This should most likely only be called during fork() to see whether we
 * should break the cow immediately for an anon page on the src mm.
 *
 * The caller has to hold the PT lock and the vma->vm_mm->->write_protect_seq.
 */

static inline bool folio_needs_cow_for_dma(struct vm_area_struct *vma,
                      struct folio *folio)
{
    VM_BUG_ON(!(raw_read_seqcount(&vma->vm_mm->write_protect_seq) & 1));

    if (!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags))
        return false;

    return folio_maybe_dma_pinned(folio);
}

Regards, Lang

ghost commented 9 months ago

Trust but verify.

I don't work on this area of the product. My plate is pretty full. I don't have time to write a repro. Somebody who does know this area is willing to look at it. I have a limited amount of political capital. This person helping owes me nothing and it's not their responsibility to do what I ask. I will pass this bug along to them if you give a repro that I can sanity check.

langyuxf commented 9 months ago

I'm working on some closed source work on AMD hardware and It's hard for me to expose the codes.

The bug is actually clear.

ghost commented 9 months ago

You're missing the point. I'm probably not going to dig into this. The person I'm talking to who has the capacity to, would like a sample repro.

For their sake please make one.

langyuxf commented 9 months ago

Thanks,I got you point.

But it's inconvenient for me to expose a sample to reproduce which includes some closed source parts.

Can I talk with that person?If they have some questions.

langyuxf commented 9 months ago

I create a repro demo and hide hardware driver specific details. You guys must use AMD gfx9 GPU and above.

https://github.com/langyuxf/dxgkrnl_linux.git

Thanks Lang

langyuxf commented 9 months ago

Hi @pmartincic @iourit,

We observed similar issues under memory pressure. Pages acquired via get_user_pages_fast() are reclaimed by linux memory management system while GPU page tables are not updated accordingly. Then the GPU still operates on the original pages.

Have you guys reproduced the issue? Thanks.

Regards, Lang

benhillis commented 7 months ago

This should be fixed with https://github.com/microsoft/WSL/releases/tag/2.2.3.