This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.
This is an ancient bug that was actually attempted to be fixed once (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix get_user_pages() race for write access") but that was then undone due to problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now fix it by checking the pte_dirty() bit properly (and do it better). The s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement software dirty bits") which made it into v3.9. Earlier kernels will have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes, we already did a COW" rather than play racy games with FOLL_WRITE that is very fundamental, and then use the pte dirty flag to validate that the FOLL_COW flag is still valid.
Reported-and-tested-by: Phil "not Paul" Oester kernel@linuxace.com Acked-by: Hugh Dickins hughd@google.com Reviewed-by: Michal Hocko mhocko@suse.com Cc: Andy Lutomirski luto@kernel.org Cc: Kees Cook keescook@chromium.org Cc: Oleg Nesterov oleg@redhat.com Cc: Willy Tarreau w@1wt.eu Cc: Nick Piggin npiggin@gmail.com Cc: Greg Thelen gthelen@google.com Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org [wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask; s/faultin_page/__get_user_page] Signed-off-by: Willy Tarreau w@1wt.eu