FreeBSDDesktop / DEPRECATED-freebsd-base-graphics

Fork of FreeBSD's base repository to work on graphics-stack-related projects
Other
49 stars 13 forks source link

[patch] eduke32 freezes machine #115

Closed yanko-yankulov closed 7 years ago

yanko-yankulov commented 7 years ago

Hi,

I was playing around with different GL applications and came across eduke32. Running it on current drm-next leads to the system becoming unresponsive after a few seconds to a few minutes of play. The syslog was filling up with failed I915_GEM_CREATE with EAGAIN(-35) errors.

I was able to track it down to a problem in linux_idr (pretty sure about that) and possibly vnode leak (not sure at all).

So about the linux_idr, patch 1

From 9a0ad66e842176eec5df2c971414771255763eb9 Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 6 Feb 2017 22:36:38 +0200
Subject: [PATCH 1/3] zero cached li's first element

We are using the first element to hold the next preallocated idr_layer. However on return from idr_get() all ->ary elements must be zeroed, as indicated by the ->bitmap. So zero it, else we are holding reference to memory that we don't own.
---
 sys/compat/linuxkpi/common/src/linux_idr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sys/compat/linuxkpi/common/src/linux_idr.c b/sys/compat/linuxkpi/common/src/linux_idr.c
index 99e20593374..032fa30e87a 100644
--- a/sys/compat/linuxkpi/common/src/linux_idr.c
+++ b/sys/compat/linuxkpi/common/src/linux_idr.c
@@ -331,6 +331,7 @@ idr_get(struct idr *idp)
                if ((il = DPCPU_GET(idr_preload_head)) != NULL) {
                        DPCPU_SET(idr_preload_head, il->ary[0]);
                        DPCPU_SET(idr_preload_cnt, DPCPU_GET(idr_preload_cnt) - 1);
+                       il->ary[0] = NULL;
                }
        }
 done:
-- 
2.11.0

This is all dead code at the moment, as the next patch will show. So this is why this has gone unnoticed.

Patch 2:

From f0d542d5207866185b539993d2b682ece63fcd82 Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 6 Feb 2017 22:38:21 +0200
Subject: [PATCH 2/3] linux_idr: actually use preloaded ils

If idr_get is called in a idr_preload/idr_preload_end block, we are executing in a critical section. So in_intertupt() will always return true and we will not use the preallocated idr_layer structs. When we are in critical section we can safely access the percpu variables and use the preallocated structs. So do that and fall back to  calling lkpi_malloc only if we have nothing in the cache
---
 sys/compat/linuxkpi/common/src/linux_idr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sys/compat/linuxkpi/common/src/linux_idr.c b/sys/compat/linuxkpi/common/src/linux_idr.c
index 032fa30e87a..a1217b67552 100644
--- a/sys/compat/linuxkpi/common/src/linux_idr.c
+++ b/sys/compat/linuxkpi/common/src/linux_idr.c
@@ -325,16 +325,16 @@ idr_get(struct idr *idp)
                MPASS(ffsl(il->bitmap) != 0);
                return (il);
        }
-       if ((il = lkpi_malloc(sizeof(*il), M_IDR, M_ZERO | M_NOWAIT)) != NULL)
-               goto done;
-       if (!in_interrupt()) {
+       if ( curthread->td_critnest ) {
                if ((il = DPCPU_GET(idr_preload_head)) != NULL) {
                        DPCPU_SET(idr_preload_head, il->ary[0]);
                        DPCPU_SET(idr_preload_cnt, DPCPU_GET(idr_preload_cnt) - 1);
                        il->ary[0] = NULL;
                }
        }
-done:
+       if( !il )
+               il = lkpi_malloc(sizeof(*il), M_IDR, M_ZERO | M_NOWAIT);
+
        if (il != NULL)
                bitmap_fill(&il->bitmap, IDR_SIZE);
        return (il);
-- 
2.11.0

With those two patches I am able to play with no immediate adversary effects. They also touch/depend only on the idr code so I am pretty confident about them. This leaves the question opened as to why was lkpi_malloc continously failing. I have no clue on that.

Third proposed patch to be taken with many grains of salt :). Without the above patches the system was running out of vnodes very fast. The few crashdumps I was able to collect all had the eduke32 process waiting in getnewvnode & co called from shmem_file_setup. With the patches applied I am still seeing constantly increasing vfs.numvnodes albeit at much slower rate. This got me searching to where actually the vnodes get freed, and was unable to locate such a place. The fput function that will destroy linux_file struct just frees it as it has no actual file attached to it. Thus the third proposed patch:

From e55a81ca990897424ac26e715033b57a1348b445 Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 6 Feb 2017 22:41:48 +0200
Subject: [PATCH 3/3] free shmem_file_setup vnodes

Mark and drop the private vnode allocated in shmem_file_setup when destroying the linux_file struct
---
 sys/compat/linuxkpi/common/include/linux/file.h | 5 +++++
 sys/compat/linuxkpi/common/include/linux/fs.h   | 2 ++
 sys/compat/linuxkpi/common/src/linux_fs.c       | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/sys/compat/linuxkpi/common/include/linux/file.h b/sys/compat/linuxkpi/common/include/linux/file.h
index 559ac043768..edf08526c96 100644
--- a/sys/compat/linuxkpi/common/include/linux/file.h
+++ b/sys/compat/linuxkpi/common/include/linux/file.h
@@ -63,6 +63,11 @@ static inline void
 fput(struct linux_file *filp)
 {
        if (filp->_file == NULL) {
+               if( filp->drop_vnode ) {
+                       file_inode(filp)->i_mapping = NULL;
+                       vgone(filp->f_vnode);
+                       vrele(filp->f_vnode);
+               }
                kfree(filp);
                return;
        }
diff --git a/sys/compat/linuxkpi/common/include/linux/fs.h b/sys/compat/linuxkpi/common/include/linux/fs.h
index f2592e90b3f..b730585efa3 100644
--- a/sys/compat/linuxkpi/common/include/linux/fs.h
+++ b/sys/compat/linuxkpi/common/include/linux/fs.h
@@ -122,6 +122,8 @@ struct linux_file {
        struct filterops *f_kqfiltops;
        /* protects f_sigio.si_note and f_entry */
        spinlock_t      f_lock;
+
+       int drop_vnode;
 };
 #define f_inode                f_vnode
 #define        file            linux_file
diff --git a/sys/compat/linuxkpi/common/src/linux_fs.c b/sys/compat/linuxkpi/common/src/linux_fs.c
index 7cdb03622ac..af93d0fa81c 100644
--- a/sys/compat/linuxkpi/common/src/linux_fs.c
+++ b/sys/compat/linuxkpi/common/src/linux_fs.c
@@ -329,6 +329,8 @@ shmem_file_setup(char *name, int size, int flags)
                error = -ENOMEM;
                goto err_2;
        }
+
+       filp->drop_vnode = 1;

        return (filp);
 err_2:
-- 
2.11.0

It marks the linux_file struct as owning the vnode and frees it in the fput. With it the vnodes were under control even without the first two patches and I have not observed any regression so far. However I am totally out of my depths here - I know absolutely nothing about the FreeBSD vfs so don't believe me for this last patch. Also I am pretty curious as to what is actually happening with the vm_pager_allocate'd mapping associated with the file. I have not yet located where it would get freed, but there is quite a lot of code there. I also haven't noticed a substantial memory depletion, but maybe I haven't looked hard enough.

Another small point, that I have noticed but have not verified. On the failing path in shmem_file_setup we would call _vdrop to free the freshly allocated vnode. However the vnode has no mountpoint and _vdrop would actually try to use vp->v_mount and cause a NULL pointer derefernce. I may be wrong on this one too, as I haven't checked it specifically, but trying to use _vdrop in the if( filp->drop_vnode ) path of fput triggered it.

Looking forward to any comments and to a few hours of happiness replaying this GEM (pun intended) :)

yanko-yankulov commented 7 years ago

Oh and for anyone willing to try it out don't forget LD_PRELOAD=libGL.so for the moment.

yanko-yankulov commented 7 years ago

After sleeping it over I realised that I am breaking the idr with patch2. We really can't access the pcpu cache from an interupt. Will post new version later today.

yanko-yankulov commented 7 years ago

Updated patch 2

From cd0222895b32ea2d309c0211739054acc0cd727d Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Tue, 7 Feb 2017 12:31:20 +0200
Subject: [PATCH 2/3] linux_idr: actually use preloaded ils

If we are calling in a idr_preload/idr_preload_end block, we are executing in critical section. So in_intertupt will always return true and we will not use the preallocated idr_layer structs. Change the condition to explicitly check that we are not in interrupt, and that we are in critical section. If we are not in a critcial section we were not called from idr_preload/idr_preload_end block, so the preloaded elements aren't ours.
---
 sys/compat/linuxkpi/common/src/linux_idr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sys/compat/linuxkpi/common/src/linux_idr.c b/sys/compat/linuxkpi/common/src/linux_idr.c
index 032fa30e87a..9112447426d 100644
--- a/sys/compat/linuxkpi/common/src/linux_idr.c
+++ b/sys/compat/linuxkpi/common/src/linux_idr.c
@@ -327,7 +327,7 @@ idr_get(struct idr *idp)
        }
        if ((il = lkpi_malloc(sizeof(*il), M_IDR, M_ZERO | M_NOWAIT)) != NULL)
                goto done;
-       if (!in_interrupt()) {
+       if (!curthread->td_intr_nesting_level && curthread->td_critnest) {
                if ((il = DPCPU_GET(idr_preload_head)) != NULL) {
                        DPCPU_SET(idr_preload_head, il->ary[0]);
                        DPCPU_SET(idr_preload_cnt, DPCPU_GET(idr_preload_cnt) - 1);
-- 
2.11.0

I have also restored the original ordering ( the same as in Linux ). There is a comment in the Linux code explaining why this ordering is better.

hselasky commented 7 years ago

Hi,

Can you re-test with the latest drm-next branch? There has been significant changes in this area.

--HPS

yanko-yankulov commented 7 years ago

Except for the warning/error:

/sys/modules/linuxkpi/../../compat/linuxkpi/common/src/linux_fs.c:495:1: error: no previous prototype for function 'linux_file_free' [-Werror,-Wmissing-prototypes]
linux_file_free(struct linux_file *filp)

Everything seems perfect for the moment. Thanks.

I will close the ticket as it seems all issues mentioned are resolved. Thanks again.

hselasky commented 7 years ago

Thank you! I've just put a fix for the missing prototype in.

Thank you.

I was using WERROR="" when compiling :-)

--HPS

yanko-yankulov commented 7 years ago

Hi all, @hselasky ,

I am reopening this as unfortunately the vnodes don't actually get freed. This can easily be observed for example with ./gem_ctx_thrash test from intel-gpu-tools @markjdb has ported. Running the test causes the numvnodes to constantly increase rendering the machine unusable in the end.

I believe I have the reasons figured out and I am proposing a set of patches that seem to fix this. With the patches applied the ./gem_ctx_thrash passes. Looking forward to any comments or objections :)

1) It seems we really need vrele instead of vdrop, as the vdrop will not decrease the v_usecount, and leave the vnode unusable.

2) The modification in _vdrop to handle the no mountpoint case would actually forget the vnode. The mountpoint case will move it to the per mount freelist and then batch free it later. So as we don't have have a mountpoint I have put in code to add the vnode to freelist in similar maner.

With those two patches everything is OK. However I believe it would actually be better to vgone the vnodes in question before freeing them. We can't really reuse them as they have nothing associated with them, so no point in keeping them in the freelist, and waiting for a recycle to vgone them.

With this last change the new path in the _vdrop code would not get hit, but I am not quite sure whether we want to keep the changes or not.

So the proposed patches:

From 79c13df499bd95c0382aa936fce1d847961b7eb3 Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 13 Feb 2017 12:46:33 +0200
Subject: [PATCH 1/3] use vrele instead of vdrop

We need to decrement the usecount too.
---
 sys/compat/linuxkpi/common/src/linux_fs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sys/compat/linuxkpi/common/src/linux_fs.c b/sys/compat/linuxkpi/common/src/linux_fs.c
index c1909edcde2..d08378bbb1b 100644
--- a/sys/compat/linuxkpi/common/src/linux_fs.c
+++ b/sys/compat/linuxkpi/common/src/linux_fs.c
@@ -347,7 +347,7 @@ shmem_file_setup(char *name, int size, int flags)

        return (filp);
 err_2:
-       vdrop(vp);
+       vrele(vp);
 err_1:
        kfree(filp);
 err_0:
@@ -504,7 +504,7 @@ linux_file_free(struct linux_file *filp)
                        vm_object_deallocate(vp->i_mapping);
                        vp->i_mapping = NULL;
                }
-               vdrop(vp);
+               vrele(vp);
        } else {
                _fdrop(filp->_file, curthread);
        }
-- 
2.11.0
From 81571fce974831501a68c95f02d783dcb3cc368d Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 13 Feb 2017 10:20:39 +0200
Subject: [PATCH 2/3] push vnodes to freelist

---
 sys/kern/vfs_subr.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index b3df4af3658..dfa7f935919 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -2950,6 +2950,13 @@ _vdrop(struct vnode *vp, bool locked)
                                        vnlru_return_batch_locked(mp);
                                mtx_unlock(&mp->mnt_listmtx);
                        }
+                       else
+                       {
+                               mtx_lock(&vnode_free_list_mtx);
+                               TAILQ_INSERT_TAIL(&vnode_free_list, vp, v_actfreelist);
+                               freevnodes ++;
+                               mtx_unlock(&vnode_free_list_mtx);
+                       }
                } else {
                        VI_UNLOCK(vp);
                        counter_u64_add(free_owe_inact, 1);
-- 
2.11.0
From 0f5f6fca92ed677835f5650033c7ad4cfcd75437 Mon Sep 17 00:00:00 2001
From: Yanko Yankulov <yanko.yankulov@gmail.com>
Date: Mon, 13 Feb 2017 13:01:03 +0200
Subject: [PATCH 3/3] add vgone

No need to trash the vnode freelist
---
 sys/compat/linuxkpi/common/src/linux_fs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sys/compat/linuxkpi/common/src/linux_fs.c b/sys/compat/linuxkpi/common/src/linux_fs.c
index d08378bbb1b..df26485561d 100644
--- a/sys/compat/linuxkpi/common/src/linux_fs.c
+++ b/sys/compat/linuxkpi/common/src/linux_fs.c
@@ -347,6 +347,7 @@ shmem_file_setup(char *name, int size, int flags)

        return (filp);
 err_2:
+       vgone(vp);
        vrele(vp);
 err_1:
        kfree(filp);
@@ -504,6 +505,7 @@ linux_file_free(struct linux_file *filp)
                        vm_object_deallocate(vp->i_mapping);
                        vp->i_mapping = NULL;
                }
+               vgone(vp);
                vrele(vp);
        } else {
                _fdrop(filp->_file, curthread);
-- 
2.11.0

Best, Yanko

hselasky commented 7 years ago

Hi @yanko-yankulov ,

I pushed your patch with some minor modifications.

Can you "git pull --rebase" your drm-next and re-test?

Thank you!

--HPS

yanko-yankulov commented 7 years ago

Hi @hselasky,

Thanks.

Seems OK. The test passes and the vnodes count are back to the pre-test values.

Closing it again :).