renzhengeek / issues

commit history for ocfs2 between V2.6.16.60(SLE10SP4) and V2.6.32.12(SLE11SP1) #29

Closed renzhengeek closed 8 years ago

renzhengeek commented 8 years ago

git log --since="2008-4-30" --until="2010-4-26" --reverse fs/ocfs2/

renzhengeek commented 8 years ago

commit 24ef1815e5e13e50196eb1ab8ddc0d783443bdf8 Author: Joel Becker joel.becker@oracle.com Date: Tue Jan 29 17:37:32 2008 -0800

ocfs2: Separate out dlm lock functions.

This is the first in a series of patches to isolate ocfs2 from the
underlying cluster stack. Here we wrap the dlm locking functions with
ocfs2-specific calls. Because ocfs2 always uses the same dlm lock status
callbacks, we can eliminate the callbacks from the filesystem visible
functions.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit de551246e7bc5558371c3427889a8db1b8cc60f4 Author: Joel Becker joel.becker@oracle.com Date: Fri Feb 1 14:45:08 2008 -0800

ocfs2: Remove CANCELGRANT from the view of dlmglue.

o2dlm has the non-standard behavior of providing a cancel callback
(unlock_ast) even when the cancel has failed (the locking operation
succeeded without canceling).  This is called CANCELGRANT after the
status code sent to the callback.  fs/dlm does not provide this
callback, so dlmglue must be changed to live without it.
o2dlm_unlock_ast_wrapper() in stackglue now ignores CANCELGRANT calls.

Because dlmglue no longer sees CANCELGRANT, ocfs2_unlock_ast() no longer
needs to check for it.  ocfs2_locking_ast() must catch that a cancel was
tried and clear the cancel state.

Making these changes opens up a locking race.  dlmglue uses the
OCFS2_LOCK_BUSY flag to ensure only one thread is calling the dlm at any
one time.  But dlmglue must unlock the lockres before calling into the
dlm.  In the small window of time between unlocking the lockres and
calling the dlm, the downconvert thread can try to cancel the lock.  The
downconvert thread is checking the OCFS2_LOCK_BUSY flag - it doesn't
know that ocfs2_dlm_lock() has not yet been called.

Because ocfs2_dlm_lock() has not yet been called, the cancel operation
will just be a no-op.  There's nothing to cancel.  With CANCELGRANT,
dlmglue uses the CANCELGRANT callback to clear up the cancel state.
When it comes around again, it will retry the cancel.  Eventually, the
first thread will have called into ocfs2_dlm_lock(), and either the
lock or the cancel will succeed.  The downconvert thread can then do its
downconvert.

Without CANCELGRANT, there is nothing to clean up the cancellation
state.  The downconvert thread does not know to retry its operations.
More importantly, the original lock may be blocking on the other node
that is trying to cancel us.  With neither able to make progress, the
ast is never called and the cancellation state is never cleaned up that
way.  dlmglue is deadlocked.

The OCFS2_LOCK_PENDING flag is introduced to remedy this window.  It is
set at the same time OCFS2_LOCK_BUSY is.  Thus, the downconvert thread
can check whether the lock is cancelable.  If not, it just loops around
to try again.  Once ocfs2_dlm_lock() is called, the thread then clears
OCFS2_LOCK_PENDING and wakes the downconvert thread.  Now, if the
downconvert thread finds the lock BUSY, it can safely try to cancel it.
Whether the cancel works or not, the state will be properly set and the
lock processing can continue.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
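The interplay of the two flags described above can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: the flag values, `struct pend_lockres`, and the helper names are hypothetical, and real dlmglue does this under the lockres spinlock with wait queues rather than plain flag tests.

```c
#include <stdbool.h>

/* Hypothetical flag values; the real definitions live in the ocfs2 sources. */
#define PEND_LOCK_BUSY    0x1u  /* a dlm request is (about to be) in flight */
#define PEND_LOCK_PENDING 0x2u  /* BUSY is set, but the dlm has not been called yet */

/* Hypothetical, minimal stand-in for a lock resource. */
struct pend_lockres {
    unsigned int flags;
};

/* Locking thread: set both flags before dropping the lockres spinlock. */
static void pend_prepare_dlm_call(struct pend_lockres *res)
{
    res->flags |= PEND_LOCK_BUSY | PEND_LOCK_PENDING;
}

/* Locking thread: the dlm call has been issued, so a cancel is now meaningful. */
static void pend_dlm_call_issued(struct pend_lockres *res)
{
    res->flags &= ~PEND_LOCK_PENDING;
}

/* Downconvert thread: may it try to cancel the in-flight request? */
static bool pend_can_cancel(const struct pend_lockres *res)
{
    return (res->flags & PEND_LOCK_BUSY) && !(res->flags & PEND_LOCK_PENDING);
}
```

In this model the downconvert thread would loop while `pend_can_cancel()` is false and BUSY is set, which corresponds to the "loops around to try again" behavior in the commit message.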
renzhengeek commented 8 years ago

commit 1693a5c0117f8ccd010a666f97aaf0f14fb0a0e4 Author: David Teigland teigland@redhat.com Date: Wed Jan 30 16:52:53 2008 -0800

ocfs2: handle async EAGAIN from NOQUEUE request

When using fsdlm, -EAGAIN is returned in the async callback for NOQUEUE
requests. Fix up dlmglue to expect this.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 553aa7e408eac402c00b67ddfa7aec13fe1f3a33 Author: Joel Becker joel.becker@oracle.com Date: Fri Feb 1 14:51:03 2008 -0800

ocfs2: Split o2cb code from generic stack functions.

Split off the o2cb-specific functionality from the generic stack glue
calls.  This is a precursor to wrapping the o2cb functionality in an
operations vector.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit cf4d8d75d8aba537a19b313a9364fd08ddbd5622 Author: David Teigland teigland@redhat.com Date: Wed Feb 20 14:29:27 2008 -0800

ocfs2: add fsdlm to stackglue

Add code to use fs/dlm.

[ Modified to be part of the stack_user module -- Joel ]

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 8ddb7b004dfa1832a750e199df8bff4b75b73565 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue May 13 13:45:15 2008 -0700

[PATCH 2/2] ocfs2: Instrument fs cluster locks

This patch adds code to track the number of times the fs takes
various cluster locks as well as the times associated with it.
The information is made available to users via debugfs.

This patch was originally written by Jan Kara <jack@suse.cz>.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit dd25e55ea133b14678cfaa9e205b082b24b26dbc Author: Randy Dunlap randy.dunlap@oracle.com Date: Wed May 28 14:41:00 2008 -0700

ocfs2: fix printk format warnings with OCFS2_FS_STATS=n

Fix printk format warnings when OCFS2_FS_STATS=n:

linux-next-20080528/fs/ocfs2/dlmglue.c: In function 'ocfs2_dlm_seq_show':
linux-next-20080528/fs/ocfs2/dlmglue.c:2623: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'int'
linux-next-20080528/fs/ocfs2/dlmglue.c:2623: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'int'
linux-next-20080528/fs/ocfs2/dlmglue.c:2623: warning: format '%llu' expects type 'long long unsigned int', but argument 7 has type 'int'
linux-next-20080528/fs/ocfs2/dlmglue.c:2623: warning: format '%llu' expects type 'long long unsigned int', but argument 8 has type 'int'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 28b8ca0b7f70b1b048d03dc0b9d87f58619e9791 Author: Tao Ma tao.ma@oracle.com Date: Mon Sep 1 08:45:18 2008 +0800

ocfs2: bug-fix for journal extend in xattr.

In ocfs2_extend_trans(), when we can't extend the current
transaction, we commit it and start a new one. If the credits
we allocated earlier haven't been used yet (the block wasn't
dirtied before the extend), the new transaction will not have
enough credits for any later operation, and jbd will complain
and BUG out. So check for this and re-extend the transaction.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit da1e90985a0e767e44397c9db0937e236033fa58 Author: Joel Becker joel.becker@oracle.com Date: Thu Oct 9 17:20:29 2008 -0700

ocfs2: Separate out sync reads from ocfs2_read_blocks()

The ocfs2_read_blocks() function currently handles sync reads, cached
reads, and sometimes-cached reads.  We're going to add some
functionality to it, so first we should simplify it.  The uncached,
synchronous reads are much easier to handle as a separate function, so we
introduce ocfs2_read_blocks_sync().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 07f9eebcdfaeefc8f807fa1bcce1d7c3ae6661b1 Author: David Teigland teigland@redhat.com Date: Mon Nov 17 12:28:48 2008 -0600

ocfs2: fix wake_up in unlock_ast

In ocfs2_unlock_ast(), call wake_up() on lockres before releasing
the spin lock on it.  As soon as the spin lock is released, the
lockres can be freed.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 2b83256407687613e906bee93d98a25339128a4d Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Dec 16 15:49:19 2008 -0800

ocfs2/dlm: Fix a race between migrate request and exit domain

This patch addresses a race between a migrate request message and an exit domain message.
Instead of blocking exit domains for the duration of the migrate, we ignore
failure to deliver that message. This is because an exiting domain should
not have any active locks and thus has no role to play in the migration.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 57dff2676eb68d805883a2204faaa5339ac44e03 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Dec 16 15:49:20 2008 -0800

ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler()

This patch cleans up the errors printed in dlm_proxy_ast_handler(). The errors now
include the node number that sent the (b)ast. It also reduces the number of endian
swaps of the cookie.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit d4f7e650e55af6b235871126f747da88600e8040 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Dec 16 15:49:21 2008 -0800

ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating

During lockres purge, o2dlm sends a drop reference message to the lockres
master. This patch delays the message if the lockres is being migrated.

Fixes oss bugzilla#1012
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1012

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit b0d4f817ba5de8adb875ace594554a96d7737710 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Dec 16 15:49:22 2008 -0800

ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list

This patch adds a new lock, dlm->tracking_lock, to protect adding/removing
lockres' to/from the dlm->tracking_list. We were previously using dlm->spinlock
for the same, but that proved inadequate as we could be freeing a lockres from
a context that did not hold that lock. As the new lock only protects this list,
we can explicitly take it when removing the lockres from the tracking list.

This bug was exposed when testing multiple processes concurrently flock() the
same file.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit a641dc2a5a1445eb4cb491080dfc41c42a9eb37d Author: Mark Fasheh mfasheh@suse.com Date: Wed Dec 24 16:03:48 2008 -0800

ocfs2: remove unneeded lvb casts

dlmglue.c has lots of code which casts the return value of ocfs2_dlm_lvb().
This is pointless however, as ocfs2_dlm_lvb() returns void *.
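
As a small illustration of why such casts are redundant, here is a user-space sketch. `struct cast_lvb` and `cast_dlm_lvb()` are hypothetical stand-ins, not the real ocfs2 definitions; the only point is that C converts `void *` to any object pointer type implicitly.

```c
#include <stdint.h>

/* Hypothetical stand-in for a lock value block layout. */
struct cast_lvb {
    uint8_t  version;
    uint32_t clusters;
};

static struct cast_lvb cast_storage = { 5, 1024 };

/* Returns void *, like the accessor described in the commit message. */
static void *cast_dlm_lvb(void)
{
    return &cast_storage;
}

static uint32_t cast_clusters_with_cast(void)
{
    /* The explicit cast compiles, but adds nothing. */
    struct cast_lvb *lvb = (struct cast_lvb *)cast_dlm_lvb();
    return lvb->clusters;
}

static uint32_t cast_clusters_without_cast(void)
{
    /* void * converts implicitly to struct cast_lvb * in C. */
    struct cast_lvb *lvb = cast_dlm_lvb();
    return lvb->clusters;
}
```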
renzhengeek commented 8 years ago

commit a4b91965d39d5d53b470d6aa62cba155a6f3ffe1 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Jan 29 17:12:31 2009 -0800

ocfs2: Wakeup the downconvert thread after a successful cancel convert

When two nodes holding PR locks on a resource concurrently attempt to
upconvert the locks to EX, the master sends a BAST to one of the nodes. This
message tells that node to first cancel convert the upconvert request,
followed by downconvert to a NL. Only when this lock is downconverted to NL,
can the master upconvert the first node's lock to EX.

While the fs was doing the cancel convert, it was forgetting to wake up the
dc thread after a successful cancel, leading to a deadlock.

Reported-and-Tested-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 7dc102b737e9f49dac426161294cb2d326a97d8e Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Feb 3 12:37:13 2009 -0800

ocfs2/dlm: Retract fix for race between purge and migrate

Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay
the dlm_thread from sending the drop ref message if the lockres is being
migrated. The problem is that we make the dlm_thread wait for the migration
to complete. This causes a deadlock as dlm_thread also participates in the
lockres migration process.

A better fix for the original oss bugzilla#1012 is in testing.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit dabc47de7a23f57522dc762d9d2ad875700d3497 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Feb 3 12:37:15 2009 -0800

ocfs2/dlm: Use ast_lock to protect ast_list

The code was using dlm->spinlock instead of dlm->ast_lock to protect the
ast_list. This patch fixes the issue.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 53ecd25e148615e0ed2a72635cc76f4773f97f90 Author: Sunil Mushran sunil.mushran@oracle.com Date: Tue Feb 3 12:37:16 2009 -0800

ocfs2/dlm: Make dlm_assert_master_handler() kill itself instead of the asserter

In dlm_assert_master_handler(), if we get an incorrect assert master from a
node, we reply with EINVAL, asking the asserter to die. The problem is that an
assert is sent only after so many hoops that it is invariably the node that
thinks the asserter is wrong that is actually wrong. So instead of killing the
asserter, this patch kills the assertee.

This patch papers over a race that is still being addressed.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit c2ec175c39f62949438354f603f4aa170846aabb Author: Nick Piggin npiggin@suse.de Date: Tue Mar 31 15:23:21 2009 -0700

mm: page_mkwrite change prototype to match fault

Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg.  virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix.  And will also make it easier to
merge page_mkwrite() with fault() in future.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
renzhengeek commented 8 years ago

commit 1c0845773ad9f4875603b752235aea8aa04565f3 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:37 2009 -0800

ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list

This patch encapsulates adding and removing of the mle from the
dlm->master_list. This patch is part of the series of patches that
converts the mle list to a mle hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit f77a9a78c3a1d995b3bf948dbcad5c4a1b2302d5 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:38 2009 -0800

ocfs2/dlm: Clean up struct dlm_lock_name

For master mle, the name is stored in the attached lockres in struct qstr.
For block and migration mle, the name is stored inline in struct dlm_lock_name.
This patch attempts to make struct dlm_lock_name look like a struct qstr. While
we could use struct qstr, we don't because we want to avoid having to malloc
and free the lockname string as the mle's lifetime is fairly short.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit c2cd4a44333034203cb198915e2b75c3227d41bf Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:39 2009 -0800

ocfs2/dlm: Refactor dlm_clean_master_list()

This patch refactors dlm_clean_master_list() so as to make it
easier to convert the mle list to a hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit e2b66ddcce922529e058cf74d839c4c49c8379a1 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:40 2009 -0800

ocfs2/dlm: Create and destroy the dlm->master_hash

This patch adds code to create and destroy the dlm->master_hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 2ed6c750d645d09b5948e46fada3ca1fda3157b5 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:41 2009 -0800

ocfs2/dlm: Activate dlm->master_hash for master list entries

With this patch, the mles are stored in a hash and not a simple list.
This should improve the mle lookup time when the number of outstanding
masteries is large.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
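
To make the list-to-hash change concrete, here is a minimal user-space sketch of a name-keyed hash with chained buckets. Everything in it is an assumption for illustration: `struct mh_mle`, `struct mh_master_hash`, the bucket count, and the djb2 hash are hypothetical, and the kernel code uses its own hash function and list primitives.

```c
#include <stddef.h>
#include <string.h>

#define MH_BUCKETS 8  /* hypothetical size; must be a power of two */

/* Hypothetical, trimmed-down master list entry. */
struct mh_mle {
    const char *name;      /* lock name this entry tracks */
    struct mh_mle *next;   /* chain within one bucket */
};

struct mh_master_hash {
    struct mh_mle *buckets[MH_BUCKETS];
};

/* Simple string hash (djb2), reduced to a bucket index. */
static unsigned int mh_hash(const char *name)
{
    unsigned int h = 5381;

    while (*name)
        h = h * 33 + (unsigned char)*name++;
    return h & (MH_BUCKETS - 1);
}

/* Push the entry onto the front of its bucket's chain. */
static void mh_insert(struct mh_master_hash *mh, struct mh_mle *mle)
{
    unsigned int b = mh_hash(mle->name);

    mle->next = mh->buckets[b];
    mh->buckets[b] = mle;
}

/* Only one bucket is walked, not every outstanding entry. */
static struct mh_mle *mh_find(struct mh_master_hash *mh, const char *name)
{
    struct mh_mle *mle;

    for (mle = mh->buckets[mh_hash(name)]; mle; mle = mle->next)
        if (strcmp(mle->name, name) == 0)
            return mle;
    return NULL;
}
```

With a single list, a lookup cost grows with the total number of outstanding entries; with the hash, it grows only with the length of one bucket's chain.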

commit 67ae1f0604da3bcf3ed6dec59ac71d07e54a404c Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:42 2009 -0800

ocfs2/dlm: Indent dlm_cleanup_master_list()

The previous patch explicitly did not indent dlm_cleanup_master_list()
so as to make the patch readable. This patch properly indents the
function.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 2041d8fdcec7603401829f60810c1dbd5e96c043 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:43 2009 -0800

ocfs2/dlm: Track number of mles

The lifetime of a mle is limited to the duration of the lockres mastery
process. While typically this lifetime is fairly short, we have noticed
the number of mles explode under certain circumstances. This patch tracks
the number of mles of each type and should help us determine
how best to speed up the mastery process.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 6800791ab773453bdec337efb3f0cec6557f3bb3 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:44 2009 -0800

ocfs2/dlm: Improve lockres counts

This patch replaces the lockres counts that tracked the number of
locally and remotely mastered lockres' with a current and total count. The
total count is the number of lockres' that have been created since the dlm
domain was created.

The counts of locally and remotely mastered lockres' can be computed using
the locking_state output.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 7d62a978a8c85cd82301615840d744f0d83b87e7 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:45 2009 -0800

ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlined

This patch inlines dlm_set_lockres_owner() and dlm_change_lockres_owner().

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit e64ff14607ac90b2f3349550a41cc8dc0c0b1324 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:46 2009 -0800

ocfs2/dlm: Show the number of lockres/mles in dlm_state

This patch shows the number of lockres' and mles in the debugfs file, dlm_state.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 7141514b8307734c117f100c4a3637887c5def45 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:47 2009 -0800

ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry

This patch removes struct dlm_lock_name and adds the entries directly
to struct dlm_master_list_entry. Under the new scheme, both mles that
are backed by a lockres or not, will have the name populated in mle->mname.
This allows us to get rid of code that was figuring out the location of
the mle name.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 516b7e52abc7efd61c084b217c61985a403828ed Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:48 2009 -0800

ocfs2/dlm: Do not purge lockres that is being migrated dlm_purge_lockres()

This patch attempts to fix a fine race between purging and migration.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

commit 9405dccfd3201d2b76e120949bec81ba8cfbd2d0 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Feb 26 15:00:49 2009 -0800

ocfs2/dlm: Tweak mle_state output

The debugfs file, mle_state, now prints the largest number of mles
in one hash link.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
renzhengeek commented 8 years ago

commit 1fca3a05ef2823830925dfb66711d6d920265a8d Author: Hisashi Hifumi hifumi.hisashi@oss.ntt.co.jp Date: Thu Mar 5 17:22:21 2009 +0900

ocfs2: Pagecache usage optimization on ocfs2

A page can have multiple buffers, and even if a page is not uptodate, some
buffers can be uptodate when pagesize != blocksize.
This aop checks that all buffers which correspond to the part of a file
that we want to read are uptodate. If so, we do not have to issue actual
read IO to the HDD even if the page is not uptodate, because the portion we
want to read is uptodate.
The "block_is_partially_uptodate" function is already used by ext2/3/4.
With this patch, mixed random read/write workloads and workloads that do
random reads after random writes can be optimized and get a performance improvement.

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
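
The check described in this commit message can be sketched in user-space C as follows. This is not the kernel's `block_is_partially_uptodate()`; `struct pu_page` and `pu_range_uptodate()` are hypothetical names, and the per-block flag array is a simplified stand-in for buffer heads.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of a page split into fixed-size blocks, each with its own
 * uptodate flag. All names here are hypothetical.
 */
struct pu_page {
    size_t block_size;      /* bytes per block */
    size_t nr_blocks;       /* blocks per page */
    const bool *uptodate;   /* one flag per block */
};

/* True if every block overlapping [from, from + count) is uptodate. */
static bool pu_range_uptodate(const struct pu_page *pg,
                              size_t from, size_t count)
{
    size_t first, last, i;

    /* Reject empty ranges and ranges that run past the page. */
    if (count == 0 || from + count > pg->block_size * pg->nr_blocks)
        return false;

    first = from / pg->block_size;
    last = (from + count - 1) / pg->block_size;
    for (i = first; i <= last; i++)
        if (!pg->uptodate[i])
            return false;
    return true;
}
```

A read that touches only uptodate blocks can be served from the page cache in this model even though the page as a whole is not uptodate.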
renzhengeek commented 8 years ago

commit e04cc15f52eb072937cec2bbd8499f37afe5e1ef Author: Hisashi Hifumi hifumi.hisashi@oss.ntt.co.jp Date: Tue Jun 9 16:47:45 2009 +0900

ocfs2: fdatasync should skip unimportant metadata writeout

In ocfs2, fdatasync and fsync are identical.
I think fdatasync should skip committing the transaction when
inode->i_state has only I_DIRTY_SYNC set, which indicates
only atime and/or mtime updates.
The following patch improves fdatasync throughput.

#sysbench --num-threads=16 --max-requests=300000 --test=fileio
--file-block-size=4K --file-total-size=16G --file-test-mode=rndwr
--file-fsync-mode=fdatasync run

Results:
-2.6.30-rc8
Test execution summary:
    total time:                          107.1445s
    total number of events:              119559
    total time taken by event execution: 116.1050
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0010s
         max:                            0.1220s
         approx.  95 percentile:         0.0016s

Threads fairness:
    events (avg/stddev):           7472.4375/303.60
    execution time (avg/stddev):   7.2566/0.64

-2.6.30-rc8-patched
Test execution summary:
    total time:                          86.8529s
    total number of events:              300016
    total time taken by event execution: 24.3077
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0001s
         max:                            0.0336s
         approx.  95 percentile:         0.0001s

Threads fairness:
    events (avg/stddev):           18751.0000/718.75
    execution time (avg/stddev):   1.5192/0.05

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
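
The decision this patch describes can be sketched as a small predicate. The flag values and the name `fds_needs_commit()` are hypothetical, and the exact test used in the patch is an assumption; the commit message only says that the commit is skipped when just I_DIRTY_SYNC is set.

```c
#include <stdbool.h>

/* Hypothetical flag values modeled on the inode dirty-state bits. */
#define FDS_I_DIRTY_SYNC     0x1u  /* inode dirty, e.g. timestamps only */
#define FDS_I_DIRTY_DATASYNC 0x2u  /* metadata needed to find the data is dirty */

/*
 * Decide whether a sync call must commit the journal transaction.
 * fsync (datasync == false) always commits; fdatasync may skip the
 * commit when FDS_I_DIRTY_DATASYNC is not set.
 */
static bool fds_needs_commit(unsigned int i_state, bool datasync)
{
    if (datasync && !(i_state & FDS_I_DIRTY_DATASYNC))
        return false;
    return true;
}
```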
renzhengeek commented 8 years ago

commit 94e41ecfe0f202df948fdbb19a53308a58cf2184 Author: Sunil Mushran sunil.mushran@oracle.com Date: Fri Jun 19 14:45:54 2009 -0700

ocfs2: Pin journal head before accessing jh->b_committed_data

This patch adds jbd_lock_bh_state() and jbd_unlock_bh_state() around accesses
to jh->b_committed_data.

Fixes oss bugzilla#1131
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1131

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit c795b33ba171e41563ab7e25105c0cd4edd81cd7 Author: Goldwyn Rodrigues rgoldwyn@gmail.com Date: Thu Aug 20 13:43:19 2009 -0500

ocfs2/dlm: Wait on lockres instead of erroring cancel requests

In case a downconvert is queued, and a flock receives a signal,
BUG_ON(lockres->l_action != OCFS2_AST_INVALID) is triggered
because a lock cancel triggers a dlmunlock while an AST is
scheduled.

To avoid this, allow a LKM_CANCEL to pass through, and let it
wait on __dlm_wait_on_lockres().

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de>
Acked-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit 8cb471e8f82506937fe5e2e9fb0bf90f6b1f1170 Author: Joel Becker joel.becker@oracle.com Date: Tue Feb 10 20:00:41 2009 -0800

ocfs2: Take the inode out of the metadata read/write paths.

We are really passing the inode into the ocfs2_read/write_blocks()
functions to get at the metadata cache.  This commit passes the cache
directly into the metadata block functions, divorcing them from the
inode.

Signed-off-by: Joel Becker <joel.becker@oracle.com>

commit 66fb345ddd2d343e36692da0ff66126d7a99dc1b Author: Joel Becker joel.becker@oracle.com Date: Thu Feb 12 15:24:40 2009 -0800

ocfs2: move ip_last_trans to struct ocfs2_caching_info

We have the read side of metadata caching isolated to struct
ocfs2_caching_info, now we need the write side.  This means the journal
functions.  The journal only does a couple of things with struct inode.

This change moves the ip_last_trans field onto struct
ocfs2_caching_info as ci_last_trans.  This field tells the journal
whether a pending journal flush is required.

Signed-off-by: Joel Becker <joel.becker@oracle.com>

commit 292dd27ec76b96cebcef576f330ab121f59ccf05 Author: Joel Becker joel.becker@oracle.com Date: Thu Feb 12 15:41:59 2009 -0800

ocfs2: move ip_created_trans to struct ocfs2_caching_info

Similar to ip_last_trans, ip_created_trans tracks the creation of a journal
managed inode.  This specifically tracks what transaction created the
inode.  This is so the code can know if the inode has ever been written
to disk.

This behavior is desirable for any journal managed object.  We move it
to struct ocfs2_caching_info as ci_created_trans so that any object
using ocfs2_caching_info can rely on this behavior.

Signed-off-by: Joel Becker <joel.becker@oracle.com>

commit 0cf2f7632b1789b811ab20b611c4156e6de2b055 Author: Joel Becker joel.becker@oracle.com Date: Thu Feb 12 16:41:25 2009 -0800

ocfs2: Pass struct ocfs2_caching_info to the journal functions.

The next step in divorcing metadata I/O management from struct inode is
to pass struct ocfs2_caching_info to the journal functions.  Thus the
journal locks a metadata cache with the cache io_lock function.  It also
can compare ci_last_trans and ci_created_trans directly.

This is a large patch because of all the places we change
ocfs2_journal_access..(handle, inode, ...) to
ocfs2_journal_access..(handle, INODE_CACHE(inode), ...).
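
The shape of this change can be illustrated with a trimmed user-space model. The structures below are hypothetical simplifications (the real `INODE_CACHE()` takes a `struct inode *`); the sketch only shows a journal-side helper that works on the embedded caching info and never sees the inode.

```c
/* Hypothetical, heavily trimmed versions of the structures involved. */
struct jc_caching_info {
    unsigned long ci_last_trans;     /* last transaction touching this cache */
    unsigned long ci_created_trans;  /* transaction that created the object */
};

struct jc_inode_info {
    unsigned long ip_blkno;
    struct jc_caching_info ip_metadata_cache;  /* embedded, not a pointer */
};

/* Counterpart of INODE_CACHE(inode) in this model: hand out the embedded cache. */
static struct jc_caching_info *jc_inode_cache(struct jc_inode_info *oi)
{
    return &oi->ip_metadata_cache;
}

/* A journal-side helper that no longer needs to know about inodes at all. */
static int jc_created_in(const struct jc_caching_info *ci, unsigned long trans_id)
{
    return ci->ci_created_trans == trans_id;
}
```

Because the helper takes only the caching info, any other object that embeds one (not just an inode) could use the same journal functions.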
renzhengeek commented 8 years ago

commit 918941a3f3d46c2a69971b4718aaf13b1be2f1a7 Author: Jan Kara jack@suse.cz Date: Mon Aug 17 18:50:08 2009 +0200

ocfs2: Use __generic_file_aio_write instead of generic_file_aio_write_nolock

Use the new helper. We have to submit data pages ourselves in case of O_SYNC
write because __generic_file_aio_write does not do it for us. OCFS2 developers
might think about moving the sync out of i_mutex which seems to be easily
possible but that's out of scope of this patch.

CC: ocfs2-devel@oss.oracle.com
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
renzhengeek commented 8 years ago

commit 8dec98edfe9684ce00b580a09dde3dcd21ee785b Author: Tao Ma tao.ma@oracle.com Date: Tue Aug 18 11:19:58 2009 +0800

ocfs2: Add new refcount tree lock resource in dlmglue.

The refcount tree lock resource is used to protect refcount
tree reads and writes among multiple nodes.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
renzhengeek commented 8 years ago

commit c732eb16bf07f9bfb7fa72b6868462471273bdbd Author: Tao Ma tao.ma@oracle.com Date: Tue Aug 18 11:21:00 2009 +0800

ocfs2: Add caching info for refcount tree.

The refcount tree should use its own caching info so that when
we downconvert the refcount tree lock, we can drop all the
cached buffer heads.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
renzhengeek commented 8 years ago

commit bf48aabb894fd0639ab72a26e8abbd7683ef23c2 Author: Uwe Kleine-König u.kleine-koenig@pengutronix.de Date: Wed Oct 28 20:11:03 2009 +0100

tree-wide: fix typos "offest" -> "offset"

This patch was generated by

    git grep -E -i -l 'offest' | xargs -r perl -p -i -e 's/offest/offset/'

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
renzhengeek commented 8 years ago

commit 71656fa6ec10473eb9b646c10a2173fdea2f83c9 Author: Sunil Mushran sunil.mushran@oracle.com Date: Mon Jan 25 16:57:39 2010 -0800

ocfs2/dlm: Ignore LVBs of locks in the Blocked list

During lock resource migration, o2dlm fills the packet with a LVB from the
first valid lock. For sanity, it ensures that the other valid locks have the
same LVB. If not, it BUGs.

The valid locks are ones that have granted EX or PR lock levels and are either
on the Granted or Converting lists. Locks in the Blocked list cannot have a
valid LVB.

This patch ensures that we skip the locks in the Blocked list.

Fixes oss bugzilla#1202
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1202

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
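
The selection rule in this commit message can be sketched as below. The enum values, `struct mig_lock`, and `mig_pick_lvb()` are hypothetical, and the LVB is reduced to a single byte; the sketch only shows that Blocked-list locks are skipped before the level check.

```c
#include <stddef.h>

/* Hypothetical lock levels and queues for illustration. */
enum mig_level { MIG_NL, MIG_PR, MIG_EX };
enum mig_queue { MIG_GRANTED, MIG_CONVERTING, MIG_BLOCKED };

struct mig_lock {
    enum mig_level granted_level;
    enum mig_queue queue;
    unsigned char lvb;   /* one-byte stand-in for the lock value block */
};

/*
 * Return the first lock whose LVB may be used for migration, or NULL.
 * A lock qualifies if it holds PR or EX and is not on the Blocked list.
 */
static const struct mig_lock *mig_pick_lvb(const struct mig_lock *locks, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (locks[i].queue == MIG_BLOCKED)
            continue;   /* cannot hold a valid LVB */
        if (locks[i].granted_level == MIG_PR ||
            locks[i].granted_level == MIG_EX)
            return &locks[i];
    }
    return NULL;
}
```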
renzhengeek commented 8 years ago

commit 0b94a909eb2e2f6990d05fd486a0cb4902ef1ae7 Author: Wengang Wang wen.gang.wang@oracle.com Date: Thu Jan 21 10:50:02 2010 -0800

ocfs2: Fix setting of OCFS2_LOCK_BLOCKED during bast

During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to be
downconverted.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit a19128260107f951d1b4c421cf98b92f8092b069 Author: Sunil Mushran sunil.mushran@oracle.com Date: Thu Jan 21 10:50:03 2010 -0800

ocfs2: Prevent a livelock in dlmglue

There is a possibility of a livelock in __ocfs2_cluster_lock(). If a node were
to get an ast for an upconvert request, followed immediately by a bast,
there is a small window where the fs may downconvert the lock before the
process requesting the upconvert is able to take the lock.

This patch adds a new flag to indicate that the upconvert is still in
progress and that the dc thread should not downconvert it right now.

Wengang Wang <wen.gang.wang@oracle.com> and Joel Becker
<joel.becker@oracle.com> contributed heavily to this patch.

Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
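
A minimal model of the new flag's effect is sketched below. The commit message does not name the flag, so `UPC_UPCONVERT_FINISHING`, the other identifiers, and the flag values are all hypothetical; real dlmglue manipulates these bits under the lockres spinlock.

```c
#include <stdbool.h>

#define UPC_BLOCKED              0x1u  /* a bast asked us to downconvert */
#define UPC_UPCONVERT_FINISHING  0x2u  /* hypothetical name for the new flag */

struct upc_lockres {
    unsigned int flags;
};

/* ast for a granted upconvert: the requester has not taken the lock yet. */
static void upc_ast_upconvert(struct upc_lockres *res)
{
    res->flags |= UPC_UPCONVERT_FINISHING;
}

/* bast: another node wants a conflicting level. */
static void upc_bast(struct upc_lockres *res)
{
    res->flags |= UPC_BLOCKED;
}

/* The requesting process has finally taken the lock it upconverted for. */
static void upc_holder_took_lock(struct upc_lockres *res)
{
    res->flags &= ~UPC_UPCONVERT_FINISHING;
}

/* dc thread: downconvert only once the upconvert has been consumed. */
static bool upc_may_downconvert(const struct upc_lockres *res)
{
    return (res->flags & UPC_BLOCKED) &&
           !(res->flags & UPC_UPCONVERT_FINISHING);
}
```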
renzhengeek commented 8 years ago

commit 0d74125a6a68d4f1969ecaf0b3543f315916ccdc Author: Sunil Mushran sunil.mushran@oracle.com Date: Fri Jan 29 09:44:11 2010 -0800

ocfs2: Do not downconvert if the lock level is already compatible

During upconvert, if the master were to send a BAST, dlmglue will detect the
upconversion in process and send a cancel convert to the master. Upon receiving
the AST for the cancel convert, it will re-process the lock resource to determine
whether it needs downconverting. Say, the up was from PR to EX and the BAST was
for EX. After the cancel convert, it will need to downconvert to NL.

However, if the node was originally upconverting from NL to EX, then there would
be no reason to downconvert (assuming the same message sequence).

This patch makes dlmglue consider the possibility that the current lock level
is already compatible and that downconverting is not required.

Joel Becker <joel.becker@oracle.com> assisted in fixing this issue.

Fixes ossbz#1178
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178

Reported-by: Coly Li <coly.li@suse.de>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
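
The compatibility check this commit describes can be sketched as follows. This is a simplified illustration, assuming only the three levels named in the message (NL, PR, EX); the enum values and function names are made up for the example and are not the dlmglue identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock levels ordered from weakest to strongest (illustrative values). */
enum demo_level { DEMO_NL = 0, DEMO_PR = 1, DEMO_EX = 2 };

/* Strongest level this node may keep while another node wants `blocking`. */
static enum demo_level demo_highest_compat_level(enum demo_level blocking)
{
    assert(blocking >= DEMO_NL && blocking <= DEMO_EX);
    switch (blocking) {
    case DEMO_EX:
        return DEMO_NL;   /* an EX request is compatible only with NL */
    case DEMO_PR:
        return DEMO_PR;   /* readers can share the resource */
    default:
        return DEMO_EX;   /* an NL request conflicts with nothing */
    }
}

/* A downconvert is needed only when the current level is stronger than
 * the strongest level compatible with the blocking request. */
static bool demo_needs_downconvert(enum demo_level current,
                                   enum demo_level blocking)
{
    return current > demo_highest_compat_level(blocking);
}
```

For the two cases in the commit message: after a cancelled PR to EX upconvert the node still holds PR, so a BAST for EX requires a downconvert to NL; after a cancelled NL to EX upconvert the node holds NL, which is already compatible, so nothing needs to be done.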

renzhengeek commented 8 years ago

commit 079b805782f94f4b278132286a8c9bc4655d1c51 Author: Sunil Mushran sunil.mushran@oracle.com Date: Wed Feb 3 10:16:54 2010 -0800

ocfs2: Plugs race between the dc thread and an unlock ast message

This patch plugs a race between the downconvert thread and an unlock ast message.
Specifically, after the downconvert worker has done its task, the dc thread needs
to check whether an unlock ast made the downconvert moot.

Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit 34a9dd7e29e9129fec40c645a03f1bbbe810e771 Author: Joel Becker joel.becker@oracle.com Date: Thu Jan 28 15:00:49 2010 -0800

ocfs2_dlmfs: Move to its own directory

We're going to remove the tie between ocfs2_dlmfs and o2dlm.
ocfs2_dlmfs doesn't belong in the fs/ocfs2/dlm directory anymore.  Here
we move it to fs/ocfs2/dlmfs.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit a796d2862aed8117acc9f470f3429a5ee852912e Author: Joel Becker joel.becker@oracle.com Date: Thu Jan 28 19:22:39 2010 -0800

ocfs2: Pass lksbs back from stackglue ast/bast functions.

The stackglue ast and bast functions tried to maintain the fiction that
their arguments were void pointers.  In reality, stack_user.c had to
know that the argument was an ocfs2_lock_res in order to get the status
off of the lksb.  That's ugly.

This changes stackglue to always pass the lksb as the argument to ast
and bast functions.  The caller can always use container_of() to get the
ocfs2_lock_res or user_dlm_lock_res.  The net effect to the caller is
zero.  They still get back the lockres in their ast.  stackglue gets
cleaner, and now can use the lksb itself.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
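
The container_of() pattern mentioned in this commit can be shown with a small userspace sketch. The struct names and the helper below are hypothetical stand-ins, not the ocfs2 definitions; the macro is a simplified version of the kernel's container_of() without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace container_of(): step back from a member pointer
 * to the structure that embeds it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a lock status block. */
struct demo_lksb {
    int status;
};

/* Hypothetical stand-in for a lock resource that embeds its lksb. */
struct demo_lock_res {
    int level;
    struct demo_lksb lksb;   /* embedded by value, not a pointer */
};

/* What an ast callback would do when handed only the lksb. */
static struct demo_lock_res *demo_lockres_from_lksb(struct demo_lksb *lksb)
{
    assert(lksb != NULL);
    return container_of(lksb, struct demo_lock_res, lksb);
}
```

Because the lksb is embedded in the lock resource, the callback can read the status directly from its argument and still recover the enclosing structure, which is why the commit describes the net effect on callers as zero.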
renzhengeek commented 8 years ago

commit c0e4133851ed94c73ee3d34a2f2a245fcd0a60a1 Author: Joel Becker joel.becker@oracle.com Date: Fri Jan 29 14:46:44 2010 -0800

ocfs2: Attach the connection to the lksb

We're going to want it in the ast functions, so we convert union
ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit e8fce482f3702c1ad27c97b26db5022aa1fa64c7 Author: Joel Becker joel.becker@oracle.com Date: Tue Feb 9 17:52:13 2010 -0800

ocfs2_dlmfs: Don't honor truncate.  The size of a dlmfs file is LVB_LEN

We want folks using dlmfs to be able to use the LVB in places other than
just write(2)/read(2).  By ignoring truncate requests, we allow 'echo
"contents" > /dlm/space/lockname' to work.
renzhengeek commented 8 years ago

commit bc9838c4d44a1713ab1bf24aa6675bc3a02b6a88 Author: Srinivas Eeda srinivas.eeda@oracle.com Date: Fri Feb 26 12:53:51 2010 -0800

dlm: allow dlm do recovery during shutdown

If a node down event happens while dlm shutdown is in progress, dlm recovery
should be done before dlm is shutdown.  We can't migrate unrecovered locks,
obviously.  But dlm_reco_thread only does recovery if the dlm_state is
in DLM_CTXT_JOINED.

dlm_reco_thread should do recovery if dlm_state is in DLM_CTXT_JOINED or
DLM_CTXT_IN_SHUTDOWN.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
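
The state check this commit widens can be sketched as below. The enum and its DEMO_ names only mirror the two states mentioned in the message plus two placeholder states; they are assumptions for illustration, not the o2dlm definitions.

```c
#include <stdbool.h>

/* Illustrative domain states; only JOINED and IN_SHUTDOWN come from the
 * commit message, the other two are placeholders. */
enum demo_dlm_state {
    DEMO_CTXT_NEW,
    DEMO_CTXT_JOINED,
    DEMO_CTXT_IN_SHUTDOWN,
    DEMO_CTXT_LEAVING
};

/* Before the change, recovery ran only in the JOINED state. The commit
 * also allows it while shutdown is in progress, so that locks owned by
 * a dead node are recovered before migration is attempted. */
static bool demo_should_run_recovery(enum demo_dlm_state state)
{
    return state == DEMO_CTXT_JOINED || state == DEMO_CTXT_IN_SHUTDOWN;
}
```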
renzhengeek commented 8 years ago

commit 14741472a05245ed5778aa0aec055e1f920b6ef8 Author: Srinivas Eeda srinivas.eeda@oracle.com Date: Mon Mar 22 16:50:47 2010 -0700

ocfs2: Fix a race in o2dlm lockres mastery

In o2dlm, the master of a lock resource keeps a map of all interested
nodes.  This prevents the master from purging the resource before an
interested node can create a lock.

A race between the mastery thread and the mastery handler allowed an
interested node to discover who the master is without informing the
master directly.  This is easily fixed by holding the dlm spinlock a
little longer in the mastery handler.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
renzhengeek commented 8 years ago

commit a36d515c7a2dfacebcf41729f6812dbc424ebcf0 Author: Joel Becker joel.becker@oracle.com Date: Fri Apr 23 15:24:59 2010 -0700

ocfs2_dlmfs: Fix math error when reading LVB.

When asked for a partial read of the LVB in a dlmfs file, we can
accidentally calculate a negative count.

Reported-by: Dan Carpenter <error27@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
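
A partial read of a fixed-size buffer such as the LVB has to clamp the length against the bytes that remain after the offset. The sketch below shows one safe way to do that; it is not the actual patch, and DEMO_LVB_LEN (assumed to be 64 here) and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed buffer size for the example only. */
#define DEMO_LVB_LEN 64

/* Number of bytes to copy for a read of `count` bytes starting at `pos`.
 * Checking pos first keeps the subtraction from ever going below zero. */
static size_t demo_lvb_read_len(size_t pos, size_t count)
{
    size_t remaining;

    if (pos >= DEMO_LVB_LEN)
        return 0;                      /* reading at or past the end */

    remaining = DEMO_LVB_LEN - pos;    /* always at least 1 here */
    assert(remaining <= DEMO_LVB_LEN);
    return count < remaining ? count : remaining;
}
```

For example, a 16-byte read at offset 60 yields 4 bytes, and a read at offset 64 or beyond yields 0 instead of a negative or wrapped-around count.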
renzhengeek commented 8 years ago

This commit may be the root cause:

commit e9dfc0b2bc42761410e8db6c252c6c5889e178b8
Author: Mark Fasheh <mark.fasheh@oracle.com>
Date:   Mon May 14 11:38:51 2007 -0700

    ocfs2: trylock in ocfs2_readpage()

    Similarly to the page lock / cluster lock inversion in ocfs2_readpage, we
    can deadlock on ip_alloc_sem. We can down_read_trylock() instead and just
    return AOP_TRUNCATED_PAGE if the operation fails.

    Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 8e7cafb..3030670 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -222,7 +222,10 @@ static int ocfs2_readpage(struct file *file, struct page *page)
                goto out;
        }

-       down_read(&OCFS2_I(inode)->ip_alloc_sem);
+       if (down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem) == 0) {
+               ret = AOP_TRUNCATED_PAGE;
+               goto out_meta_unlock;
+       }

        /*
         * i_size might have just been updated as we grabed the meta lock.  We
@@ -258,6 +261,7 @@ static int ocfs2_readpage(struct file *file, struct page *page)
        ocfs2_data_unlock(inode, 0);
 out_alloc:
        up_read(&OCFS2_I(inode)->ip_alloc_sem);
+out_meta_unlock:
        ocfs2_meta_unlock(inode, 0);
 out:
        if (unlock)