pmem / pmemfile

Userspace implementation of file APIs using persistent memory.
Other
34 stars 21 forks source link

Use readlock in lseek, instead of writelock #300

Closed GBuella closed 7 years ago

GBuella commented 7 years ago

This change is Reviewable

codecov-io commented 7 years ago

Codecov Report

Merging #300 into master will increase coverage by <.01%. The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #300      +/-   ##
==========================================
+ Coverage    79.3%   79.31%   +<.01%     
==========================================
  Files          81       81              
  Lines       12061    12061              
  Branches     1630     1628       -2     
==========================================
+ Hits         9565     9566       +1     
+ Misses       1890     1885       -5     
- Partials      606      610       +4
Impacted Files Coverage Δ
src/libpmemfile-posix/inode.h 100% <ø> (ø) :arrow_up:
src/libpmemfile-posix/read.c 93.91% <50%> (+1.11%) :arrow_up:
src/libpmemfile-posix/lseek.c 91.47% <66.66%> (+4.25%) :arrow_up:
src/libpmemfile-posix/inode.c 88.88% <76.92%> (-0.24%) :arrow_down:
src/libpmemfile/syscall_early_filter.c 60% <0%> (-20%) :arrow_down:
src/libpmemfile-posix/file.c 65.38% <0%> (-0.83%) :arrow_down:
src/libpmemfile-posix/dir.c 81.81% <0%> (-0.21%) :arrow_down:
src/libpmemfile-posix/data.c 92.24% <0%> (-0.03%) :arrow_down:
src/libpmemfile-posix/rename.c 91.53% <0%> (+0.03%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 15ad969...a343c48. Read the comment docs.

krzycz commented 7 years ago

Reviewed 3 of 4 files at r1. Review status: 3 of 4 files reviewed at latest revision, 1 unresolved discussion.


src/libpmemfile-posix/inode.c, line 663 at r1 (raw file):

 * tree to be valid, but do not modify the underlying file at all.
 *
 * If the block_tree doesn't exist we have to drop the lock we hold,

That puzzled me. Perhaps we need two locks per vinode?


Comments from Reviewable

krzycz commented 7 years ago

Reviewed 1 of 4 files at r1. Review status: all files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable

GBuella commented 7 years ago

Review status: all files reviewed at latest revision, 1 unresolved discussion.


src/libpmemfile-posix/inode.c, line 663 at r1 (raw file):

Previously, krzycz (Krzysztof Czurylo) wrote…
That puzzled me. Perhaps we need two locks per vinode?

If you mean a separate mutex inside each vinode called something like block_tree_rebuild_mutex, that actually sounds like a good idea. Marcin, what do you think?


Comments from Reviewable

marcinslusarz commented 7 years ago
:lgtm:

Reviewed 4 of 4 files at r1. Review status: all files reviewed at latest revision, 1 unresolved discussion.


src/libpmemfile-posix/inode.c, line 663 at r1 (raw file):

Previously, GBuella (Gabor Buella) wrote…
If you mean a separate mutex inside each vinode called something like `block_tree_rebuild_mutex`, that actually sounds like a good idea. Marcin, what do you think?

It's not only about tree rebuild. Block tree needs to be consistent with blocks, so we need the same lock as is taken when writing (actually deallocating / allocating / initializing blocks). BTW, if we could fast & safely say whether metadata changed, very similar algo could be applied to overwriting.


Comments from Reviewable

GBuella commented 7 years ago

Review status: all files reviewed at latest revision, 1 unresolved discussion.


src/libpmemfile-posix/inode.c, line 663 at r1 (raw file):

we need the same lock as is taken when writing We would still hold the same lock as well -- the read-write-lock of the vinode, locked for reading. Noone should be able to do modifications while that.

Anyways, we expect it to be a very rare event to when the block tree is invalid, right? Only on first access, and after aborted transactions I guess. So this shouldn't matter much.


Comments from Reviewable

marcinslusarz commented 7 years ago

src/libpmemfile-posix/inode.c, line 663 at r1 (raw file):

Previously, GBuella (Gabor Buella) wrote…
> we need the same lock as is taken when writing We would still hold the same lock as well -- the read-write-lock of the vinode, locked for reading. Noone should be able to do modifications while that. Anyways, we expect it to be a very rare event to when the block tree is invalid, right? Only on first access, and after aborted transactions I guess. So this shouldn't matter much.

Yeah, that's a good idea (btw, you put first 2 sentences into the quote, which looks like I wrote that). The lock would only be needed for rebuilding, so we would have to be careful to update the pointer only when tree is fully populated.

It's true it's rare event now, but it may become more frequent with process switching.


Comments from Reviewable

sarahjelinek commented 7 years ago

Reviewed 4 of 4 files at r1. Review status: all files reviewed at latest revision, 3 unresolved discussions.


src/libpmemfile-posix/inode.c, line 678 at r1 (raw file):

{
  os_rwlock_rdlock(&vinode->rwlock);
  while (!vinode->blocks) {

Why do you need a 'while' loop and then the subsequent if (!vnode->blocks)? I assume in case another thread gets in between dropping the read lock and taking the write lock and rebuilds the tree?

Also, if the vinode->blocks != NULL then are we guaranteed at this point that this block cache is up to date? I assume so since we don't check for any other flag.


src/libpmemfile-posix/lseek.c, line 137 at r1 (raw file):

      return -ENXIO;
  }

Add a comment here saying that if the vinode_rdlock_xxx succeeds the read lock will be held.


Comments from Reviewable

GBuella commented 7 years ago

Review status: all files reviewed at latest revision, 3 unresolved discussions.


src/libpmemfile-posix/inode.c, line 678 at r1 (raw file):

Previously, sarahjelinek (Sarah Jelinek) wrote…
Why do you need a 'while' loop and then the subsequent if (!vnode->blocks)? I assume in case another thread gets in between dropping the read lock and taking the write lock and rebuilds the tree? Also, if the vinode->blocks != NULL then are we guaranteed at this point that this block cache is up to date? I assume so since we don't check for any other flag.

I relocated this code section from read.c, originally written by Marcin.

See original discussion in the initial PR: https://github.com/pmem/pmemfile/pull/95#discussion_r114783010

Well, I realize reading that does help a lot. Yes, the while loop is needed, since some other might sneak in between the unlock + lock here.

Also, anywhere in libpmemfile-posix, the vinode->blocks pointer (and whatever it points to ) can only be modified while holding the vinode lock . Anyone anywhere holding that lock must leave that vinode->blocks pointer either in a valid state (pointing to a complete tree), or leave it as NULL before releasing the lock. Therefore, when we acquire the lock here, and see that the vinode->blocks pointer is not NULL, we assume things are OK.


Comments from Reviewable

GBuella commented 7 years ago

Review status: 2 of 4 files reviewed at latest revision, 2 unresolved discussions.


src/libpmemfile-posix/lseek.c, line 137 at r1 (raw file):

Previously, sarahjelinek (Sarah Jelinek) wrote…
Add a comment here saying that if the vinode_rdlock_xxx succeeds the read lock will be held.

Done.


Comments from Reviewable

sarahjelinek commented 7 years ago
:lgtm:

Reviewed 2 of 2 files at r2. Review status: all files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable

marcinslusarz commented 7 years ago

Reviewed 2 of 2 files at r2. Review status: all files reviewed at latest revision, 1 unresolved discussion.


Comments from Reviewable