NVSL / linux-nova

NOVA is a log-structured file system designed for byte-addressable non-volatile memories, developed at the University of California, San Diego.
http://nvsl.ucsd.edu/index.php?path=projects/nova

Remove unneeded fields from struct nova_inode #55

Open stevenjswanson opened 6 years ago

stevenjswanson commented 6 years ago

We no longer do in-place updates of inode metadata, so we don't need most of it in struct nova_inode.

The exception is atime, because doing a log append on every read is a terrible idea.
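
A minimal sketch of what a slimmed-down persistent inode could look like, assuming only the log pointers and atime stay in place. The names `slim_inode` and `slim_touch_atime` and the field layout are hypothetical, not the actual struct nova_inode:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slimmed-down persistent inode; NOT the real nova_inode layout. */
struct slim_inode {
    uint64_t log_head;  /* first page of this inode's log */
    uint64_t log_tail;  /* next append position in the log */
    uint32_t atime;     /* seconds; the one attribute still updated in place */
    uint32_t pad;       /* keeps the struct a multiple of 8 bytes */
};

/* Assumes a 64-byte cache line. */
static_assert(sizeof(struct slim_inode) <= 64, "inode must fit in one cache line");

/* In-place atime update: one aligned 32-bit store, no log append. */
static void slim_touch_atime(struct slim_inode *pi, uint32_t now_sec)
{
    pi->atime = now_sec;
    /* A real persistent-memory file system would also flush this cache line. */
}
```

Every other attribute would have to come from the log in this sketch.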

Potential benefits include

Here's an email thread discussing the options

On Sep 13, 2017, at 2:45 PM, Andiry Xu <jix024@eng.ucsd.edu> wrote:

Then you have to store them somewhere - for example, at the head of
the log. That means you need at least one log page for each inode.
Currently a file inode can have no log, as long as you don't do any
operation to it.

Thanks,
Andiry

On Wed, Sep 13, 2017 at 2:42 PM, Steven Swanson <swanson@eng.ucsd.edu> wrote:
But why store any of them in the inode?  We will have to scan the log to populate VFS’s struct inode in any case, so why not just do that for all of them?

Dividing them into groups will just make it more complicated.

-steve
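
A rough sketch of the "scan the log for everything" option, assuming each log entry carries a full set of attributes. `attr_entry`, `mem_inode`, and `rebuild_from_log` are hypothetical names, and `mem_inode` is only a stand-in for the relevant fields of VFS's struct inode:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log entry carrying a complete attribute update. */
struct attr_entry {
    uint16_t mode;
    uint32_t uid;
    uint32_t gid;
    uint32_t mtime;
    uint32_t ctime;
    uint64_t size;
};

/* Stand-in for the attribute fields of VFS's struct inode. */
struct mem_inode {
    uint16_t mode;
    uint32_t uid, gid, mtime, ctime;
    uint64_t size;
    int valid;  /* 0 if the log held no attribute entry at all */
};

/* Replay the log oldest-first; the newest entry wins for every field. */
static void rebuild_from_log(const struct attr_entry *log, size_t n,
                             struct mem_inode *out)
{
    assert(out != NULL);
    out->valid = 0;
    for (size_t i = 0; i < n; i++) {
        out->mode  = log[i].mode;
        out->uid   = log[i].uid;
        out->gid   = log[i].gid;
        out->mtime = log[i].mtime;
        out->ctime = log[i].ctime;
        out->size  = log[i].size;
        out->valid = 1;
    }
}
```

With an empty log, `valid` stays 0: under this scheme an inode needs at least one log entry before its attributes can be recovered.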

On Sep 13, 2017, at 2:37 PM, Andiry Xu <jix024@eng.ucsd.edu> wrote:

Thinking about the "unused" fields again, it seems removing them from the
inode struct is not that simple.

These fields are necessary to describe the inode, and they must be
stored somewhere, either in the inode or in the log.

I think we should distinguish the fields as "frequently updated" and
"infrequently updated". Those that are not frequently updated
(uid, gid, inode number, etc.) are stored in the inode; the others go in
the log. The idea is to avoid storing the same field in two different places.

Thanks,
Andiry
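
A sketch of the frequently/infrequently updated split, under stated assumptions: uid, gid, and inode number come from the message above, while treating size, mtime, and ctime as the "frequent" group is an assumption. All struct and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Infrequently updated fields: stored once, in the persistent inode. */
struct inode_static {
    uint64_t ino;
    uint32_t uid;
    uint32_t gid;
    uint64_t log_head;
    uint64_t log_tail;
};

/* Frequently updated fields: stored only in log entries (assumed set). */
struct log_dynamic {
    uint64_t size;
    uint32_t mtime;
    uint32_t ctime;
};

/* In-memory view built from both sources; no field lives in both places. */
struct inode_view {
    uint64_t ino;
    uint32_t uid, gid;
    uint64_t size;
    uint32_t mtime, ctime;
};

/* latest may be NULL for an inode that has no log yet. */
static struct inode_view assemble(const struct inode_static *s,
                                  const struct log_dynamic *latest)
{
    assert(s != NULL);
    struct inode_view v = { .ino = s->ino, .uid = s->uid, .gid = s->gid };
    if (latest != NULL) {
        v.size  = latest->size;
        v.mtime = latest->mtime;
        v.ctime = latest->ctime;
    }
    return v;
}
```

The NULL case shows how this split lets an untouched inode exist without a log page, at the cost of maintaining two groups of fields.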

On Wed, Sep 13, 2017 at 2:32 PM, Andiry Xu <jix024@eng.ucsd.edu> wrote:
That is a problem. The in-place atime update was implemented in NOVA
when there was no snapshot support. Now the atime update is a problem
with snapshots enabled.

However, we also don't want to append the log each time to update the
atime - that will hurt performance for reads.

Thanks,
Andiry

On Wed, Sep 13, 2017 at 2:21 PM, Steven Swanson <swanson@eng.ucsd.edu> wrote:
This is worth thinking about for at least this reason:

All our times are currently stored at second granularity, which is probably too coarse.  I didn’t increase the accuracy because it’d make the inode take another cache line.

What does updating atime in place mean about atime in snapshots?

-steve
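
A back-of-the-envelope sketch of the granularity trade-off, assuming three timestamps (atime, mtime, ctime), 32-bit seconds, and a 64-byte cache line. The struct names are hypothetical and the real nova_inode size is not taken from the source:

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_BYTES 64  /* assumption: typical x86-64 cache line */

/* Second granularity: one 32-bit counter per timestamp. */
struct times_sec {
    uint32_t atime, mtime, ctime;
};

/* Nanosecond granularity: seconds plus a 32-bit nanosecond part. */
struct ts32 {
    uint32_t sec;
    uint32_t nsec;
};

struct times_nsec {
    struct ts32 atime, mtime, ctime;
};

static_assert(sizeof(struct times_sec) == 12, "3 x 4 bytes");
static_assert(sizeof(struct times_nsec) == 24, "3 x 8 bytes");

/* Number of cache lines needed to hold a structure of the given size. */
static unsigned cachelines_needed(unsigned bytes)
{
    return (bytes + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}
```

If an inode already fills its cache lines exactly, the 12 extra bytes for finer timestamps are enough to spill into one more line.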

On Sep 13, 2017, at 2:05 PM, Andiry Xu <jix024@eng.ucsd.edu> wrote:

Some reasons:

1. It was based on the PMFS inode structure; I only added fields to the
struct and did not delete any. For sure we don't need some of the fields.

2. Some metadata is updated in place, such as atime. I don't know of
other fields that are updated in place, apart from the log tail/head.

Thanks,
Andiry

On Wed, Sep 13, 2017 at 1:47 PM, Steven Swanson <swanson@eng.ucsd.edu> wrote:
Why do we keep all the metadata in nova_inode?    Since the latest values are often in the log, why do we also have a copy in the inode itself?

I guess it makes sense for the in-place metadata updates, but do we do in-place metadata updates?

-steve