mar-file-system / marfs

MarFS provides a scalable near-POSIX file system by using one or more POSIX file systems as a scalable metadata component and one or more data stores (object, file, etc) as a scalable data component.

Two senses of marfs "version" #35

Open jti-lanl opened 8 years ago

jti-lanl commented 8 years ago

There are two senses in which we have configuration versions:

(1) config-file parsing. In this case the config-file has a version, and the config-reader could compare that with some hardwired #defines, to ensure that it is competent to perform the parse.

(2) xattr parsing. Any objects written by a given version of the software are stamped with the "version" that was in effect when they were written. This affects both the xattr parsers and the xattr writers (e.g. str_2_pre() and pre_2_str(), respectively).

But these are two different things. Additions to the config-file structure may require changes in the config-reader, so maybe it makes sense that there would be #defines that identify the SW version, which the reader would compare with the config file. But the same is true of the xattr parser/writer. And the two can change independently, though they are related.

So, what do we really care about? We really care about xattrs on files. The SW should stamp the xattrs with a SW version-number, so that we can know how they were written, so we can know how to read them. It may also tell us something about how chunk-info is formatted inside MD files, or about how recovery-info is formatted in objects.
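To make the stamping idea concrete, here is a minimal sketch of a writer that prefixes an xattr value with the format version and a reader that recovers it. The `ver.<major>_<minor>/` prefix, the `FORMAT_VERSION_*` defines, and the function names `stamp_xattr()` and `read_stamp()` are all hypothetical, chosen for illustration; they are not the layout or the API that str_2_pre() and pre_2_str() actually use.

```c
#include <stdio.h>

/* Hypothetical format version; not the real MarFS defines. */
#define FORMAT_VERSION_MAJOR 1
#define FORMAT_VERSION_MINOR 0

/* Writer: prefix the xattr value with the version in effect.
 * Returns 0 on success, -1 if the buffer is too small. */
static int stamp_xattr(char *buf, size_t size, const char *payload)
{
    int n = snprintf(buf, size, "ver.%d_%d/%s",
                     FORMAT_VERSION_MAJOR, FORMAT_VERSION_MINOR, payload);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Reader: recover the stamped version and find where the payload starts.
 * Returns 0 on success, -1 if the value carries no recognizable stamp. */
static int read_stamp(const char *value, int *major, int *minor,
                      const char **payload)
{
    int consumed = 0;

    /* %n is only reached if both numbers and the '/' matched. */
    if (sscanf(value, "ver.%d_%d/%n", major, minor, &consumed) != 2
        || consumed == 0)
        return -1;

    *payload = value + consumed;
    return 0;
}
```

The reader gets the version before it touches the payload, so it can pick the right parsing rules (or refuse) based on how the value was written.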

The "version" in the config file is an independent thing, which could let the config-reader know something about how to read it. Maybe it's not needed? The config reader is just supposed to be flexible? But if you have a newer config-file, don't you want old software to realize that it is out of its depth?

I propose that the version in the config-file pertains to the config-reader only, and that the version written into xattrs is a different thing. Thus, we have two sets of #defines. One tells us about what kinds of configuration-versions we can handle, and changes whenever the configuration-reader changes. The other goes into objects, and tells us how we have to read them and their metadata.
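A rough sketch of what the two independent sets of #defines might look like. The macro names, the version values, and the helper `config_reader_can_parse()` are assumptions made for illustration. The acceptance policy (accept anything not newer than the reader) is also just one possible choice.

```c
#include <stdbool.h>

/* Set 1 (hypothetical values): the newest config-file version this
 * reader understands. Changes whenever the config-reader changes. */
#define CONFIG_VERSION_MAJOR 1
#define CONFIG_VERSION_MINOR 2

/* Set 2 (hypothetical values): the format version stamped into xattrs
 * and objects. Changes whenever the stored format changes, and is
 * independent of Set 1. */
#define FORMAT_VERSION_MAJOR 1
#define FORMAT_VERSION_MINOR 0

/* Assumed policy: the reader accepts any config-file version that is
 * not newer than its own, and refuses newer ones so that old software
 * realizes it is out of its depth. */
static bool config_reader_can_parse(int major, int minor)
{
    if (major != CONFIG_VERSION_MAJOR)
        return major < CONFIG_VERSION_MAJOR;
    return minor <= CONFIG_VERSION_MINOR;
}
```

With these values, a config file declaring version 1.3 or 2.0 would be rejected, while 1.0 through 1.2 would be parsed. Nothing in the check consults the `FORMAT_VERSION_*` defines, which is the point of keeping the two sets separate.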

brettkettering commented 8 years ago

It sounds fine to have the MarFS code read the configuration file and look at its version. That would allow it to decide whether it can parse the file or not.

Where do you stash the version for the xattrs?

What are the actions here? We need to create issues on which people can take action.

jti-lanl commented 8 years ago

We currently have marfs version 1.0, which is encoded in object-IDs, POST xattrs, recovery-info, etc. It should be understood to refer to everything that has a format that someone might have to read. If we ever change something that affects any of that (e.g. new way of packing files), we should increment the version. (Minor versions imply backward-compatibility within a given major version, and new major versions are incompatible in some way with older major versions.)
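The major/minor rule could be checked along these lines. This is a sketch only: the `format_version` struct and `can_read()` are hypothetical names, and refusing a minor version newer than the software's own is an assumption, since the rule above only promises backward-compatibility.

```c
#include <stdbool.h>

/* Hypothetical representation of a stamped format version. */
typedef struct {
    int major;
    int minor;
} format_version;

/* Can software at version `sw` read something stamped with `stamped`?
 *  - a different major version is treated as incompatible;
 *  - within a major version, older or equal minors are readable;
 *  - a newer minor is refused (an assumption, see above). */
static bool can_read(format_version sw, format_version stamped)
{
    return stamped.major == sw.major && stamped.minor <= sw.minor;
}
```

For example, software at 1.2 would read objects stamped 1.0 or 1.2, but not ones stamped 1.3 or 2.0.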

This should probably be considered unrelated to the "MarFS version" at github. Or maybe it's confusing if they can be different? If we want them to be the same, we don't have to tag a new github version every time we tweak the format-version, but whenever we do tag a github version we'd use the current format-version (which would have to be incremented, if it hadn't changed otherwise).