openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

We probably should not upgrade pools if there are incompatible filesystems #2616

Closed: dswartz closed this 10 years ago

dswartz commented 10 years ago

Someone on a forum I hang out on had a Solaris 11.1 pool (Solaris 11.1 supports filesystem version 6). Long story short: he booted OmniOS and got a warning about the pool being out of date (since Solaris 11.1 doesn't support pool version 5000). So he did 'zpool upgrade -a' and it said it succeeded. The filesystem in question was then marked as unmountable, since it was an unsupported filesystem version (6). I understand this project is not OmniOS, but I would be surprised if we don't have the same landmine waiting to be stepped on. I don't know how feasible it is to scan the list of filesystems on the pool we are about to upgrade and bail out if any of them are an unsupported version.

FransUrbo commented 10 years ago

So he

  1. had a Solaris 11.1 pool
  2. booted into OmniOS, imported the pool and mounted all (even the v6 ones?) filesystems
  3. upgraded the pool to version 5000

and the v6 filesystem(s) was then marked as unmountable in OmniOS? Or on Solaris?

Why would the pool be mountable before pool version 5000, but not afterwards!?

dswartz commented 10 years ago

So he

  1. had a Solaris 11.1 pool
  2. booted into OmniOS, imported the pool and mounted all (even the v6 ones?) filesystems
  3. upgraded the pool to version 5000

and the v6 filesystem(s) was then marked as unmountable in OmniOS? Or on Solaris?

Why would the pool be mountable before pool version 5000, but not afterwards!?

Good point. I assumed he must have done the upgrade before trying to import the pool, but I've tried that under OmniOS and a pool that isn't imported isn't available as an upgrade target. I'm going to get clarification... The issue is that only Solaris supports fs=6, so I don't see how he could have mounted the pool to upgrade it...

dswartz commented 10 years ago

Here is what he did:

Solaris 11.1:

    zpool create -o version=28 -O version=5 pool raidz2 disk1 disk2 disk3 etc
    zfs create pool/I_AM_VERSION_6

Remove disks from Solaris 11.1, and put them in OmniOS:

    zpool import pool

(Pool successfully imports. pool/I_AM_VERSION_6 doesn't. OmniOS tells me to upgrade.)

    zpool upgrade pool

pool is now filesystem version 5, pool version 5000. pool/I_AM_VERSION_6 is still filesystem version 6.

You can no longer mount pool on Solaris 11.1 because of the ZFS version mismatch. You can not mount pool/I_AM_VERSION_6 on OmniOS because of the ZFS filesystem mismatch. Your data still exists, but you can't get to it.

FransUrbo commented 10 years ago

You can no longer mount pool on Solaris 11.1 because of the ZFS version mismatch.

I'm assuming that should read 'import pool'... And I doubt there's anything we can do about that. ZFS can always import a pool with a lower version, but not a higher one (5000 is higher than 30 or whatever Solaris is at now).

So until Oracle adds support for importing a pool with version 5000, this won't change. Meaning, this is not a problem at our end; it's in Oracle's hands... Good luck getting THAT fixed! :)

You can not mount pool/I_AM_VERSION_6 on OmniOS because of the ZFS filesystem mismatch.

Until we get/add support for version 6 filesystems, there's not much we can do about that either...

But you might be right about the "don't allow upgrade if any fs is v6" part though. It should AT LEAST be documented that this might be a bad idea.

On the other hand, from what I understand the filesystem was inaccessible both before and after the upgrade, so the upgrade ONLY ruined dual boot into Solaris, and that was expected; I think it's even documented in the man page that this would happen. It's a well-known fact at least...

It is "well known" that IF you intend to import the pool on different implementations of ZFS, you need to think very, very carefully about what you create and how.

So in my opinion this issue should be closed - there's not much we can do about it, and the upgrade didn't change the fact that the filesystem(s) in question were inaccessible anyway.

dswartz commented 10 years ago

You can no longer mount pool on Solaris 11.1 because of the ZFS version mismatch.

I'm assuming that should read 'import pool'... And I doubt there's anything we can do about that. ZFS can always import a pool with a lower version, but not a higher one (5000 is higher than 30 or whatever Solaris is at now).

I assume that's what he meant. Keep in mind I was quoting him literally here. I agree about 'not anything we can do here' (nor was I intending to imply otherwise...)

So until Oracle adds support for importing a pool with version 5000, this won't change. Meaning, this is not a problem at our end; it's in Oracle's hands... Good luck getting THAT fixed! :)

Agreed.

You can not mount pool/I_AM_VERSION_6 on OmniOS because of the ZFS filesystem mismatch.

Until we get/add support for version 6 filesystems, there's not much we can do about that either...

Agreed.

But you might be right about the "don't allow upgrade if any fs is v6" part though. It should AT LEAST be documented that this might be a bad idea.

On the other hand, from what I understand the filesystem was inaccessible both before and after the upgrade, so the upgrade ONLY ruined dual boot into Solaris, and that was expected; I think it's even documented in the man page that this would happen. It's a well-known fact at least...

It is "well known" that IF you intend to import the pool on different implementations of ZFS, you need to think very, very carefully about what you create and how.

So in my opinion this issue should be closed - there's not much we can do about it, and the upgrade didn't change the fact that the filesystem(s) in question were inaccessible anyway.

Well, here I disagree. I know it's unavoidable that you can move a version X pool to another platform and upgrade it to a version that won't work on the original host. My point in this case is that we have a smoking gun here - we can see that there is at least one filesystem whose version we do not support, and maybe we should not allow a pool upgrade (or at least require a '-f' option to 'zpool upgrade' to proceed, and explain why we are requiring it). I am perfectly happy with this being a feature request as opposed to a bug per se. I originally wanted to file it as such, but apparently do not have permissions to do so.

dswartz commented 10 years ago

So in my opinion this issue should be closed - there's not much we can do about it, and the upgrade didn't change the fact that the filesystem(s) in question were inaccessible anyway.

To be a little clearer here: the filesystem was inaccessible under ZoL, not under Solaris 11.1 - we changed things such that it became inaccessible there too (rendering his data inaccessible). We can't know for sure that the presence of one or more filesystems with versions we don't support implies the situation that was extant here, but it is sure suspicious (and maybe worthy of a '-f' on the import...)

FransUrbo commented 10 years ago

On 21 Aug 2014, at 15:19, dswartz notifications@github.com wrote:

I know it's unavoidable that you can move a version X pool to another platform and upgrade it to a version that won't work on the original host.

The whole point is that the pool would be inaccessible whether there's a v6 FS there or not...

IF you choose to upgrade (or enable a feature), the pool WILL be inaccessible! Period.

This is partly true between OpenZFS and ZoL as well - enable a feature that the other doesn't support and, presto, unimportable!

The fact that there's a v6 FS there is irrelevant.

IF you instead relabeled the issue to

"ZoL does not support v6 filesystems"

then I'm all for it. But as it is now, I think the issue is moot and pointless - upgrading a pool will render the pool unimportable on Solaris. Period.

And it's not our 'fault' (out of our control). So "won't fix" gets my vote...

dswartz commented 10 years ago

I'm not sure how to explain it any more clearly. This falls in the category of 'do no harm'. If someone types 'zpool upgrade tank', and the upgrade code does a quick scan of the filesystems on that pool and sees that there are one or more filesystems not supported by the running code, it's very likely NOT a good idea to upgrade the zpool version (unless, as I said before, the '-f' flag is specified). If the upgrade is going from a pre-features version (< 5000) to a features version, and we see one or more filesystems with a version we don't understand, there is a high likelihood we will be screwing the user by doing so. It may be deemed by Brian or someone else that this is a feature we don't have the time or effort to justify implementing, but I disagree that it is moot and pointless.

FransUrbo commented 10 years ago

I'm not sure how to explain it any more clearly.

I could say the same... Where does the user's responsibility to question [what's said to them], read [the documentation], and generally find out whether what they're doing (or about to do) is a good idea end, and where does our responsibility to protect them from their own stupidity begin?

If someone is stupid and uneducated enough to do what they're told, without questioning or protesting, please give them my mail address so I can tell them to go jump off a tall building...

It is not our responsibility to check what they type. If they 'say' "upgrade pool", then darn it, that is what we should do! No more, no less.

There are several things that would make an upgrade a bad idea/inadvisable (unsupported filesystem versions and features being the most obvious). We can't/shouldn't check them all...

dswartz commented 10 years ago

We obviously have very different philosophies of what well-behaved software should do. I've never argued for protecting against every possible mishap, but people DO make honest mistakes, and rendering their data totally inaccessible in a (hopefully) easily diagnosable case is not very user friendly, IMO. Obviously we are not going to come to an agreement here, so I think this needs to be punted to Brian or someone else who can decide. Again, I'm not claiming this MUST be done - just that 'it would be nice'. If it's not as easy to do as I had hoped, Brian can say 'no thanks'.

P.S. Here is an example of a check that ZFS does do (at least it does so with OmniOS):

If you are adding a cache or log device and you fat-finger it and forget the keyword, you would add a single-device vdev to the pool, creating a single point of failure. This cannot be undone - ouch. If you try this, you get the following warning:

      pool: tank
     state: ONLINE
      scan: none requested
    config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        c5t1d0  ONLINE       0     0     0
        c5t2d0  ONLINE       0     0     0

    root@omnios-ha1:~# zpool add tank c5t3d0
    invalid vdev specification
    use '-f' to override the following errors:
    mismatched replication level: pool uses mirror and new vdev is disk

and if I force it:

      pool: tank
     state: ONLINE
      scan: none requested
    config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        c5t1d0  ONLINE       0     0     0
        c5t2d0  ONLINE       0     0     0
      c5t3d0    ONLINE       0     0     0

I don't see a fundamental difference between the goof-proof check above and what I am asking for (unless, again, it's not feasible or is too difficult and/or time consuming...)

FransUrbo commented 10 years ago

We obviously have very different philosophies of what well-behaved software should do.

We obviously also have different understandings of the word 'well-behaved'. As I see it, there's a difference between 'well-behaved' (does what it's supposed to, as opposed to, for example, creating a pool when asked to upgrade one - not very well behaved) and 'overprotective' (tries to protect the user so he/she stops thinking before he/she types).

When doing something possibly dangerous, I ALWAYS look at the command line, and if it looks the least bit "weird" or "that might not be right", I double-check with the man page. And if that's dubious, then I ask. BEFORE I press enter...

Sudo [used to?] have a reminder when used: "think before you type". Words to live by.

but people DO make honest mistakes

Then I would argue that we need a way to disable and/or deactivate a feature. And/or to downgrade a pool version (all of which might actually be practically impossible - but deactivating an unused feature is, I think, the least of what we need).

With such a feature(s), most of what you're asking will be taken care of.

I don't see a fundamental difference between the goof-proof check above and what I am asking for (unless, again, it's not feasible or is too difficult and/or time consuming...)

Thing is, if we open that door, we need to go through it and bring feature checks with us (which have a much higher risk of being a problem). And that means that someone needs to maintain a "compatibility chart". Not to mention going through possibly thousands of filesystems... This will (possibly) require a lot of memory, but mostly time.

lundman commented 10 years ago

That we have a pool Solaris can use, but not OpenZFS, is expected. That we can upgrade it into OpenZFS, and be incompatible with Solaris, is expected.

That it then becomes unusable by every ZFS implementation that exists, is not acceptable.

olw2005 commented 10 years ago

Two schools of thought here: the nanny state vs. laissez-faire. Who's to say which is correct?

But to be fair, a warning of some sort probably came up about the incompatibility (fs=6) of pool/I_AM_VERSION_6 when the pool was imported. [At least it sounds like it based on the above description.]

If so, the warning bells should have been going off at that point. Time to step back, do a little reading/investigation. "Look before you leap." If a user instead chooses to ignore that and proceeds to shoot themselves in the foot... *wince* Hmm. That's unfortunate. Did you back up your pool first?

My $0.02.

dswartz commented 10 years ago

Two schools of thought here: the nanny state vs. laissez-faire. Who's to say which is correct?

I see your point, but this is a bit of a false dichotomy, IMO - there is a whole spectrum of self-inflicted injury, ranging from completely blameless to 'what the **** were you thinking of????' :)

But to be fair, a warning of some sort probably came up about the incompatibility (fs=6) of pool/I_AM_VERSION_6 when the pool was imported. [At least it sounds like it based on the above description.]

No idea...

If so, the warning bells should have been going off at that point. Time to step back, do a little reading/investigation. "Look before you leap." If a user instead chooses to ignore that and proceeds to shoot themselves in the foot... *wince* Hmm. That's unfortunate. Did you back up your pool first?

No idea, keep in mind (as I said earlier), I'm not the individual in question - just passing this along for him.

dswartz commented 10 years ago

Okay, so I looked into why the person upgraded the pool. I'm inclined to think the wording of a certain message helped him shoot himself in the foot. This is the message that comes out if you try to mount a v6 filesystem on a v28 pool on ZoL:

"cannot mount 'pool/storage': Can't mount a version 6 file system on a version 28 pool. Pool must be upgraded to mount this file system."

So, he did upgrade the pool. Oops. Looking at the check in zfs_vfsops.c, it compares the version of the filesystem in question with the maximum filesystem version that the pool version supports. Nowhere does it check whether the current implementation can handle that filesystem version. There should probably be another check just before it, testing whether the filesystem version is greater than the maximum filesystem version this zfs module supports?
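
To make the proposed ordering concrete, here is a minimal, standalone C sketch of the two checks. It is a toy model, not the actual zfs_vfsops.c code: zpl_version_map() stands in for the kernel's zfs_zpl_version_map(), ZPL_VERSION_MAX stands in for the highest filesystem version the module supports (5 for ZoL at the time), and the threshold values are approximations of the real table.

    #include <stdio.h>

    /* Stand-in for ZoL's ZPL_VERSION: the highest filesystem (ZPL)
     * version this implementation supports (5 at the time). */
    #define ZPL_VERSION_MAX 5

    /* Rough model of zfs_zpl_version_map(): the highest ZPL version
     * a given pool (SPA) version can carry.  Thresholds approximate
     * the real table in zfs_znode.c. */
    static int
    zpl_version_map(int spa_version)
    {
        if (spa_version >= 24)          /* SPA_VERSION_SA */
            return (5);
        if (spa_version >= 15)          /* SPA_VERSION_USERSPACE */
            return (4);
        if (spa_version >= 9)           /* SPA_VERSION_FUID */
            return (3);
        if (spa_version >= 1)
            return (2);
        return (-1);
    }

    /* The proposed ordering: reject unsupported filesystem versions
     * outright BEFORE ever suggesting a pool upgrade. */
    static void
    try_mount(int zpl_version, int spa_version)
    {
        if (zpl_version > ZPL_VERSION_MAX) {
            /* New check: no pool upgrade can help here. */
            (void) printf("Version %d file system is not supported.\n",
                zpl_version);
        } else if (zpl_version > zpl_version_map(spa_version)) {
            /* Existing check: an upgrade really would help. */
            (void) printf("Can't mount a version %d file system on a "
                "version %d pool. Pool must be upgraded.\n",
                zpl_version, spa_version);
        } else {
            (void) printf("mount ok (fs v%d, pool v%d)\n",
                zpl_version, spa_version);
        }
    }

    int
    main(void)
    {
        try_mount(6, 28);       /* the reporter's case: unsupported */
        try_mount(5, 15);       /* genuinely needs a pool upgrade */
        try_mount(5, 28);       /* fine */
        return (0);
    }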

olw2005 commented 10 years ago

Yikes! Hard to argue with that error message. At the very least it's misleading -- it implies the filesystem CAN be mounted post-upgrade. If there's a spot in need of correction, that would be it.

FransUrbo commented 10 years ago

Nowhere does it check if the current implementation can handle the filesystem version.

NOW we're talking about a real bug! Of course it shouldn't say that an upgrade will fix that... "It" should know that "it" doesn't support a v6 FS...

dswartz commented 10 years ago

Agreed! I've coded a minimal fix to address this. Testing now... (the delay was in getting a Solaris 11.2 ISO downloaded and a VM created...)

dswartz commented 10 years ago

Okay, looking good. Here is the repro. Created a pool with fs=5 and pool=28 under Solaris 11.2. Created a filesystem on that pool with no version specified. Confirmed it was fs=6. Exported the pool and powered off the Solaris VM. Moved the vmdk to a CentOS 7 VM and tried to import the pool. Here are the before and after messages in the kernel log:

BEFORE

Aug 22 16:01:50 centos7-ha1 kernel: Can't mount a version 6 file system on a version 28 pool. Pool must be upgraded to mount this file system.

AFTER

Aug 22 16:06:08 centos7-ha1 kernel: Version 6 file system is not supported.

dswartz commented 10 years ago

Created pull request #2620.

olw2005 commented 10 years ago

Nicely done, sir. ;-)

dswartz commented 10 years ago

How about this:

Aug 23 18:56:31 zolbuild kernel: Can't mount a version 6 file system on this pool. File system version is not currently supported by this ZFS implementation.

P.S. It turns out I needed to add '\n' to the end of the printk or things get run together. Amusingly, the other message (about the pool version and upgrading) is also screwed up due to having no newline. I am refreshing the commit...

FransUrbo commented 10 years ago

Looks good... Add a

Reviewed by: Turbo Fredriksson turbo@bayour.com

dswartz commented 10 years ago

New at this. A ".."?

Turbo Fredriksson notifications@github.com wrote:

Looks good... Add a

Reviewed by: Turbo Fredriksson turbo@bayour.com


FransUrbo commented 10 years ago

Reviewed by: Turbo Fredriksson turbo@bayour.com

In the log comment. Where you need to add your

Signed-off-by: <you>

Look at other log/commit messages - "git log".

dswartz commented 10 years ago

Ah ok thx

Turbo Fredriksson notifications@github.com wrote:

Reviewed by: Turbo Fredriksson turbo@bayour.com

In the log comment. Where you need to add your

Signed-off-by: <you>

Look at other log/commit messages - "git log".


FransUrbo commented 10 years ago

Ah ok thx

Doesn't mean much, but it might help to spread the blame when/if .... :)

FransUrbo commented 10 years ago

Also, don't forget to add a "Closes: #2616" AND (!!) rebase them into ONE commit (and then do a force push to your branch).
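
For reference, the assembled log message would then look roughly like this (the subject line and the Signed-off-by identity are placeholders, not the actual commit):

    Fix error message when mounting an unsupported filesystem version

    Reviewed by: Turbo Fredriksson turbo@bayour.com
    Signed-off-by: Your Name <you@example.com>
    Closes: #2616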

behlendorf commented 10 years ago

@dswartz Thanks for digging into this and proposing a minimal fix. What you're suggesting is definitely an improvement, but I think we can do better reasonably easily. My suggestion here comes in two parts.

1) Improve the existing 'Can't mount a version ...' error message by moving it out of zfs_sb_create() and into the zfs.mount helper utility. This message was located there in the original code base only because there really wasn't a better place to log it in Illumos. However, on Linux we ship a zfs.mount helper which can be used to handle an ENOTSUP returned by mount(2). This would allow us to print a reasonable error to the user's terminal instead of their console. See the existing ENOENT and EBUSY error handling at the end of main() in cmd/mount_zfs/mount_zfs.c; an ENOTSUP case could be added there. Additionally, the spa and dataset versions are both available in properties, as is the zfs_zpl_version_map() function. We should also be very explicit in the error: if zfs_zpl_version_map() returns -1, then upgrading the pool will not help. Otherwise the minimum required (and supported) pool version is returned.

2) Personally, I'm of the school of thought that if we can reasonably prevent users from shooting themselves in the foot, we should. If they really want to do something in spite of the consequences, they can supply the -f force option. Therefore, I suggest extending the zpool_do_upgrade() function to recursively check all of the pool's filesystems to ensure they're compatible with the new pool version. Using the zfs_iter_root() function in upgrade_version() should allow you to iterate over the datasets and log an appropriate error.

@dswartz does this sound like something you're willing to work on?

As an aside, the original damaged pool might be salvageable with a little hacking. All the zpool upgrade command really does is update the pool version on disk. Further on-disk changes will only happen once the file system is mounted and actually used. The zpool upgrade utility could be hacked to allow you to set the pool version back to v28, where Solaris can import the pool. The pool should still be compatible.
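
As a hedged sketch of the dataset walk suggested in (2): zfs_iter_root(), zfs_iter_filesystems(), zfs_prop_get_int(), and zfs_get_name() are real libzfs calls of the era, but the callback name, the errors counter, and the exact wiring into zpool_do_upgrade()/upgrade_version() are illustrative, and ZPL_VERSION is assumed to be visible to the zpool utility.

    #include <stdio.h>
    #include <libzfs.h>

    /* Illustrative callback: recursively verify that every
     * filesystem's ZPL version is one this implementation supports
     * before letting 'zpool upgrade' proceed (absent -f). */
    static int
    check_zpl_version_cb(zfs_handle_t *zhp, void *data)
    {
        int *errors = data;
        uint64_t version = zfs_prop_get_int(zhp, ZFS_PROP_VERSION);

        if (version > ZPL_VERSION) {
            (void) fprintf(stderr, "cannot upgrade: filesystem '%s' "
                "is version %llu, which this implementation does not "
                "support\n", zfs_get_name(zhp), (u_longlong_t)version);
            (*errors)++;
        }

        /* Descend into child filesystems. */
        (void) zfs_iter_filesystems(zhp, check_zpl_version_cb, data);
        zfs_close(zhp);
        return (0);
    }

    /* Would be called from upgrade_version() with the libzfs handle;
     * a nonzero return would abort the upgrade unless -f was given. */
    static int
    pool_has_unsupported_fs(libzfs_handle_t *g_zfs)
    {
        int errors = 0;

        (void) zfs_iter_root(g_zfs, check_zpl_version_cb, &errors);
        return (errors != 0);
    }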

dswartz commented 10 years ago

@dswartz Thanks for digging into this and proposing a minimal fix. What you're suggesting is definitely an improvement, but I think we can do better reasonably easily. My suggestion here comes in two parts.

1) Improve the existing 'Can't mount a version ...' error message by moving it out of zfs_sb_create() and into the zfs.mount helper utility. This message was located there in the original code base only because there really wasn't a better place to log it in Illumos. However, on Linux we ship a zfs.mount helper which can be used to handle an ENOTSUP returned by mount(2). This would allow us to print a reasonable error to the user's terminal instead of their console. See the existing ENOENT and EBUSY error handling at the end of main() in cmd/mount_zfs/mount_zfs.c; an ENOTSUP case could be added there. Additionally, the spa and dataset versions are both available in properties, as is the zfs_zpl_version_map() function. We should also be very explicit in the error: if zfs_zpl_version_map() returns -1, then upgrading the pool will not help. Otherwise the minimum required (and supported) pool version is returned.

2) Personally, I'm of the school of thought that if we can reasonably prevent users from shooting themselves in the foot, we should. If they really want to do something in spite of the consequences, they can supply the -f force option. Therefore, I suggest extending the zpool_do_upgrade() function to recursively check all of the pool's filesystems to ensure they're compatible with the new pool version. Using the zfs_iter_root() function in upgrade_version() should allow you to iterate over the datasets and log an appropriate error.

@dswartz does this sound like something you're willing to work on?

Sure.

As an aside, the original damaged pool might be salvageable with a little hacking. All the zpool upgrade command really does is update the pool version on disk. Further on-disk changes will only happen once the file system is mounted and actually used. The zpool upgrade utility could be hacked to allow you to set the pool version back to v28, where Solaris can import the pool. The pool should still be compatible.

It turned out he was able to salvage the data by creating a scratch pool on a Solaris 11.2 box, creating a snapshot of the problematic filesystem, and sending it over.

FransUrbo commented 10 years ago

1) Improve the existing 'Can't mount a version ...' error message by moving it out of zfs_sb_create() and in to the zfs.mount helper utility.

How about in both places? So we get one in the log (for "posterity") and in the terminal?

behlendorf commented 10 years ago

How about in both places? So we get one in the log (for "posterity") and in the terminal?

We could, but it would be nice to rid ourselves of that printk. We shouldn't need to log this to the console where most people won't look for it.

FransUrbo commented 10 years ago

We could, but it would be nice to rid ourselves of that printk. We shouldn't need to log this to the console where most people won't look for it.

You mean get rid of BOTH printk()s - the current one and the new one?

I'm all for moving it; I don't care either way, really. I just figured "it could be nice to have" - since the code is already there and doesn't hurt, adding "copies" of it in a second place didn't seem too out of place...

behlendorf commented 10 years ago

Yes. I was thinking we just drop the printk in the current zfs_sb_create() function and return ENOTSUP. We don't need to change the existing logic here at all. Then, in the mount helper, for the ENOTSUP case we can print a clear message about why this failed and whether or not a pool upgrade would help.
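
As a rough, standalone model of what the helper could print on ENOTSUP: spa_version_map() below stands in for the kernel's zfs_spa_version_map() (the companion of zfs_zpl_version_map(), mapping a filesystem version to the minimum pool version that can carry it), and the thresholds and messages are illustrative, not the actual mount_zfs.c code.

    #include <stdio.h>

    /* Highest filesystem version this implementation supports. */
    #define ZPL_VERSION_MAX 5

    /* Stand-in for zfs_spa_version_map(): the minimum pool version
     * needed for a given filesystem version, or -1 if no version we
     * know of supports it. */
    static int
    spa_version_map(int zpl_version)
    {
        switch (zpl_version) {
        case 1:
        case 2:
            return (1);
        case 3:
            return (9);         /* SPA_VERSION_FUID */
        case 4:
            return (15);        /* SPA_VERSION_USERSPACE */
        case 5:
            return (24);        /* SPA_VERSION_SA */
        default:
            return (-1);
        }
    }

    /* What the helper might print when mount(2) fails with ENOTSUP. */
    static void
    report_enotsup(const char *fsname, int zpl_version, int pool_version)
    {
        int need = spa_version_map(zpl_version);

        if (zpl_version > ZPL_VERSION_MAX || need < 0) {
            /* No pool upgrade can make this mountable here. */
            (void) fprintf(stderr, "filesystem '%s' (version %d) is "
                "not supported by this implementation of ZFS\n",
                fsname, zpl_version);
        } else {
            (void) fprintf(stderr, "filesystem '%s' (version %d) "
                "cannot be mounted on this version %d pool; upgrade "
                "the pool to at least version %d\n", fsname,
                zpl_version, pool_version, need);
        }
    }

    int
    main(void)
    {
        report_enotsup("test/SOLARIS-ONLY", 6, 28); /* upgrade won't help */
        report_enotsup("test/SOLARIS-ONLY", 5, 15); /* upgrade to >= 24 */
        return (0);
    }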

dswartz commented 10 years ago

Yes. I was thinking we just drop the printk in the current zfs_sb_create() function and return ENOTSUP. We don't need to change the existing logic here at all. Then, in the mount helper, for the ENOTSUP case we can print a clear message about why this failed and whether or not a pool upgrade would help.

This was what I thought you meant, so I am proceeding on that basis...

dswartz commented 10 years ago

Okay, looking good. I have a couple of minor changes yet, but here is the output of 'zfs mount -a':

filesystem 'test/SOLARIS-ONLY' is not supported by this implementation of ZFS
cannot mount 'test/SOLARIS-ONLY': Resource temporarily unavailable

FransUrbo commented 10 years ago

filesystem 'test/SOLARIS-ONLY' is not supported by this implementation of ZFS
cannot mount 'test/SOLARIS-ONLY': Resource temporarily unavailable

Shouldn't this be the other way around (the failure first, the reason after)?

dswartz commented 10 years ago

filesystem 'test/SOLARIS-ONLY' is not supported by this implementation of ZFS
cannot mount 'test/SOLARIS-ONLY': Resource temporarily unavailable

Shouldn't this be the other way around (the failure first, the reason after)?

This is how the zfs mount helper works :( Before we exit, we print a specific message to stderr and return a specific code; the code that invoked us then prints a 'mount failed' message. Not happy about the EAGAIN, but that was already there for any failure of type MOUNT_SYSERR...

dswartz commented 10 years ago

I have no way to reproduce the original 'zfs version is incompatible with this pool version' scenario, so I faked it by changing the zfs_version to 99 in the 'if' statement that decides what to print. Here is the output for that:

filesystem 'test/SOLARIS-ONLY' (version 6) cannot be mounted on this version 28 pool. Upgrade pool.
cannot mount 'test/SOLARIS-ONLY': Resource temporarily unavailable

dswartz commented 10 years ago

I'm just about ready to push an initial commit. Need to do some more thorough testing. Here's a philosophical question: do we really need to do that 'this filesystem is of a version that the pool doesn't support, so you need to upgrade the pool!' check? I ask because if that was really the case, how on earth did the user manage to create that filesystem to begin with? I'm wondering if this was some kind of legacy check from way back, before all the fs/pool version stuff got sorted out?

FransUrbo commented 10 years ago

how on earth did the user manage to create that filesystem to begin with?

Backwards compatibility - user is/was running an older version (maybe on Solaris or whatever). Doesn't need to be a "newly" created filesystem...

dswartz commented 10 years ago

how on earth did the user manage to create that filesystem to begin with?

Backwards compatibility - user is/was running an older version (maybe on Solaris or whatever). Doesn't need to be a "newly" created filesystem...

It would have to be pretty old, though. Neither Solaris nor OmniOS (haven't tried ZoL or FreeBSD) will allow you to create a filesystem on a pool that doesn't support that fs version. More curious than anything else...

FransUrbo commented 10 years ago

Neither Solaris nor OmniOS (haven't tried ZoL or FreeBSD) will allow you to create a filesystem on a pool that doesn't support that fs version.

All ZFS implementations can create an older filesystem:

# truncate -s 1T /var/tmp/zfs_test-4
# zpool create test3 /var/tmp/zfs_test-4
# zfs get version test3

NAME        PROPERTY  VALUE  SOURCE
test3       version   5      -

# zfs create -o version=4 test3/test
# zfs get version test3/test

NAME        PROPERTY  VALUE  SOURCE
test3/test  version   4      -

dswartz commented 10 years ago

But that isn't the check I'm talking about. The check in question asks 'is the version of this filesystem higher than the maximum supported by the current pool version'...

dswartz commented 10 years ago

If we allow the upgrade anyway, we need to allow specification of the '-f' flag to 'zpool upgrade', which is currently not present. I'd need to change a few lines of code here and there, as well as the manpage for zpool. Let me know if you want me to do so, Brian...
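
A minimal sketch of the '-f' plumbing this would need, reusing the dataset-scan idea from the earlier sketch; the option handling, messages, and control flow are illustrative, not the actual zpool_main.c code.

    #include <stdio.h>
    #include <unistd.h>

    /* Stand-in for the recursive dataset scan sketched earlier. */
    static int
    pool_has_unsupported_fs(const char *pool)
    {
        (void) pool;
        return (1);     /* pretend we found a too-new filesystem */
    }

    int
    main(int argc, char **argv)
    {
        int c, force = 0;

        while ((c = getopt(argc, argv, "f")) != -1) {
            if (c == 'f')
                force = 1;
        }
        if (optind >= argc) {
            (void) fprintf(stderr, "usage: upgrade [-f] <pool>\n");
            return (2);
        }

        if (pool_has_unsupported_fs(argv[optind]) && !force) {
            (void) fprintf(stderr, "pool '%s' contains filesystem "
                "versions this implementation does not support; "
                "use '-f' to upgrade anyway\n", argv[optind]);
            return (1);
        }
        (void) printf("upgrading pool '%s'%s\n", argv[optind],
            force ? " (forced)" : "");
        return (0);
    }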

FransUrbo commented 10 years ago

No, you asked about

this filesystem is of a version that the pool doesn't support, so you need to upgrade the pool!

completely different...

YOUR change is about forward compatibility:

the version of the filesystem is higher than the maximum supported by the current pool version

dswartz commented 10 years ago

Wow, now you have me confused. You asserted that you can create old versions of filesystems on a pool (i.e. a lower version number than the max the pool supports). I agree with that. When I said 'a filesystem of a version that the pool does not support', by definition that means a higher one, since if it were a lower one the pool would support it, no? This is not just my opinion; this is exactly what the existing check is doing!

FransUrbo commented 10 years ago

Hmm, OK, I'm starting to get confused too... Could it be that since the default is (currently) '5', it notices that it's a lower FS version and is therefore triggering that?

Or does it mean "we're not supporting that properly, you'd better upgrade"...?

What code exactly are you talking about?

dswartz commented 10 years ago

Hmm, OK, I'm starting to get confused too... Could it be that since the default is (currently) '5', it notices that it's a lower FS version and is therefore triggering that?

Or does it mean "we're not supporting that properly, you'd better upgrade"...?

What code exactly are you talking about?

In the mount code, there is this check:

    } else if (zsb->z_version >
        zfs_zpl_version_map(spa_version(dmu_objset_spa(os)))) {
            (void) printk("Can't mount a version %lld file system "
                "on a version %lld pool\n. Pool must be upgraded to mount "
                "this file system.", (u_longlong_t)zsb->z_version,
                (u_longlong_t)spa_version(dmu_objset_spa(os)));

e.g. it fetches the filesystem version, and if it is greater than the highest filesystem version supported by the current pool version, it rejects the mount with a message telling you to upgrade. This is how the guy got screwed - his Solaris fs version was 6, and the pool's maximum supported fs version was 5. The zfs_zpl_version_map() routine will take a pool version of 28 and return 5. Since 6 > 5, we get the 'pool must be upgraded' message, despite the fact that this not only won't help, it may screw you badly.

FransUrbo commented 10 years ago

I actually have no idea how to create such a filesystem on such a pool:

# zpool create -o version=16 -O version=5 test3 /var/tmp/zfs_test-4
cannot create 'test3': operation not supported on this type of pool

This is trying to create a version 16 pool with a version 5 filesystem. IF this had worked, that "upgrade pool" check would have been triggered...

Doing it in separate stages doesn't seem to work either...

# zpool create -o version=16 test3 /var/tmp/zfs_test-4
# zpool get version test3

NAME   PROPERTY  VALUE  SOURCE
test3  version   16     local

# zfs create -o version=5 test3/test
cannot create 'test3/test': pool must be upgraded to set this property or value

Receiving it doesn't work either...

# zfs get version test1/test9

NAME         PROPERTY  VALUE  SOURCE
test1/test9  version   5      -

# zfs list -t all -r test1/test9

NAME                USED  AVAIL  REFER  MOUNTPOINT
test1/test9          41K  29.3G    25K  /test1/test9
test1/test9@snap9    16K      -    25K  -

# zfs send test1/test9@snap9 | zfs receive -dv test3

receiving full stream of test1/test9@snap9 into test3/test9@snap9
cannot receive new filesystem stream: pool must be upgraded to receive this stream.
warning: cannot send 'test1/test9@snap9': Broken pipe

But this is exactly what the code says, from what I can understand:

zsb->z_version > zfs_zpl_version_map(spa_version(dmu_objset_spa(os)))

Let's wait and see what @behlendorf has to say. Maybe we both misunderstood something :)