Open borzel opened 5 years ago
Hm, getting this with: host Xen XCP 7.6.3, NFSv4 enabled; server Debian 9. The exports file contains:
/srv/nfs4 192.168.xxx.0/24(rw,sync,fsid=0,crossmnt,no_subtree_check,sec=sys:krb5:krb5i:krb5p)
/srv/nfs4/xcphosts 192.168.xxx.0/24(rw,sync,no_subtree_check,sec=sys:krb5:krb5i:krb5p)
This current setup works for v3/v4/v4.1 with and without kerberos mounts.
xe sr-probe type=nfs device-config:server=IP_HERE device-config:serverpath=/xen/nfs-stor device-config:probeversion
Error code: SR_BACKEND_FAILURE_73
Error parameters: , NFS mount error [opterr=mount failed with return code 32],
Running: xe sr-probe type=nfs device-config:server=IP_HERE device-config:probeversion
Error code: SR_BACKEND_FAILURE_101
Error parameters: , The request is missing the serverpath parameter, <?xml version="1.0" ?>
<nfs-exports>
<Export>
<Target>192.168.xxx.xxx</Target>
<Path>/srv/nfs4/xcphosts</Path>
<Accesslist>192.168.xxx.0/24</Accesslist>
</Export>
<Export>
<Target>192.168.xxx.xxx</Target>
<Path>/srv/nfs4</Path>
<Accesslist>192.168.xxx.0/24</Accesslist>
</Export>
</nfs-exports>
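For anyone scripting against this probe output, here is a minimal Python sketch that pulls the export list out of the error text. The function name `parse_nfs_exports` is hypothetical and not part of sm or xe; it only assumes the `<nfs-exports>` layout shown above.

```python
import xml.etree.ElementTree as ET

def parse_nfs_exports(probe_text):
    """Return (target, path, accesslist) tuples found in sr-probe output.

    Hypothetical helper: assumes the <nfs-exports> layout shown in this thread.
    """
    open_tag, close_tag = "<nfs-exports>", "</nfs-exports>"
    # The XML is embedded in the error parameters, so cut out just the element.
    start = probe_text.find(open_tag)
    end = probe_text.find(close_tag, start)
    if start == -1 or end == -1:
        return []
    root = ET.fromstring(probe_text[start:end + len(close_tag)])
    exports = []
    for export in root.findall("Export"):
        exports.append((
            (export.findtext("Target") or "").strip(),
            (export.findtext("Path") or "").strip(),
            (export.findtext("Accesslist") or "").strip(),
        ))
    return exports
```

Feeding it the output above would yield one tuple per `<Export>` element, which makes it easy to pick a value for device-config:serverpath.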
@borzel Is that a problem that occurs for any NFS share that supports NFS 4 or above, or only with specific servers?
Not sure if it is related to the same root cause, however: when xe sr-create specifies only type=nfs, it defaults to NFSv3 and will not negotiate an NFSv4/NFSv4.1 share unless device-config:nfsversion=4.1 is added (which isn't documented as far as I can tell). IMHO it should start with NFSv4.1 and negotiate downwards. (XCP-ng 8.1 connecting to nfs-kernel-server on Ubuntu 20.04)
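The "start with NFSv4.1 and negotiate downwards" idea can be sketched in Python as follows. This is only an illustration of the proposal, not how sm behaves; `negotiate_nfs_version` and the `try_mount` callback are hypothetical names.

```python
def negotiate_nfs_version(try_mount, versions=("4.1", "4", "3")):
    """Return the first version for which try_mount(version) succeeds, else None.

    Hypothetical sketch: try_mount is any callable that attempts a mount with
    the given NFS version and returns True on success.
    """
    for version in versions:
        if try_mount(version):
            return version
    return None

# Example: simulate a server that only speaks NFSv4 and NFSv3.
supported = {"4", "3"}
print(negotiate_nfs_version(lambda v: v in supported))
```

Keeping the mount attempt behind a callback keeps the ordering policy separate from the actual mount call.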
I had a similar problem with a FreeBSD NFS server set up with a minimum NFS version of 4 (vfs.nfsd.server_min_nfsvers=4 in the /etc/sysctl.conf file) [1].
I was told [2] that NFSv4 doesn't depend on rpcbind (clients connect directly to port 2049), so if support for older protocol versions isn't needed, nfsd does not register with rpcbind. That makes the function check_server_service in /opt/xensource/sm/nfs.py unreliable, because it checks for a condition (nfs service in rpcinfo -s output) which is not always present.
[1] https://lists.freebsd.org/pipermail/freebsd-net/2021-January/057371.html [2] https://lists.freebsd.org/pipermail/freebsd-net/2021-January/057372.html
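To illustrate why an rpcinfo-only check is fragile, here is a Python sketch of the two checks side by side. It is not the code from nfs.py; both function names are hypothetical, and the parser assumes the usual `rpcinfo -s` columns (program, versions, netids, service, owner).

```python
import socket

NFS_PROGRAM = "100003"  # RPC program number assigned to NFS

def nfs_registered_in_rpcinfo(rpcinfo_output):
    """True if `rpcinfo -s` style output lists the NFS program or service."""
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Match on the program number, or on the service-name column if present.
        if fields[0] == NFS_PROGRAM or (len(fields) >= 4 and fields[3] == "nfs"):
            return True
    return False

def nfs_port_reachable(host, port=2049, timeout=3.0):
    """True if a TCP connection to the NFS port succeeds (rpcbind not needed)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

An NFSv4-only server may fail the first check while passing the second, which matches the behaviour described above.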
I've hit this bug with an NFSv4+ only server (i.e. NFSv3 is disabled). XCP-ng is unable to add the SR because it depends on the presence of NFSv3 services.
Then I think it should be reported upstream ASAP :)
Still relevant for Huawei storage.
IIRC, this was fixed in a recent upstream SMAPI patch (but likely not yet available in XCP-ng). @stormi can you take a look when you are around? Thanks!
I don't remember commits that would address this, and the issue on the upstream repository got no answers from the devs.
Recent commits that are about NFS in sm are: https://github.com/xapi-project/sm/commit/e1218647f0920e3d489c7155b823f45ca21715ea and https://github.com/xapi-project/sm/commit/6fbff68f74343c54c19b74c6bd3e66625d955495 but I don't think they are related to this issue here.
Any update on this? I have checked that upstream has already pushed a fix for this issue, but when I update my XCP-ng installation the files are still not updated. So I went my own way and replaced the driver files NFSSR.py and nfs.py with the updated ones, but it is still not working and XCP-ng Center just gives me this error.
I really need NFSv4 to work, as my PETASAN instance only supports NFSv4 and above for its NFS exports. Thanks.
Hi!
The fix wasn't available in XCP-ng 8.2.1, it'll be released soon but if you want to test it in advance:
yum update sm sm-rawhba --enablerepo=xcp-ng-ci
Bear in mind it is a test build so not safe to run in production. Regards :)
Thanks for the update. I will try it on my test server. May I know when the next update containing the fix will be released?
It's currently in the CI phase of our pipeline, so if everything goes smoothly I'd say a couple weeks. More if we find issues.
Any news on this one? I was actually surprised to see that NFSv4 only servers are an issue because XCP-ng manual states that NFS is preferred instead of iSCSI.
So I started the planning to move away from iSCSI and stumbled upon this issue.
@viniciusferrao I had a very bad experience with XCP-ng storage on iSCSI. For the last 3 years I have been using nfs 4.1 without any real difficulties (some performance concerns when VMs boot and until they stabilize, but I was never able to narrow down the cause). I am on XCP-ng v.8.2 which has the option to select nfs v3 v4 or v4.1 when creating a new nfs SR. I also use option "hard" which has pros and cons, there are other threads here on that subject.
But is there any workaround today? Because I tried to mount the volume and was affected by the issue on this ticket.
My XCP-ng dates back to 2013 when I originally installed XenServer 6.2. I've been updating it since then. The same for the storage system that's FreeNAS (at the time) and now TrueNAS. The disk pool was created in early 2014. Since the beginning, this pool is iSCSI and I had very expensive workloads on it, like Exchange 2010 and later 2013 with 700 user accounts, more than a TB of iSCSI mailboxes on top of XenServer virtual disks.
And now I was moving to NFS, due to the cited recommendation and I'm unable to.
How do I mount the NFS share? What's the workaround? TrueNAS does not enable NFSv3 and v4 at the same time.
Thanks.
For me, it was just in Xen Orchestra:
select the pool SR - create a new SR
Select storage type: NFS
Settings: Server (your NFS path), NFS version 4.1, NFS options (in my case, study the implications): hard
Similar process in XCP-ng Center.
However, your question makes me wonder if you are even talking about an XCP-ng storage repository - possibly connecting to an NFS server from a VM? I do that too; from Linux at least it's a standard mount -t nfs4, no magic.
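For reference, a rough CLI equivalent of those Xen Orchestra settings can be assembled like this in Python. The helper name `build_sr_create_args` is hypothetical; device-config:nfsversion comes from earlier in this thread, while passing the mount options through device-config:options is an assumption that should be checked against your sm version.

```python
def build_sr_create_args(server, serverpath, name_label,
                         nfsversion="4.1", options=None):
    """Build an argument list for `xe sr-create` describing an NFS SR.

    Hypothetical helper; it only builds the list and does not run xe.
    """
    args = [
        "xe", "sr-create",
        "type=nfs",
        "shared=true",
        "content-type=user",
        "name-label=" + name_label,
        "device-config:server=" + server,
        "device-config:serverpath=" + serverpath,
        "device-config:nfsversion=" + nfsversion,
    ]
    if options:
        # Assumption: extra mount options (e.g. "hard") go in device-config:options.
        args.append("device-config:options=" + options)
    return args

print(" ".join(build_sr_create_args("IP_HERE", "/srv/nfs4/xcphosts",
                                    "nfs41-sr", options="hard")))
```

Returning a list rather than a string means it can be handed to subprocess.run without shell quoting problems.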
Yeah, this does not work. I'm affected by the bug on this thread. I thought you had a workaround for it. Your NFS server probably supports NFSv3 and v4 at the same time, which isn't my case.
Yes, the issue is when the NFS server doesn't advertise what it supports through rpcbind, which is a v3-only thing (rpcbind can also report v4.x protocol versions, although it is not obligated to do so, which explains why some users can select v4.x protocols when their server also supports v3).
We do have a fix for this. It is built, and is currently in a pre-release repository before it can be released to all users.
On XCP-ng 8.2, you can try it with:
yum update sm sm-rawhba --enablerepo=xcp-ng-ci,xcp-ng-testing,xcp-ng-candidates
Internal CI tests already ran successfully.
On XCP-ng 8.3, it should be already supported.
Thank you @stormi. But may I ask if there's any timeline for when it lands on stable channels? On 8.2.1 or 8.3?
On 8.2, it will go with the next train of updates, which is not scheduled yet. A few weeks maybe. It's already in XCP-ng 8.3, but 8.3 itself is still a (rather stable) beta.
This bug still exists in xcp-ng 8.3
Please elaborate, as it's actually fixed from our point of view. It's likely you have a different albeit similar issue.
What to say: attaching an NFS share with v4 or v4.1 only works when the NFS share has v3 enabled. What you wrote earlier perfectly sums this issue up:
Yes, the issue is when the NFS server doesn't advertise what it supports through rpcbind, which is a v3-only thing (rpcbind can also report v4.x protocol versions, although it is not obligated to do so, which explains why some users can select v4.x protocols when their server also supports v3).
Also worth noting that this issue occurs when attaching storage in XO as well, with or without Kerberos.
We have automated tests which precisely test a server which only has v4+ and no v3, so it's likely there's something else in the picture. @benjamreis how to debug this?
Probably sharing the error gotten while trying to probe or create the SR would be a good start -- even better the corresponding logs in xensource.log and SMlog
:+1:
Give me some time, I will post the logs later today.
This is the log from XO storage when only v4 and v4.1 are enabled on the share:
remote.test
{
  "id": "a74654a5-509b-4d6b-8a42-06e5713ed882"
}
{
  "shortMessage": "Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882",
  "command": "mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882",
  "escapedCommand": "mount -o \"port=2049\" -t nfs \"172.16.10.10:/nfs/backup\" \"/run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882\"",
  "exitCode": 32,
  "stdout": "",
  "stderr": "mount.nfs: Protocol not supported",
  "failed": true,
  "timedOut": false,
  "isCanceled": false,
  "killed": false,
  "message": "Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882\nmount.nfs: Protocol not supported",
  "name": "Error",
  "stack": "Error: Command failed with exit code 32: mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup /run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882\nmount.nfs: Protocol not supported\n    at makeError (/etc/xen-orchestra/node_modules/execa/lib/error.js:60:11)\n    at handlePromise (/etc/xen-orchestra/node_modules/execa/index.js:118:26)\n    at NfsHandler._sync (/etc/xen-orchestra/@xen-orchestra/fs/src/_mount.js:68:7)"
}
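One thing that can be checked mechanically in a log like this is whether the failing mount command asked for a specific NFS version at all. The Python sketch below does that; `requested_nfs_version` is a hypothetical helper, and it only looks at the standard `vers=`/`nfsvers=` mount options and the `nfs4` filesystem type.

```python
import shlex

def requested_nfs_version(command):
    """Return the NFS version requested by a mount command line, or None."""
    tokens = shlex.split(command)
    version = None
    for i, token in enumerate(tokens[:-1]):
        value = tokens[i + 1]
        if token == "-t" and value == "nfs4":
            # Filesystem type nfs4 implies version 4 unless an option overrides it.
            version = version or "4"
        elif token == "-o":
            for option in value.split(","):
                key, _, val = option.partition("=")
                if key in ("vers", "nfsvers") and val:
                    version = val
    return version

cmd = ("mount -o port=2049 -t nfs 172.16.10.10:/nfs/backup "
       "/run/xo-server/mounts/a74654a5-509b-4d6b-8a42-06e5713ed882")
print(requested_nfs_version(cmd))
```

For the command in the log above this prints None: no version was requested explicitly, so the choice is left to the client.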
This is the log from an NFS SR that was attached with v3, v4 and v4.1 enabled; I then disabled v3 and did a rescan of the SR:
sr.scan
{
  "id": "7a89bd71-8635-173f-54de-19684d061d4f"
}
{
  "code": "SR_BACKEND_FAILURE_47",
  "params": [
    "",
    "The SR is not available [opterr=no such directory /var/run/sr-mount/7a89bd71-8635-173f-54de-19684d061d4f]",
    ""
  ],
  "task": {
    "uuid": "28f35853-1149-45ed-ca17-ad7ae65a8082",
    "name_label": "Async.SR.scan",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20241120T18:21:50Z",
    "finished": "20241120T18:21:50Z",
    "status": "failure",
    "resident_on": "OpaqueRef:a2ff60a3-d6ce-465b-874c-be3d797ba33a",
    "progress": 1,
    "type": "
This error might be caused by an issue in QNAP QTS version 5.2.1. This is unconfirmed, but some googling indicates QNAP is crap as usual. I will test this with a Dell PowerStore to see whether it is storage related, which I really suspect now after testing.
Hi,
Thx for the logs - unfortunately XO doesn't provide all the necessary info about the error, as it's only a client of the XAPI.
What I asked for was the output of the xe sr-probe and sr-create calls, and the logs in /var/log/xensource.log and /var/log/SMlog corresponding to the call.
The error does seem to indicate the mount is attempted on NFS3 for some reason... While you gather the logs I asked for, I'll take a look at the code again, but as mentioned by @stormi - our CI does have NFS4+ only tests that run successfully.
Background:
Steps:
xe sr-probe type=nfs device-config:server=<some-ip> device-config:serverpath=/mnt/testpool/nfsv41test device-config:probeversion
Output of sr-probe
Expected result: shows SupportedVersions 3, 4 and 4.1