Open hammer-83 opened 2 years ago
Hi! You might reach more people that could potentially help on the forum: https://xcp-ng.org/forum/. Maybe there already exist old threads about it? I haven't checked.
Hi, thank you for the suggestion. I can (and will) ask, but I'm also fairly certain it is a bug. I've scoured all the potential resources on the net for the past week, including the XCP-NG forum. The only two threads on the subject I could find are the ones I listed in the original post, and their problem description is exactly the same. So I would really like a dev to take a look at the problem if possible, because at least for me it is 100% reproducible. What I do not know is whether it's my hardware, Xen, XCP-NG or TrueNAS.
High-level steps to reproduce:
Install open-iscsi.
iscsiadm --mode discoverydb --type sendtargets --portal [TrueNAS IP]:3260 --discover
at this step it should show the targets from TrueNAS - so far so good
iscsiadm --mode node --targetname <target IQN> --portal [TrueNAS IP]:3260 --login
In dmesg, you see the connection is successful and a new block device is created in /dev. But after 5 seconds, the TrueNAS console says "no ping reply (NOP-Out) after 5 seconds: dropping connections" and the network on TrueNAS becomes inaccessible. I have to do xe vif-unplug, xe vif-plug on the TrueNAS vif to get the connectivity back. The exact same set of steps in VirtualBox works fine.
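For anyone who wants to repeat this, the steps above can be sketched as a small script. This is only a rough outline, not a tested procedure: the IP address, target IQN and VIF UUID are placeholders you have to supply, the 10 second wait and the ping are my own additions for checking connectivity, and the helper names (run, reproduce, replug_vif) are made up for illustration. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction and recovery steps from this thread.
# Defaults to a dry run that only prints commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the initiator VM.
# $1 = TrueNAS IP (placeholder), $2 = target IQN (placeholder)
reproduce() {
    portal="$1:3260"
    run iscsiadm --mode discoverydb --type sendtargets --portal "$portal" --discover
    run iscsiadm --mode node --targetname "$2" --portal "$portal" --login
    # The drop is reported about 5 seconds after login, so wait a bit longer
    # (assumed value) and then check whether TrueNAS still answers.
    run sleep 10
    run ping -c 3 "$1"
}

# Run in dom0 on the XCP-ng host.
# $1 = UUID of the TrueNAS VIF, as listed by `xe vif-list`.
replug_vif() {
    run xe vif-unplug uuid="$1"
    run xe vif-plug uuid="$1"
}
```

Usage would be something like `reproduce <TrueNAS IP> <target IQN>` on the initiator, followed by `replug_vif <VIF UUID>` in dom0 once the connection has dropped.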
I'm not a hard-core expert in networking or in low-level OS stuff, but I do work in the industry and do my fair share of software/hardware troubleshooting. So I can perform technical steps to help debug this if necessary. I just do not know how to approach it myself. But if somebody more knowledgeable can point me to where to start digging, I'm ready to do it to get to the bottom of this.
The community on the forum can help debug the issue, which would raise the likelihood of a fix if there's a bug.
I came here since I have the exact same problem, while testing TrueNAS for future use. (I won't have it as a VM then, but during testing it is of course an easy route). I can add that the problem seems to be due to page crossing in xenvif_count_requests, in xen-netback/netback.c:
[318644.585621] vif vif-14-2 vif14.2: Cross page boundary, txp->offset: 0, size: 8900
[318644.585635] vif vif-14-2 vif14.2: fatal error; disabling device
Edit: the problem is in FreeBSD 13's netfront. It is fixed in 14, but TrueNAS CORE does not seem to be moving to that version. https://github.com/freebsd/freebsd-src/commit/dabb3db7a817f003af3f89c965ba369c67fc4910
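To check whether you are hitting the same thing, here is a rough sketch that scans kernel log lines for the message above and redoes the arithmetic. It assumes the check amounts to rejecting a transmit request when offset + size goes past a 4096-byte page, which is my reading of the log text and not something taken from this thread. The function names (crosses_page, scan_log) are made up for illustration.

```shell
#!/bin/sh
# Assumption: netback works with 4096-byte pages.
PAGE_SIZE=4096

# Succeeds (exit 0) when a request of $2 bytes starting at offset $1
# extends past the end of one page.
crosses_page() {
    [ $(( $1 + $2 )) -gt "$PAGE_SIZE" ]
}

# Reads kernel log lines on stdin and prints a verdict for every
# "Cross page boundary" message it finds.
scan_log() {
    sed -n 's/.*Cross page boundary, txp->offset: \([0-9]*\), size: \([0-9]*\).*/\1 \2/p' |
    while read -r off size; do
        if crosses_page "$off" "$size"; then
            echo "offset=$off size=$size exceeds $PAGE_SIZE"
        else
            echo "offset=$off size=$size fits"
        fi
    done
}
```

In dom0 this could be used as `dmesg | scan_log`. With the values from my log, 0 + 8900 is well over 4096, so the request is rejected and the vif gets disabled.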
First of all, I would like to say that I've spent several days debugging the issue and trying many different things without figuring out what's the cause so if it is not with XCP-NG but with TrueNAS, I apologize in advance and will file the ticket in their issue tracker.
The issue I'm having is that any time an initiator attempts to mount an iSCSI target, the TrueNAS VM loses all network connectivity. There are two ways to restore it: xe vif-unplug followed by xe vif-plug, or a guest reboot. There are no anomalies to report prior to mounting the iSCSI target: network works, speeds are good, iSCSI discovery works fine. But as soon as the target is mounted, the network interface on the TrueNAS VM seems to have all the routes cleared for that interface. Here are the things I tried:
I also tried to replicate the exact same setup of TrueNAS and an Ubuntu VM on an internal network in VirtualBox, where everything works fine. This is what led me to choose the hypervisor rather than TrueNAS as the project to report the issue to.
Finally, I found the following two threads on the net of users having the exact same issue but without any resolution:
https://forums.lawrencesystems.com/t/dropping-connection-xcp-ng-with-free-truenas-as-guest/6720/6
https://www.truenas.com/community/threads/network-loses-connectivity-when-iscsi-target-connected.87251/
There seems to be one workaround, which is to bridge the virtual network to a physical adapter, but ideally I would like to keep this traffic internal so that: 1. it stays private, 2. it is not limited by the physical port speed.
My XCP-NG is version 8.2 with all the updates applied as of the time of this writing. I can provide additional details as needed, just not sure what else might help here.