open-iscsi / targetd

Remote configuration of a LIO-based storage appliance
GNU General Public License v3.0

Docker #30

Closed: Queuecumber closed this 5 years ago

Queuecumber commented 5 years ago

Back again, hoping to add official Docker support

The image is based on the latest Alpine and I've confirmed it works with the client script and with my own tests (e.g. running an Ubuntu VM based on a volume provisioned by targetd in the container).

Ideally this would get built and uploaded to Docker Hub as an official image that people can use instead of having to get it from a third party.

The PR should be held

Queuecumber commented 5 years ago

There is, however, one bug that I'd love some help figuring out: the exports don't seem to persist across container restarts. To be clear, targetcli shows the export with all the correct settings whether or not the container is running, but it becomes inaccessible if I restart the container without manually clearing the export (re-exporting is not enough).

The procedure I follow is:

  1. Start the docker container
  2. Create an export (doesn't matter if targetd created the volume or not)
  3. Verify that I can connect to the target and see the export
  4. Shut down the container
  5. Restart the container
  6. I can no longer connect to the target; the error is "failed to log in to any nodes"
  7. Re-create the export using the API
  8. Still can't connect; same error message
  9. Clear the export on the host using targetcli
  10. Re-create the export using the API
  11. Error goes away and target is accessible

I have verified that this bug does not occur when targetd is run normally on the host; it only happens when it is run through the Docker container. Is there some place where the export information is cached?

Queuecumber commented 5 years ago

OK, based on a lot of experimenting and going over the rtslib source code, I think the problem is that whatever serves the iSCSI information is running in the Docker container and loses its state, since the container is mostly ephemeral. The state in /sys/kernel/config/target is persisted, but I guess it isn't getting its information from there. Does that make sense?
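For reference, a quick way to check this from the host is to look directly at the configfs tree that LIO exposes; the paths below assume a standard layout with a single TPG per target:

# List the target IQNs the kernel currently knows about
ls /sys/kernel/config/target/iscsi/
# List the network portals bound to each target portal group
ls /sys/kernel/config/target/iscsi/*/tpgt_1/np/
# 1 means the target portal group is enabled
cat /sys/kernel/config/target/iscsi/*/tpgt_1/enable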

tasleson commented 5 years ago

@Queuecumber I just took a quick peek at the target-restore package, which does the LIO restore at boot.

# rpm -ql target-restore
/etc/target
/etc/target/backup
/usr/bin/targetctl
/usr/lib/systemd/system/target.service
/usr/share/man/man5/saveconfig.json.5.gz
/usr/share/man/man8/targetctl.8.gz
/var/target
/var/target/alua
/var/target/pr

This is just a guess at this point, but you may want to try adding /var/target to allowed access.

If that doesn't work, I think it would be useful to run /usr/bin/targetctl restore under strace on a regular host and then compare it to a run where the configuration was done within the container.

It might also be helpful to run targetcli under strace both inside and outside the container and compare the results.
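A rough sketch of that comparison, assuming strace is available on the host and installed in the image (the container name is a placeholder):

# Trace a restore on the bare host
strace -f -o /tmp/host.trace targetctl restore
# Trace the same restore inside the container, copy the trace out, and compare
docker exec targetd-container strace -f -o /tmp/container.trace targetctl restore
docker cp targetd-container:/tmp/container.trace /tmp/container.trace
diff /tmp/host.trace /tmp/container.trace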

Queuecumber commented 5 years ago

So I actually tried adding /var/target after seeing it mentioned in the rtslib source code, but it didn't fix the problem (and actually may have caused some weird issue with iscsiadm but I'm not totally sure).

I'm also currently testing this in "quasi-production" on a Kubernetes cluster, and I'm getting "connection refused" when trying to connect to either the pods or the nodes themselves.

Do you happen to know which component is the one that's actually doing the socket listening? Is it the kernel module or is it a userspace daemon? I know with NFS there are some kernel components but the actual socket is opened from a userspace daemon which is how I was able to containerize it. I'm not sure if LIO works the same way though.

I'll give targetctl a shot, but I thought all it did was restore the LIO information from a file, similarly to how rtslib creates it using a Python API. So the fact that I see the LIO information on the host preserved across container restarts makes me think that's not it.

Queuecumber commented 5 years ago

Running it in host networking mode seems to resolve these issues, although I'm not sure why. Based on my reading of the LIO code, it is definitely the kernel module that is doing the listening. I have no idea why host networking would make a difference here.
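For clarity, a minimal sketch of what I mean by host networking mode (the image name and bind mounts here are placeholders, not the exact setup):

# Run the container in the host's network namespace with access to configfs
docker run -d --privileged --network host \
    -v /sys/kernel/config:/sys/kernel/config \
    -v /etc/target:/etc/target \
    targetd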

Queuecumber commented 5 years ago

OK I have a theory

When the export is created using rtslib, it creates portals pointing at the network interfaces it can see and listens on those. This means that if you create the export in a Docker container, it only sees the container's internal virtual interface and listens on that. That interface is destroyed with the container, so even though LIO thinks it has a share set up, it isn't actually listening because its interface is gone. Recreating the Docker container recreates the interface, but since the previous one was destroyed, LIO doesn't reconfigure itself to use the new one; it couldn't know how to do that without being explicitly told. I bet targetcli would fix this, but I currently can't get it to install on Alpine edge (see https://bugs.alpinelinux.org/issues/10670). Host networking fixes this as well because rtslib can see the real network interfaces and instruct LIO to listen on them.
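If targetcli were installable in the image, inspecting what the portal group is actually bound to would look roughly like this (the IQN is made up for illustration):

# Show which addresses this target's portal group is listening on
targetcli /iscsi/iqn.2003-01.org.example:disk1/tpg1/portals ls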

The only thing this doesn't explain is why my Kubernetes pods aren't listening, although there are enough differences between Docker networking and Kubernetes networking that something might be getting messed up there.

tasleson commented 5 years ago

Do you happen to know which component is the one that's actually doing the socket listening? Is it the kernel module or is it a userspace daemon? I know with NFS there are some kernel components but the actual socket is opened from a userspace daemon which is how I was able to containerize it. I'm not sure if LIO works the same way though.

You need to clarify which listening port you mean: the one for targetd configuration, or the iSCSI target?

This is what I understand about the listening sockets:

It would seem to me that yes, you would need Docker host networking to allow external clients of targetd to configure/manage the targets. As for the iSCSI target(s), I would think those would show up on the host regardless of container networking mode, since the listener originates from the kernel. However, you can control which IP addresses the iSCSI targets appear on (the portal), so you could constrain them to different networks.

Hope this helps

Queuecumber commented 5 years ago

I am talking about the iSCSI target port; the API port is easy.

It is strange, but if you don't use host networking then netstat -tulpn on the host shows nothing, while netstat -tulpn in the container shows the listener on 3260, and Docker is able to proxy port 3260 correctly (e.g. the exports are listable and mountable by external clients).

Queuecumber commented 5 years ago

OK, so based on the last thing I wrote, I went back to the Kubernetes pod to run some tests, and I found that LIO is indeed listening on port 3260 on the pod's interface; in fact I can use iscsiadm discovery to see that there is a target on the service with the correct IQN set up, etc. Unfortunately, it breaks when I try to log in with iscsiadm: I get "no records found". I have no idea why discovery would work but login would fail.

Queuecumber commented 5 years ago

And it seems like the trick to making it work through Kubernetes service proxies is to allow it to "listen" on the service IP address.

So from what I can gather, here's the process:

The first thing iscsiadm does is trigger discovery, which talks to the server at the given address and asks what iSCSI targets it knows about. LIO responds with whatever it has configured, except that since we configure it to listen on 0.0.0.0, it reports the IP address of the interface it's using. This gives back the pod IP, which is not externally routable. iscsiadm then takes that return value and tries to log in using the pod IP, which of course fails.
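Concretely, the two steps look like this (the addresses and IQN are placeholders):

# Discovery against the externally reachable service address works; LIO
# advertises the portal it knows about, which is the non-routable pod IP
iscsiadm -m discovery -t sendtargets -p 203.0.113.10:3260
# -> 10.244.1.7:3260,1 iqn.2003-01.org.example:disk1

# Login then tries the advertised pod IP and fails because it isn't routable
iscsiadm -m node -T iqn.2003-01.org.example:disk1 --login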

So what I did was log into that machine and manually use targetcli to add an additional listen address equal to the service's loadBalancerIP, which is externally routable. After triggering discovery again, both the pod IP and the new external IP are returned. Then login tries both addresses and is ultimately able to log in and use the disks.
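The manual fix was roughly the following, with the external address as a placeholder:

# Add a second portal on the externally routable service address
targetcli /iscsi/iqn.2003-01.org.example:disk1/tpg1/portals create 203.0.113.10 3260
# Persist the change so it survives a reload
targetctl save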

I confirmed this by exporting a disk from the targetd pod and installing an Ubuntu VM onto that disk using libvirt from a remote machine.

So to summarize, what we need is a way to manually specify an additional listen address. I am happy to prepare a PR for that; I was thinking of adding an optional parameter to the export_create endpoint. Does that sound OK to you?

tasleson commented 5 years ago

So to summarize, what we need is a way to manually specify an additional listen address. I am happy to prepare a PR for that; I was thinking of adding an optional parameter to the export_create endpoint. Does that sound OK to you?

At the moment I'm not sure how we could provide a backwards-compatible JSON-RPC call for export_create. I'm leaning towards adding a new call that adds this parameter.
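Purely as an illustration of what a separate call might look like on the wire; the method name, the extra portal parameter, and the port/path/credentials below are all made up rather than the actual API:

# Hypothetical JSON-RPC request; "export_create_portal" and "portal" are illustrative
curl -k -u admin:password https://localhost:18700/targetrpc \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "export_create_portal",
       "params": {"pool": "vg-targetd/thin_pool", "vol": "vol1",
                  "initiator_wwn": "iqn.1994-05.com.example:client", "lun": 0,
                  "portal": "203.0.113.10"}}'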

Queuecumber commented 5 years ago

New call works for me

Queuecumber commented 5 years ago

Given that https://bugs.alpinelinux.org/issues/10670 is resolved, I'm going to include targetcli and a script that reloads and saves the config on container start and stop; that should resolve the missing exports on restart that I was seeing.
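Roughly, the entrypoint I have in mind looks like the sketch below; the exact paths and signal handling may end up different in the PR:

#!/bin/sh
# Restore the saved LIO configuration when the container starts
[ -f /etc/target/saveconfig.json ] && targetctl restore
# Save the configuration back out when the container is stopped
trap 'targetctl save; kill "$pid"' TERM INT
targetd &
pid=$!
wait "$pid"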

tasleson commented 5 years ago

@Queuecumber This PR looks good to me, ready for merge?

Queuecumber commented 5 years ago

Yes it's ready now

tasleson commented 5 years ago

@Queuecumber Thanks!

Queuecumber commented 5 years ago

Btw do you think you'll push an official image?

tasleson commented 5 years ago

@Queuecumber Maybe once I figure out what's involved with that :-) I don't use containers much.

Queuecumber commented 5 years ago

I'm happy to take care of that as well if you want. I don't see an open-iscsi group on Docker Hub yet; is there some process to go through to get one made?

tasleson commented 5 years ago

@Queuecumber Sure, go ahead, thanks! I'm not familiar with the process.