kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.
Apache License 2.0

CDI fails in GCP when importing Fedora qcow2 image from Fedora CDN #994

Closed alosadagrande closed 4 years ago

alosadagrande commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug /kind enhancement

What happened: Every time I try to import a Fedora qcow2 image into my Kubernetes cluster in Google Cloud Platform (GCP), I get a 404 error. The full traceback can be seen here: https://gist.github.com/alosadagrande/35d85aa9908a0446a868e41041f0bec1

What you expected to happen: Any Fedora qcow2 image from the Fedora CDN is imported correctly in Google Cloud Platform (GCP).

How to reproduce it (as minimally and precisely as possible):

1. Install a Kubernetes cluster
2. Install Kubevirt
3. Install CDI
4. Import the following pvc: https://raw.githubusercontent.com/kubevirt/kubevirt.github.io/master/labs/manifests/pvc_fedora.yml

Basically I am just following this lab (https://kubevirt.io/labs/kubernetes/lab2.html)

Anything else we need to know?: There are a couple of things:

curl -IL https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
HTTP/1.1 302 Found
Date: Wed, 16 Oct 2019 14:00:07 GMT
Server: Apache/2.4.38 (Fedora) mod_wsgi/4.6.4 Python/2.7
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Referrer-Policy: same-origin
Location: https://mirror.umd.edu/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
Content-Type: text/html; charset=UTF-8
AppTime: D=4059
X-Fedora-ProxyServer: proxy14.fedoraproject.org
X-Fedora-RequestID: XaciZziIA@GS2qJ1ifD@OQAAARA

HTTP/1.1 200 OK
Server: nginx/1.17.4
Date: Wed, 16 Oct 2019 14:00:07 GMT
Content-Type: application/octet-stream
Content-Length: 332267520
Last-Modified: Fri, 26 Apr 2019 02:13:11 GMT
Connection: keep-alive
ETag: "5cc26937-13ce0000"
Accept-Ranges: bytes

Environment:

mhenriks commented 4 years ago

@alosadagrande the import should be retried (with exponential backoff) until it succeeds, i.e. until download.fedoraproject.org stops redirecting to a nonexistent URL.
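For reference, retrying with exponential backoff generally looks like the sketch below. It is only an illustration in Python, not CDI's actual retry code; the function name and parameters (`retry_with_backoff`, `base_delay`, `factor`, `max_delay`) are made up for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation() until it succeeds or max_attempts is reached.

    The wait between attempts starts at base_delay seconds and is
    multiplied by factor after every failure, capped at max_delay.
    All names here are hypothetical, chosen for illustration only.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(min(delay, max_delay))
            delay *= factor
    raise ValueError("max_attempts must be at least 1")
```

With the defaults, an operation that fails twice and then succeeds waits 1 second, then 2 seconds, and returns on the third attempt. The `sleep` parameter exists only so the waiting can be replaced in tests.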

alosadagrande commented 4 years ago

I do not think it is a retry issue. It has been running all day, trying multiple Fedora qcow2 images, and none has been imported successfully.

I was able to import a couple of raw ones from the Fedora CDN without issue. I believe the mirrors are the same for both.

I just tested again. Take a look at the results:

Three different PVCs linked to three different images (Fedora 29 qcow2, Fedora 30 qcow2 and Fedora 30 raw):

[root@kubevirt-alosadag-nested ~]# kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
importer-fedora-29-pshh8    1/1     Running   4          3m42s
importer-fedora-pf5lq       0/1     Error     5          3m28s
importer-fedora-raw-7rcn4   1/1     Running   0          9s

Taking a look at the logs:

[root@kubevirt-alosadag-nested ~]# kubectl logs -f importer-fedora-29-pshh8
I1016 15:18:24.302284 1 importer.go:51] Starting importer
I1016 15:18:24.302616 1 importer.go:107] begin import process
I1016 15:18:25.241349 1 data-processor.go:252] Calculating available size
I1016 15:18:25.246345 1 data-processor.go:260] Checking out file system volume size.
I1016 15:18:25.246373 1 data-processor.go:264] Request image size not empty.
I1016 15:18:25.246395 1 data-processor.go:269] Target size 4Gi.
I1016 15:18:25.246497 1 data-processor.go:182] New phase: Convert
I1016 15:18:25.246512 1 data-processor.go:188] Validating image
I1016 15:18:26.707821 1 qemu.go:212] 0.00

kubectl logs -f importer-fedora-pf5lq
I1016 15:22:41.686092 1 importer.go:51] Starting importer
I1016 15:22:41.686439 1 importer.go:107] begin import process
I1016 15:22:42.233467 1 data-processor.go:252] Calculating available size
I1016 15:22:42.239891 1 data-processor.go:260] Checking out file system volume size.
I1016 15:22:42.239917 1 data-processor.go:264] Request image size not empty.
I1016 15:22:42.239938 1 data-processor.go:269] Target size 4Gi.
I1016 15:22:42.240015 1 data-processor.go:182] New phase: Convert
I1016 15:22:42.240026 1 data-processor.go:188] Validating image
I1016 15:22:42.767332 1 qemu.go:212] 0.00
E1016 15:22:43.396355 1 prlimit.go:164] qemu-img failed output is:
E1016 15:22:43.397248 1 prlimit.go:165] (0.00/100%) qemu-img: curl: The requested URL returned error: 404 Not Found
qemu-img: Could not open 'json: {"file.driver": "https", "file.url": "https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2", "file.timeout": 3600}': Could not read image for determining its format: Input/output error
E1016 15:22:43.397327 1 data-processor.go:179] exit status 1 qemu-img execution failed

But I can download it with wget:

wget https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
--2019-10-16 15:26:05-- https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
Resolving download.fedoraproject.org (download.fedoraproject.org)... 209.132.190.2, 152.19.134.198, 8.43.85.67, ...
Connecting to download.fedoraproject.org (download.fedoraproject.org)|209.132.190.2|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://mirrors.rit.edu/fedora/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2 [following]
--2019-10-16 15:26:05-- https://mirrors.rit.edu/fedora/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
Resolving mirrors.rit.edu (mirrors.rit.edu)... 129.21.171.72, 2620:8d:8000:15:225:90ff:fefd:344c
Connecting to mirrors.rit.edu (mirrors.rit.edu)|129.21.171.72|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 332267520 (317M) [application/octet-stream]
Saving to: ‘Fedora-Cloud-Base-30-1.2.x86_64.qcow2.1’

25% [============================> ] 83,599,360 79.7MB/s

[root@kubevirt-alosadag-nested ~]# kubectl logs -f importer-fedora-raw-7rcn4
I1016 15:20:04.023948 1 importer.go:51] Starting importer
I1016 15:20:04.026081 1 importer.go:107] begin import process
I1016 15:20:04.541594 1 data-processor.go:252] Calculating available size
I1016 15:20:04.547326 1 data-processor.go:260] Checking out file system volume size.
I1016 15:20:04.547365 1 data-processor.go:264] Request image size not empty.
I1016 15:20:04.547386 1 data-processor.go:269] Target size 4Gi.
I1016 15:20:04.651261 1 data-processor.go:182] New phase: TransferDataFile
I1016 15:20:04.655395 1 util.go:169] Writing data...
I1016 15:20:05.652914 1 prometheus.go:67] 0.20
I1016 15:20:06.653721 1 prometheus.go:67] 0.21
I1016 15:20:07.653875 1 prometheus.go:67] 0.94
I1016 15:20:08.654779 1 prometheus.go:67] 1.80
I1016 15:20:09.654980 1 prometheus.go:67] 2.60
I1016 15:20:10.655130 1 prometheus.go:67] 3.43
I1016 15:20:11.655695 1 prometheus.go:67] 4.27
I1016 15:20:12.655873 1 prometheus.go:67] 5.18
I1016 15:20:13.656027 1 prometheus.go:67] 6.07
I1016 15:20:14.656201 1 prometheus.go:67] 6.93
I1016 15:20:15.656774 1 prometheus.go:67] 7.83
I1016 15:20:16.656947 1 prometheus.go:67] 8.76
I1016 15:20:17.657130 1 prometheus.go:67] 9.59
I1016 15:20:18.657303 1 prometheus.go:67] 10.58
I1016 15:20:19.657458 1 prometheus.go:67] 11.54

mhenriks commented 4 years ago

> I do not think it is a retry issue. It has been running all day, trying multiple Fedora qcow2 images, and none has been imported successfully.

The issue is that download.fedoraproject.org is redirecting to a URL that is returning 404. From the log above:

qemu-img: curl: The requested URL returned error: 404 Not Found

The only way to deal with that is by retrying.

It may seem remarkable that you are having success with wget while CDI fails. But unless there is evidence of CDI not retrying, or of CDI failing without 404 Not Found in the log, I'm not sure what we can do here.

alosadagrande commented 4 years ago

Understood. I cannot say the issue is in CDI; it could be in GCP. Anyway, I was looking for some ideas on where to look.

On the other hand, if you take a look at the output here, the Fedora qcow2 image is being downloaded; it reached at least 1% before returning 404. So a connection is at least being established. What is weird, and the reason I opened the issue, is: why does this not happen with raw images?

mhenriks commented 4 years ago

@alosadagrande in this case (importing a qcow2 file over HTTP) qemu-img is converting the file to raw on the fly, so it will open multiple connections and seek to arbitrary offsets. So it is possible that some requests get redirected to bad URLs and some don't. If you want to dig into these failures a little more closely, maybe take CDI out of the picture and run the following locally.

qemu-img info https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2

qemu-img convert -p -O raw https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2 fedora.raw

Interestingly, I get the following error when running locally:

➜ ~ qemu-img info https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
qemu-img: Could not open 'https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2': CURL: Error opening file: Server does not support 'range' (byte ranges).
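One way to look into the 'range' error without qemu-img is to check whether a response advertises byte-range support. In the `curl -IL` output earlier in this thread, the final 200 response from the mirror carries `Accept-Ranges: bytes`, while the 302 response from download.fedoraproject.org does not. The Python sketch below checks for that header; it assumes the header is what matters here, which I have not verified against the qemu source, and the function names are hypothetical.

```python
import urllib.request
from typing import Dict, Mapping


def supports_byte_ranges(headers: Mapping[str, str]) -> bool:
    """Return True if the headers advertise 'Accept-Ranges: bytes'."""
    for name, value in headers.items():
        if name.lower() == "accept-ranges":
            # The value is a comma-separated list of range units.
            units = [unit.strip().lower() for unit in value.split(",")]
            return "bytes" in units
    return False


def final_response_headers(url: str, timeout: float = 30.0) -> Dict[str, str]:
    """Request url and return the headers of the final response.

    urllib follows redirects on its own, so after a 302 these are the
    mirror's headers. Depending on the Python version, the follow-up
    request may be sent as GET instead of HEAD; the body is never read.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling `supports_byte_ranges(final_response_headers(url))` a few times against the download.fedoraproject.org URL should show whether different mirrors answer differently.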

alosadagrande commented 4 years ago

@mhenriks thanks for the information.

In my case I had never seen the 'range' not supported issue, which btw explains why the image cannot be imported.

qemu-img info https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2
image: json:{"driver": "qcow2", "file": {"url": "https://download.fedoraproject.org/pub/fedora/linux/releases/30/Cloud/x86_64/images/Fedora-Cloud-Base-30-1.2.x86_64.qcow2", "driver": "https"}}
file format: qcow2
virtual size: 4.0G (4294967296 bytes)
disk size: unavailable
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16

Since this is a configuration that must be addressed on the Fedora CDN side, shouldn't I be getting the same issue with raw images as well? (That is not the case.)

mhenriks commented 4 years ago

> Since this is a configuration that must be addressed on the Fedora CDN side, shouldn't I be getting the same issue with raw images as well? (That is not the case.)

I cannot comment on what may or may not be happening once a request hits the fedora site (what/how requests get redirected) but as I noted earlier, qemu-img has different access patterns when accessing qcow2 vs raw.
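To put rough numbers on the access-pattern point, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every request to download.fedoraproject.org is redirected independently and that a fixed fraction `p_bad` of redirects lands on a mirror that is missing the file. Neither assumption has been measured, and the function name is made up.

```python
def failure_probability(p_bad: float, requests: int) -> float:
    """Probability that at least one of `requests` independent requests
    is redirected to a bad mirror, given a per-request chance p_bad."""
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("p_bad must be between 0 and 1")
    if requests < 0:
        raise ValueError("requests must not be negative")
    return 1.0 - (1.0 - p_bad) ** requests


if __name__ == "__main__":
    # Hypothetical figure: 5% of redirects hit a mirror without the file.
    for n in (1, 10, 50):
        print(n, round(failure_probability(0.05, n), 3))
```

Under these assumptions a single streamed download (the raw case) fails about 5% of the time, while an import that issues 50 separate requests fails more than 90% of the time. That would be consistent with raw imports usually working while qcow2 imports keep failing.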

aglitke commented 4 years ago

I don't think this can be resolved in CDI. QEMU has only a partial HTTP implementation and may not support more advanced features of the HTTP spec. The solution would be to mirror your desired image elsewhere. I'm going to close this issue per the above.

maya-r commented 4 years ago

The 404 looks like a mirror-specific issue. http://repo.ialab.dsu.edu/fedora/linux/releases/30/Cloud/x86_64/images/ is missing some files that are available on http://spout.ussg.indiana.edu/linux/fedora/linux/releases/30/Cloud/x86_64/images/. I'll try to contact the Fedora admins about this mirror.
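A quick way to compare mirrors is to request the same file name from each image directory and look at the status codes. This is a minimal Python sketch; `http_status` and `check_mirrors` are hypothetical helper names, and the list of mirror directories has to be supplied by whoever runs it.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for a HEAD request to url.

    Connection-level failures (urllib.error.URLError) are not caught
    and propagate to the caller.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # e.g. 404 when the mirror lacks the file


def check_mirrors(
    base_urls: Iterable[str],
    filename: str,
    probe: Callable[[str], int] = http_status,
) -> Dict[str, int]:
    """Map each mirror's full file URL to the status code it returns."""
    results = {}
    for base in base_urls:
        url = base.rstrip("/") + "/" + filename
        results[url] = probe(url)
    return results
```

Passing the two directories above together with `Fedora-Cloud-Base-30-1.2.x86_64.qcow2` as `filename` would show which mirror answers 404 for that file. The `probe` parameter is there so the check can be exercised without network access.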