vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

PaxHeader files in app package tarball fails deployment #17837

Closed AdamBelive closed 2 years ago

AdamBelive commented 3 years ago

[image] While doing the quick start tutorial, I followed the guide and it worked. Then I opened the files to see what was in them (album-recommendation-selfhosted) and retried the quick start guide, but an error appeared. As you can see in this screenshot, ".root.xml" is being parsed, and the "." prefix is causing the parser to throw.

kkraune commented 3 years ago

Work notes:

https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation-selfhosted/src/main/application/search/query-profiles/types/root.xml has the correct name.

I then followed the steps in https://docs.vespa.ai/en/vespa-quick-start.html - all worked fine

Can you please check this?

$ ls -la src/main/application/search/query-profiles/types/

Should output something like

total 8
drwxr-xr-x 3 kraune staff 96 May 12 17:22 .
drwxr-xr-x 4 kraune staff 128 May 12 17:22 ..
-rw-r--r-- 1 kraune staff 614 May 12 17:22 root.xml

AdamBelive commented 3 years ago

Here's the output:

drwxr-xr-x 3 ADAM 1000 96 12 May 12:54 .
drwxr-xr-x 5 ADAM 1000 160 12 May 12:56 ..
-rw-r--r--@ 1 ADAM 1000 614 12 May 12:54 root.xml

I believe it's the same

kkraune commented 3 years ago

Interesting! So either there is something wrong with the tar / gzip combo, or something breaks on the Vespa side when unpacking. I hope you can help by creating the tar file, untarring it, and checking the result. Could you also zip the tar file and make it available for us to download?

AdamBelive commented 3 years ago

Here is the gzipped tar file: application.tar.gz

kkraune commented 3 years ago

thanks! the archive looks right. looking at options ...

kkraune commented 3 years ago

I am able to replicate:

$ curl --header Content-Type:application/x-gzip --data-binary @/Users/kraune/Downloads/application.tar.gz localhost:19071/application/v2/tenant/default/prepareandactivate

{"error-code":"INVALID_APPLICATION_PACKAGE","message":"Invalid application package: default.default: Error loading model: Could not parse '._root.xml', error at line 1, column 1: Content is not allowed in prolog."}

just curious: can you try using zip instead of gzip (ref https://cloud.vespa.ai/en/getting-started for how to create the zip) - and deploy with Content-Type:application/zip ?

kkraune commented 3 years ago

CC @jonmv / @hmusum https://docs.vespa.ai/en/cloudconfig/deploy-rest-api-v2.html

AdamBelive commented 3 years ago

image I think that's what you wanted me to do?

kkraune commented 3 years ago

not really, you have to create the .zip file as in the cloud guide and POST it like

curl --header Content-Type:application/zip --data-binary @application.zip localhost:19071/application/v2/tenant/default/prepareandactivate

AdamBelive commented 3 years ago

it worked!

AdamBelive commented 3 years ago

Thanks for the help!

kkraune commented 3 years ago

great news! I will keep this ticket open for my own investigation to compare the tar files. Thanks for reporting!

kkraune commented 3 years ago

diff-ing your and my tar files:

$ diff --text err/e.tar ok/o.tar 
1c1
< ./000755 000766 000024 00000000000 14051477055 011654 5ustar00kraunestaff000000 000000 ./hosts.xml000644 000766 000024 00000000623 14047004143 013524 0ustar00kraunestaff000000 000000 <?xml version="1.0" encoding="utf-8" ?>
---
> ./000755 000766 000024 00000000000 14051476363 011655 5ustar00kraunestaff000000 000000 ./hosts.xml000644 000766 000024 00000000623 14051476363 013540 0ustar00kraunestaff000000 000000 <?xml version="1.0" encoding="utf-8" ?>
17c17
< ./services.xml000644 000766 000024 00000004652 14047004143 014215 0ustar00kraunestaff000000 000000 <?xml version="1.0" encoding="utf-8" ?>
---
> ./services.xml000644 000766 000024 00000004652 14051476363 014231 0ustar00kraunestaff000000 000000 <?xml version="1.0" encoding="utf-8" ?>
81c81
< ./schemas/000755 000766 000024 00000000000 14047004143 013264 5ustar00kraunestaff000000 000000 ./search/000755 000766 000024 00000000000 14051476221 013113 5ustar00kraunestaff000000 000000 ./search/query-profiles/000755 000766 000024 00000000000 14051476237 016110 5ustar00kraunestaff000000 000000 ./search/query-profiles/types/000755 000766 000024 00000000000 14047004143 017240 5ustar00kraunestaff000000 000000 ./search/query-profiles/default.xml000644 000766 000024 00000000753 14047004143 020247 0ustar00kraunestaff000000 000000 <!-- Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->
---
> ./schemas/000755 000766 000024 00000000000 14051476363 013300 5ustar00kraunestaff000000 000000 ./search/000755 000766 000024 00000000000 14051476363 013122 5ustar00kraunestaff000000 000000 ./search/query-profiles/000755 000766 000024 00000000000 14051476363 016110 5ustar00kraunestaff000000 000000 ./search/query-profiles/types/000755 000766 000024 00000000000 14051476363 017254 5ustar00kraunestaff000000 000000 ./search/query-profiles/default.xml000644 000766 000024 00000000753 14051476363 020263 0ustar00kraunestaff000000 000000 <!-- Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->
93,94c93
< ./search/query-profiles/types/._root.xml000644 000766 000024 00000000260 14047004143 021160 0ustar00kraunestaff000000 000000 Mac OS X         2~?ATTR???com.apple.lastuseddate#PS?`ă?./search/query-profiles/types/PaxHeader/root.xml000644 000766 000024 00000000035 14047004143 022714 xustar00kraunestaff000000 000000 29 mtime=1620838499.50353989
< ./search/query-profiles/types/root.xml000644 000766 000024 00000001146 14047004143 020747 0ustar00kraunestaff000000 000000 <!-- Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->
---
> ./search/query-profiles/types/root.xml000644 000766 000024 00000001146 14051476363 020763 0ustar00kraunestaff000000 000000 <!-- Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root. -->
107c106
< ./schemas/music.sd000644 000766 000024 00000002264 14047004143 014740 0ustar00kraunestaff000000 000000 # Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
---
> ./schemas/music.sd000644 000766 000024 00000002264 14051476363 014754 0ustar00kraunestaff000000 000000 # Copyright Verizon Media. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
148c147
< 
\ No newline at end of file
---
> 
\ No newline at end of file
kkraune commented 3 years ago

I am not a tar expert, but this seems to be a problem with a PaxHeader file - I don't yet fully understand why one tar file got it and not the other, but Vespa should extract the tar correctly regardless, @jonmv
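The diff above shows an extra `._root.xml` entry in the failing archive, and that is the file named in the parse error. A quick way to check an archive for such entries before deploying could look like the sketch below. The function name `list_appledouble_entries` is made up for illustration, and it assumes a gzipped tarball and a `tar` that supports `-tzf`.

```shell
# Hypothetical helper: print every entry in a gzipped tarball whose
# file name starts with "._" (AppleDouble metadata files).
# Prints nothing if the archive contains no such entries.
list_appledouble_entries() {
    # "(^|/)\._" matches "._" at the start of the path or right after a "/"
    tar -tzf "$1" | grep -E '(^|/)\._' || true
}
```

Usage: `list_appledouble_entries application.tar.gz`. Any output means the archive carries metadata files that are not part of the application package source.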

kkraune commented 3 years ago

https://stackoverflow.com/questions/34688392/paxheaders-in-tarball

kkraune commented 3 years ago

I ran into this again myself:

$ tar -C my-app -cf - . | gzip | curl --header Content-Type:application/x-gzip --data-binary @- localhost:19071/application/v2/tenant/default/prepareandactivate
{"error-code":"INVALID_APPLICATION_PACKAGE","message":"Invalid application package: Error loading default.default: Could not parse sd file '._news.sd': Unknown symbol: Lexical error at line -1, column 2. Encountered: \"\u0000\" (0), after : \"\tar -C my-app -cf - . | gzip | curl --header Content-Type:application/x-gzip --data-binary @- localhost:19071/application/v2/tenant/default/prepareandactivate

$ tar -C my-app -cf - . > tar.out

$ cat tar.out

./schemas/._news.sd000644 000766 000024 00000000260 14055634522 015014 0ustar00kraunestaff000000 000000 Mac OS X 2~?ATTR???com.apple.lastuseddate#PS?8?`F?./schemas/PaxHeader/news.sd000644 000766 000024 00000000036 14055634522 016551 xustar00kraunestaff000000 000000 30 mtime=1622620498.545294193 ./schemas/news.sd000644 000766 000024 00000002116 14055634522 014601 0ustar00kraunestaff000000 000000 schema news { document news {

$ tar --version
bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6
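For anyone who still wants to build a tarball on macOS, a possible way to keep the `._` entries out is sketched below. This is an assumption-laden workaround, not something verified against Vespa: `pack_app` is a hypothetical helper name, `COPYFILE_DISABLE=1` is the environment variable commonly used to stop the macOS tar from writing AppleDouble entries, and `--exclude` drops any `._*` files that already exist on disk.

```shell
# Hypothetical helper: pack an application directory into a gzipped tarball
# while leaving out AppleDouble "._*" metadata files.
#   $1 = application directory, $2 = output .tar.gz path
pack_app() {
    app_dir="$1"
    out="$2"
    # COPYFILE_DISABLE=1: assumed to suppress "._" entries generated from
    # extended attributes by macOS tar; other tar builds ignore it.
    # --exclude='._*': skips "._*" files that are present on disk.
    COPYFILE_DISABLE=1 tar -C "$app_dir" --exclude='._*' -czf "$out" .
}
```

Usage: `pack_app my-app application.tar.gz`, then list the result with `tar -tzf application.tar.gz` to confirm that only the expected files are present.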

kkraune commented 3 years ago

review some docs in https://github.com/vespa-engine/documentation/pull/1326

kkraune commented 2 years ago

https://docs.vespa.ai/en/reference/application-packages-reference.html#deploy was updated a long time ago: use zip, not tar/gzip. I could not find any examples of tar/gzip use in the docs or sample apps, so closing.