Closed mcowger closed 4 years ago
I believe the real reason this is failing is that we don't support SO_DEFER_ACCEPT.
I1219 22:53:14.348108 2791 x:0] [ 3] python E setsockopt(0x5 socket:[3], 0x1, 0x9, 0x7faf9cffb600, 0x4)
I1219 22:53:14.348127 2791 x:0] [ 3] python X setsockopt(0x5 socket:[3], 0x1, 0x9, 0x7faf9cffb600, 0x4) = 0x0 (7.473µs)
I1219 22:53:14.348185 2791 x:0] [ 3] python E getsockopt(0x5 socket:[3], 0x6, 0x9, 0x7faf9cffb3c0, 0x7faf9cffb3c4)
I1219 22:53:14.348265 2791 x:0] Unsupported syscall: getsockopt, regs: amd64:
I1219 22:53:14.348725 2791 x:0] [ 3] python E close(0x5 socket:[3]) I1219 22:53:14.348834 2791 x:0] [ 3] python X close(0x5 socket:[3]) = 0x0 (
That said its rather strange that its trying to set SO_DEFER_ACCEPT on a connecting socket and not a listening one.
Just stumbled on it as well with Celery. Do I understand it right that this syscall is simply not implemented, i.e. it's not a matter of a missing capability that could be granted to the container? Thanks!
SO_DEFER_ACCEPT is now supported. But thats likely not the reason why it's failing. Could you try again and if it fails provide a full strace log and repro steps.
Actually nm I see the steps above. Let me try then and see .
Not sure if that gives any insight on the issue, but I'm getting the exact same warning with my Elixir application. The application only serves a GraphQL endpoint and I get several lines of "Unsupported syscall getsockopt" for every API request.
Hey guys, I'm just fronting the same problem. Any light on it ?
Apologies this fell of my radar. Let me spend sometime to look at this right now.
It doesn't look like an issue with running the image. It fails to connect and from the error it seems like its failing to resolve the amqp:// address. Let me take a look
docker run --runtime=runsc-d-p 8080:8080 celery Unable to find image 'celery:latest' locally latest: Pulling from library/celery ef0380f84d05: Pull complete ada810c79ed7: Pull complete 4608a1c4fe47: Pull complete 58086cbb21fb: Pull complete a7bccb4a3faa: Pull complete 9de06a08ec25: Pull complete ad6feb8c6a6b: Pull complete 7568ca85d492: Pull complete 2d6f458f7411: Pull complete Digest: sha256:5c236059192a0389a2be21fc42d8db59411d953b7af5457faf501d4eec32dc31 Status: Downloaded newer image for celery:latest [2020-05-06 15:39:34,841: WARNING/MainProcess] /usr/local/lib/python3.5/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED)) [2020-05-06 15:39:36,647: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit:5672//: [Errno -2] Name or service not known. Trying again in 2.00 seconds...
[2020-05-06 15:39:38,857: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit:5672//: [Errno -2] Name or service not known. Trying again in 4.00 seconds...
[2020-05-06 15:39:43,046: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit:5672//: [Errno -2] Name or service not known. Trying again in 6.00 seconds...
Please note that the issue you are seeing with name resolution is not the one I reported. Name resolution within the container is not the challenge we are seeing with the failing syscall.
On Wed, May 6, 2020 at 8:41 AM Bhasker Hariharan notifications@github.com wrote:
It doesn't look like an issue with running the image. It fails to connect and from the error it seems like its failing to resolve the amqp:// address. Let me take a look
docker run --runtime=runsc-d-p 8080:8080 celery Unable to find image 'celery:latest' locally latest: Pulling from library/celery ef0380f84d05: Pull complete ada810c79ed7: Pull complete 4608a1c4fe47: Pull complete 58086cbb21fb: Pull complete a7bccb4a3faa: Pull complete 9de06a08ec25: Pull complete ad6feb8c6a6b: Pull complete 7568ca85d492: Pull complete 2d6f458f7411: Pull complete Digest: sha256:5c236059192a0389a2be21fc42d8db59411d953b7af5457faf501d4eec32dc31 Status: Downloaded newer image for celery:latest [2020-05-06 15:39:34,841: WARNING/MainProcess] /usr/local/lib/python3.5/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED)) [2020-05-06 15:39:36,647: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit https://github.com/rabbit:5672//: [Errno -2] Name or service not known. Trying again in 2.00 seconds...
[2020-05-06 15:39:38,857: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit https://github.com/rabbit:5672//: [Errno -2] Name or service not known. Trying again in 4.00 seconds...
[2020-05-06 15:39:43,046: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@rabbit https://github.com/rabbit:5672//: [Errno -2] Name or service not known. Trying again in 6.00 seconds...
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/gvisor/issues/1441#issuecomment-624725552, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOXCTHK2VKZXKDDJU2SCMTRQGAKDANCNFSM4J5SIQCQ .
-- -- Matt
I started up a rabbitmq server locally and it works. Here are the commands
docker run -d --hostname localhost --name some-rabbit rabbitmq:3
docker run --runtime=runsc --link some-rabbit:rabbit -p 8080:8080 celery [2020-05-06 15:50:19,235: WARNING/MainProcess] /usr/local/lib/python3.5/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED)) [2020-05-06 15:50:21,965: WARNING/MainProcess] celery@1e91a5dba68f ready.
@mcowger hmm I scanned the logs for the repro I tried and I don't see celery making that call at all. Could you provide a bit more details on your rabbitMQ setup and how you are connecting to it.
If you aren’t actually submitting jobs to the AMQP server via celery (like the reproduction instructions above) then the test is not complete. The issue is not running celery under docker, the issue to issuing AMQP message to celery from python running under runc. There is a clear sample code to reproduce in the gist linked above.
On Wed, May 6, 2020 at 8:52 AM Bhasker Hariharan notifications@github.com wrote:
I started up a rabbitmq server locally and it works. Here are the commands
docker run -d --hostname localhost --name some-rabbit rabbitmq:3
docker run --runtime=runsc --link some-rabbit:rabbit -p 8080:8080 celery [2020-05-06 15:50:19,235: WARNING/MainProcess] /usr/local/lib/python3.5/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED)) [2020-05-06 15:50:21,965: WARNING/MainProcess] celery@1e91a5dba68f ready.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/gvisor/issues/1441#issuecomment-624731880, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOXCTEIZCQWTO3WZ4VR5RTRQGBSDANCNFSM4J5SIQCQ .
-- -- Matt
Ah apologies I read the report again so its not celery that's failing but your python application that is connecting to celery?
Let me try that sorry about that.
I0506 16:21:29.306868 175117 strace.go:628] [ 3] python X getsockopt(0x5 socket:[19], SOL_TCP, TCP_DEFER_ACCEPT, 0x7fc1f83fb3c0 {value=0}, 0x7fc1f83fb3c4 {length=4}) = 0x0 (227.514µs) I0506 16:21:29.307040 175117 strace.go:590] [ 3] python E getsockopt(0x5 socket:[19], SOL_TCP, TCP_SYNCNT, 0x7fc1f83fb3c0, 0x7fc1f83fb3c4 {length=4}) I0506 16:21:29.307097 175117 strace.go:628] [ 3] python X getsockopt(0x5 socket:[19], SOL_TCP, TCP_SYNCNT, 0x7fc1f83fb3c0, 0x7fc1f83fb3c4 {length=4}) = 0x0 errno=92 (protocol not available) (20.743µs) I0506 16:21:29.307268 175117 strace.go:578] [ 3] python E close(0x5 socket:[19])
Looks like its now not failing on TCP_DEFER_ACCEPT but on another socket option that we don't support yet. TCP_SYNCNT.
TCP_SYNCNT (since Linux 2.4) Set the number of SYN retransmits that TCP should send before aborting the attempt to connect. It cannot exceed 255. This option should not be used in code intended to be portable.
Netstack doesn't control retries using a count of SYN's but its just bounded by time. Let me stub out this option for now so that it doesn't return an error and see if it makes progress.
Strange now I stubbed that out and it flunks in another getsockopt TCP_WINDOW_CLAMP but now I don't see it making the TCP_SYNCNT call.
I0506 16:41:55.278268 190228 compat.go:122] Unsupported syscall: getsockopt, regs: amd64:{rax:18446744073709551578 rbx:139660612445664 rcx:139660652557770 rdx:10 rsi:6 rdi:5 rsp:139660623852456 rbp:139660623852476 r8:139660623852484 r9:39 r10:139660623852480 r11:659 r12:139660637795321 r13:139660659080256 r14:139660658854592 r15:139660525944904 rip:139660652557770 rflags:659 orig_rax:55 cs:51 ss:43 fs_base:139660623873792}
I0506 16:41:55.279419 190228 strace.go:628] [ 3] python X getsockopt(0x5 socket:[3], SOL_TCP, TCP_WINDOW_CLAMP, 0x7f0545dfb3c0, 0x7f0545dfb3c4 {length=4}) = 0x0 errno=92 (protocol not available) (1.16589ms)
Okay stubbing those two out seems to have fixed it. I now see packets going back and forth w/ AMQP (that said the curl request does not return.) But that seems unrelated to a gvisor issue and probably just that something about my local setup of rabbitmq+ celery is probably not correct.
I see packets like this going back and forth.
5345 3394.254142701 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880087 Win=64128 Len=0 TSval=1949979946 TSecr=3707238452
5346 3394.254496590 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5347 3394.254532797 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880171 Win=64128 Len=0 TSval=1949979946 TSecr=3707238452
5348 3394.254842587 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5349 3394.254872180 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880449 Win=64128 Len=0 TSval=1949979947 TSecr=3707238453
5350 3394.255804983 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5351 3394.256146501 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669080124 Win=1048064 Len=0 TSval=865816751 TSecr=1949979948
^[[5~ 5352 3396.259715829 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5353 3396.259788466 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880490 Win=64128 Len=0 TSval=1949981952 TSecr=3707240458
5354 3396.260151838 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5355 3396.260184177 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880574 Win=64128 Len=0 TSval=1949981952 TSecr=3707240458
5356 3396.260505283 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5357 3396.260533305 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880852 Win=64128 Len=0 TSval=1949981952 TSecr=3707240458
5358 3396.261005551 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5359 3396.261265557 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669080539 Win=1048064 Len=0 TSval=865818756 TSecr=1949981953
5360 3397.354087208 02:42:bd:d6:cd:ca → 02:42:c0:a8:09:03 ARP 42 Who has 192.168.9.3? Tell 192.168.9.1
5361 3397.354499185 02:42:c0:a8:09:03 → 02:42:bd:d6:cd:ca ARP 42 192.168.9.3 is at 02:42:c0:a8:09:03
5362 3398.265989317 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5363 3398.266076361 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880893 Win=64128 Len=0 TSval=1949983958 TSecr=3707242464
5364 3398.266507185 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5365 3398.266549194 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323880977 Win=64128 Len=0 TSval=1949983958 TSecr=3707242464
5366 3398.266895787 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5367 3398.266927073 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881255 Win=64128 Len=0 TSval=1949983959 TSecr=3707242465
5368 3398.267581418 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5369 3398.267906279 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669080954 Win=1048064 Len=0 TSval=865820763 TSecr=1949983959
5370 3400.273521509 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5371 3400.273586881 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881296 Win=64128 Len=0 TSval=1949985965 TSecr=3707244471
5372 3400.274033527 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5373 3400.274110673 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881380 Win=64128 Len=0 TSval=1949985966 TSecr=3707244472
5374 3400.274504739 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5375 3400.274553376 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881658 Win=64128 Len=0 TSval=1949985966 TSecr=3707244472
5376 3400.275106224 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5377 3400.275385423 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669081369 Win=1048064 Len=0 TSval=865822770 TSecr=1949985967
5378 3402.279475689 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5379 3402.279529683 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881699 Win=64128 Len=0 TSval=1949987971 TSecr=3707246477
5380 3402.279822370 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5381 3402.279856663 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323881783 Win=64128 Len=0 TSval=1949987972 TSecr=3707246478
5382 3402.280145915 192.168.9.3 → 192.168.9.2 AMQP 342 Content-Body (application/json)
5383 3402.280185891 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882059 Win=64128 Len=0 TSval=1949987972 TSecr=3707246478
5384 3402.280600908 192.168.9.2 → 192.168.9.3 AMQP 479 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5385 3402.280888065 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669081782 Win=1048064 Len=0 TSval=865824776 TSecr=1949987972
5386 3404.285404506 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5387 3404.285473204 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882100 Win=64128 Len=0 TSval=1949989977 TSecr=3707248483
5388 3404.285878425 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5389 3404.285916112 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882184 Win=64128 Len=0 TSval=1949989978 TSecr=3707248484
5390 3404.286373040 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5391 3404.286409072 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882462 Win=64128 Len=0 TSval=1949989978 TSecr=3707248484
5392 3404.287043351 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5393 3404.287336718 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669082197 Win=1048064 Len=0 TSval=865826782 TSecr=1949989979
5394 3406.291692821 192.168.9.3 → 192.168.9.2 AMQP 107 Basic.Publish x=celeryev rk=worker.heartbeat
5395 3406.291758482 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882503 Win=64128 Len=0 TSval=1949991984 TSecr=3707250490
5396 3406.292168290 192.168.9.3 → 192.168.9.2 AMQP 150 Content-Header type=application/json
5397 3406.292199229 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882587 Win=64128 Len=0 TSval=1949991984 TSecr=3707250490
5398 3406.292491836 192.168.9.3 → 192.168.9.2 AMQP 344 Content-Body (application/json)
5399 3406.292518760 192.168.9.2 → 192.168.9.3 TCP 66 5672 → 38964 [ACK] Seq=3702026781 Ack=3323882865 Win=64128 Len=0 TSval=1949991984 TSecr=3707250490
5400 3406.293005423 192.168.9.2 → 192.168.9.3 AMQP 481 Basic.Deliver x=celeryev rk=worker.heartbeat Content-Header type=application/json Content-Body (application/json)
5401 3406.293379361 192.168.9.3 → 192.168.9.2 TCP 66 38963 → 5672 [ACK] Seq=586704009 Ack=669082612 Win=1048064 Len=0 TSval=865828788 TSecr=1949991985
I will open two bugs to add TCP_SYNCNT and TCP_WINDOW_CLAMP. I think we can probably start by just stubbing them out to unblock this and then look at implementing it properly.
I0506 17:04:01.417799 211953 strace.go:584] [ 3] python E socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP)
I0506 17:04:01.417898 211953 strace.go:622] [ 3] python X socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 0x9 (47.271µs)
I0506 17:04:01.418085 211953 strace.go:584] [ 3] python E fcntl(0x9 socket:[7], 0x1, 0x0)
I0506 17:04:01.418135 211953 strace.go:622] [ 3] python X fcntl(0x9 socket:[7], 0x1, 0x0) = 0x1 (3.13µs)
I0506 17:04:01.418280 211953 strace.go:584] [ 3] python E fcntl(0x9 socket:[7], 0x2, 0x1)
I0506 17:04:01.418314 211953 strace.go:622] [ 3] python X fcntl(0x9 socket:[7], 0x2, 0x1) = 0x0 (3.584µs)
I0506 17:04:01.418429 211953 strace.go:584] [ 3] python E ioctl(0x9 socket:[7], 0x5421, 0x7fa6e69fb64c)
I0506 17:04:01.418468 211953 strace.go:622] [ 3] python X ioctl(0x9 socket:[7], 0x5421, 0x7fa6e69fb64c) = 0x0 (8.852µs)
I0506 17:04:01.418609 211953 strace.go:584] [ 3] python E connect(0x9 socket:[7], 0x7fa6e69fb5f0 {Family: AF_INET, Addr: 192.168.9.1, Port: 8080}, 0x10)
I0506 17:04:01.418696 211953 strace.go:622] [ 3] python X connect(0x9 socket:[7], 0x7fa6e69fb5f0 {Family: AF_INET, Addr: 192.168.9.1, Port: 8080}, 0x10) = 0x0 errno=115 (operation now in progress) (52
.179µs)
I0506 17:04:01.418810 211953 sniffer.go:448] send tcp 192.168.9.4:0 -> 192.168.9.1:0 len:40 id:0000 invalid packet: tcp data offset 40 larger than packet buffer length 20
I0506 17:04:01.418932 211953 strace.go:584] [ 3] python E poll(0x7fa6e69fb4f8 [{FD: 0x9 socket:[7], Events: POLLOUT|POLLERR, REvents: ...}], 0x1, 0xfa0)
I0506 17:04:01.419036 211953 sniffer.go:448] recv tcp 192.168.9.1:0 -> 192.168.9.4:0 len:40 id:0000 invalid packet: tcp data offset 40 larger than packet buffer length 20
I0506 17:04:01.419271 211953 sniffer.go:448] send tcp 192.168.9.4:0 -> 192.168.9.1:0 len:32 id:0000 invalid packet: tcp data offset 32 larger than packet buffer length 20 gso: &{Type:1 NeedsCsum:true Cs
umOffset:16 MSS:1420 L3HdrLen:20 MaxSize:65536}
I0506 17:04:01.419371 211953 strace.go:622] [ 3] python X poll(0x7fa6e69fb4f8 [{FD: 0x9 socket:[7], Events: POLLOUT|POLLERR|POLLHUP, REvents: POLLOUT}], 0x1, 0xfa0) = 0x1 (389.53µs)
I0506 17:04:01.419469 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_SOCKET, SO_ERROR, 0x7fa6e69fb4f8, 0x7fa6e69fb4fc {length=4})
I0506 17:04:01.419515 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_SOCKET, SO_ERROR, 0x7fa6e69fb4f8 {value=0}, 0x7fa6e69fb4fc {length=4}) = 0x0 (10.349µs)
I0506 17:04:01.419626 211953 strace.go:584] [ 3] python E ioctl(0x9 socket:[7], 0x5421, 0x7fa6e69fb64c)
I0506 17:04:01.419667 211953 strace.go:622] [ 3] python X ioctl(0x9 socket:[7], 0x5421, 0x7fa6e69fb64c) = 0x0 (5.978µs)
I0506 17:04:01.419778 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_SOCKET, SO_KEEPALIVE, 0x7fa6e69fb600 {value=1}, 0x4)
I0506 17:04:01.419844 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_SOCKET, SO_KEEPALIVE, 0x7fa6e69fb600, 0x4) = 0x0 (18.73µs)
I0506 17:04:01.419962 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_CORK, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420003 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_CORK, 0x7fa6e69fb3c0 {value=0}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (6.128µs)
I0506 17:04:01.420121 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_LINGER2, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420160 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_LINGER2, 0x7fa6e69fb3c0 {value=60}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (5.798µs)
I0506 17:04:01.420252 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_SYNCNT, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420260 211953 sniffer.go:448] recv tcp 192.168.9.1:0 -> 192.168.9.4:0 len:32 id:ab60 invalid packet: tcp data offset 32 larger than packet buffer length 20
I0506 17:04:01.420290 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_SYNCNT, 0x7fa6e69fb3c0 {value=15}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (4.663µs)
I0506 17:04:01.420419 211953 sniffer.go:448] send tcp 192.168.9.4:0 -> 192.168.9.1:0 len:32 id:0000 invalid packet: tcp data offset 32 larger than packet buffer length 20 gso: &{Type:1 NeedsCsum:true CsumOffset:16 MSS:1420 L3HdrLen:20 MaxSize:65536}
I0506 17:04:01.420492 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_QUICKACK, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420562 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_QUICKACK, 0x7fa6e69fb3c0 {value=1}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (32.187µs)
I0506 17:04:01.420655 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_DEFER_ACCEPT, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420710 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_DEFER_ACCEPT, 0x7fa6e69fb3c0 {value=0}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (5.901µs)
I0506 17:04:01.420799 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_WINDOW_CLAMP, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420837 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_WINDOW_CLAMP, 0x7fa6e69fb3c0 {value=1048576}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (4.57µs)
I0506 17:04:01.420928 211953 strace.go:590] [ 3] python E getsockopt(0x9 socket:[7], SOL_TCP, TCP_MAXSEG, 0x7fa6e69fb3c0, 0x7fa6e69fb3c4 {length=4})
I0506 17:04:01.420965 211953 strace.go:628] [ 3] python X getsockopt(0x9 socket:[7], SOL_TCP, TCP_MAXSEG, 0x7fa6e69fb3c0 {value=536}, 0x7fa6e69fb3c4 {length=4}) = 0x0 (4.75µs)
I0506 17:04:01.421058 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPINTVL, 0x7fa6e69fb4b0 {value=10}, 0x4)
I0506 17:04:01.421106 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPINTVL, 0x7fa6e69fb4b0, 0x4) = 0x0 (18.864µs)
I0506 17:04:01.421189 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_CORK, 0x7fa6e69fb4b0 {value=0}, 0x4)
I0506 17:04:01.421236 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_CORK, 0x7fa6e69fb4b0, 0x4) = 0x0 (17.548µs)
:I0506 17:04:01.421321 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_USER_TIMEOUT, 0x7fa6e69fb4b0 {value=1000}, 0x4)
I0506 17:04:01.421354 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_USER_TIMEOUT, 0x7fa6e69fb4b0, 0x4) = 0x0 (4.277µs)
I0506 17:04:01.421437 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_NODELAY, 0x7fa6e69fb4b0 {value=1}, 0x4)
I0506 17:04:01.421481 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_NODELAY, 0x7fa6e69fb4b0, 0x4) = 0x0 (15.492µs)
I0506 17:04:01.421579 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPCNT, 0x7fa6e69fb4b0 {value=9}, 0x4)
I0506 17:04:01.421625 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPCNT, 0x7fa6e69fb4b0, 0x4) = 0x0 (17.197µs)
I0506 17:04:01.421709 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_LINGER2, 0x7fa6e69fb4b0 {value=60}, 0x4)
I0506 17:04:01.421744 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_LINGER2, 0x7fa6e69fb4b0, 0x4) = 0x0 (6.304µs)
I0506 17:04:01.421828 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_SYNCNT, 0x7fa6e69fb4b0 {value=15}, 0x4)
I0506 17:04:01.421865 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_SYNCNT, 0x7fa6e69fb4b0, 0x4) = 0x0 (9.2µs)
I0506 17:04:01.421948 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPIDLE, 0x7fa6e69fb4b0 {value=60}, 0x4)
I0506 17:04:01.421992 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_KEEPIDLE, 0x7fa6e69fb4b0, 0x4) = 0x0 (15.499µs)
I0506 17:04:01.422076 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_QUICKACK, 0x7fa6e69fb4b0 {value=1}, 0x4)
I0506 17:04:01.422107 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_QUICKACK, 0x7fa6e69fb4b0, 0x4) = 0x0 (3.412µs)
I0506 17:04:01.422190 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_DEFER_ACCEPT, 0x7fa6e69fb4b0 {value=0}, 0x4)
I0506 17:04:01.422222 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_DEFER_ACCEPT, 0x7fa6e69fb4b0, 0x4) = 0x0 (3.713µs)
I0506 17:04:01.422304 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_WINDOW_CLAMP, 0x7fa6e69fb4b0 {value=1048576}, 0x4)
I0506 17:04:01.422341 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_WINDOW_CLAMP, 0x7fa6e69fb4b0, 0x4) = 0x0 (9.229µs)
I0506 17:04:01.422424 211953 strace.go:590] [ 3] python E setsockopt(0x9 socket:[7], SOL_TCP, TCP_MAXSEG, 0x7fa6e69fb4b0 {value=536}, 0x4)
I0506 17:04:01.422467 211953 strace.go:628] [ 3] python X setsockopt(0x9 socket:[7], SOL_TCP, TCP_MAXSEG, 0x7fa6e69fb4b0, 0x4) = 0x0 (14.656µs)
that said the curl request does not return.
Unless celery is setup with workers (which it isn't by default) thats expected behaviorm, as there's no workers to complete the task request. I agree that the curl request isn't relevant to this issue, and that if you've get messages going in and gettng accepted then its probably good to go.
@prattmic pointed the following out, Looks like AMQP queries TCP options to figure out which ones are supported on what platform so it just sort of skips the setsockopt if a particular one is not supported.
https://github.com/celery/py-amqp/blob/ccbe683cfd30aef75118223088984f7c9ace43e4/amqp/transport.py#L215 https://github.com/celery/py-amqp/blob/ccbe683cfd30aef75118223088984f7c9ace43e4/amqp/transport.py#L195 https://github.com/celery/py-amqp/blob/79833d3e261ad89638ad41c997b31f29a61eb6bf/amqp/platform.py#L37
I believe this should now work. @mcowger Could you try and update the bug?
Unfortunately I dont have the toolset setup/required to build gvisor from source, so I'll just accept your results. thanks!
May we know approximately when this fix could be available in GKE Sandbox? Thanks!
Typically GKE sandbox should get the update in the next 2-3 weeks.
Typically GKE sandbox should get the update in the next 2-3 weeks.
Does this timeline apply to Cloud Run as well?
Typically yes but please note these are estimates, actual rollout times do vary a bit. We gVisor team have no real control over the rollout processes and CloudRun and GKE teams decide when they pick up a particular gVisor version based on the stability of their corresponding systems.
Thanks, a much appreciated insight!
When running a simple application to submit jobs via AMQP to a Celery server, gvisor prevents the sockets from being opened to send the data. Debug logs are attached.
Base simple application that listens via HTTP and makes request, along with python requirements and Dockerfile: https://gist.github.com/mcowger/7d4ab07a75dc1ddddd1f1fb20dc5d8fc
When running this compiled docker image under regular containerd, it runs fine. When run under the runsc runtime, it fails. When run under Google Cloud Run (where I actually first encountered the issue), the error is easier to see (though gvisor tries to disclaim responsibility):
Container Sandbox: Unsupported syscall getsockopt(0x5,0x6,0xa,0x3ee2d2dfb3c0,0x3ee2d2dfb3c4,0x0). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/getsockopt for more information.
Other useful information to include is:
runsc -v
runsc version release-20191213.0 spec: 1.0.1-dev
docker version
ordocker info
if more relevantServer: Docker Engine - Community Engine: Version: 19.03.5 API version: 1.40 (minimum version 1.12) Go version: go1.12.12 Git commit: 633a0ea838 Built: Wed Nov 13 07:24:29 2019 OS/Arch: linux/amd64 Experimental: false containerd: Version: 1.2.10 GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339 runc: Version: 1.0.0-rc8+dev GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657 docker-init: Version: 0.18.0 GitCommit: fec3683
uname -a
-git describe
Linux devbox 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux
runsc.bug.zip