Closed jrafanie closed 4 years ago
Maybe this should use a local unix socket? Thoughts?
Files with Coverage Reduction | New Missed Lines | % |
---|---|---|
app/models/miq_ae_datastore.rb | 1 | 71.08% |
lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_method.rb | 1 | 79.38% |
lib/miq_automation_engine/engine/miq_ae_engine/miq_ae_object.rb | 1 | 90.99% |
app/models/miq_ae_yaml_import.rb | 4 | 97.29% |
Total: | 7 | |
Totals | |
---|---|
Change from base Build 3798: | 0.02% |
Covered Lines: | 5121 |
Relevant Lines: | 5960 |
Totals | |
---|---|
Change from base Build 3798: | 0.02% |
Covered Lines: | 5072 |
Relevant Lines: | 5903 |
@jrafanie As discussed, should we just use a unix socket instead?
Also, FWIW, on my machine here's what I get...also using verizon fios dns at home...
require "drb"
drb = DRb.start_service
drb.uri
# => "druby://jfrey-mac.fios-router.home:56755"
That resolves to a 192.168.1.### address.
(EDIT: Discussed offline with @jrafanie and it turns out my modern router actually bridges the 5GHz and 2Ghz ranges to the ethernet and is given an internal IP. Either way it's using a non-local IP.)
Ok, I've updated the code to use a unix socket and updated the description. I've tested this by going to Services -> Catalog, choosing "Order" for a service. Prior to this code change, I would get timeout errors. With this code change, I can get the dropdown populated.
@jrafanie I don't think we can use static names. There could be multiple instances of the engine running on a single appliance, and each one would need a dedicated connection name.
@jrafanie Plus you have methods that can invoke other methods, each method would need a dedicated connection name.
@mkanoor see the in-line comment above, I think it addresses your concern
One second... I just got this while testing David's original issue:
[----] E, [2020-03-13T15:40:35.536582 #77390:3fdaf18cd3e8] ERROR -- : Method STDERR: The following error occurred during inline method preamble evaluation:
[----] E, [2020-03-13T15:40:35.536745 #77390:3fdaf18cd3e8] ERROR -- : Method STDERR: ArgumentError: too long unix socket path (111bytes given but 104bytes max)
Ok, I shrunk the name to automation_client to match the length of the existing automation_engine used for the server.
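For anyone hitting the same ArgumentError, here is a minimal sketch of a pre-flight length check. The helper name and constant are hypothetical, and the 104-byte limit is simply the one reported in the error above (it is platform dependent; Linux typically allows 108 bytes).

```ruby
# Assumption: 104 bytes is the limit from the error message above (macOS).
# Other platforms may differ, so the limit is overridable.
MAX_UNIX_SOCKET_PATH_BYTES = 104

# Hypothetical helper: true when the path is short enough to bind as a unix socket.
def socket_path_fits?(path, limit = MAX_UNIX_SOCKET_PATH_BYTES)
  path.bytesize <= limit
end

puts socket_path_fits?("/tmp/automation_client")
```

The check uses bytesize rather than length because the limit is in bytes, not characters.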
I tested this a few more times and haven't seen any errors.
@miq-bot rm_label wip
@miq-bot add_label jansa/yes?
@jrafanie I was thinking the right answer was going to be switching from verizon. But I guess this approach works as well.
Dropped these changes on an appliance I used yesterday for provisioning Services and VMs. I ran simultaneous service and vm provisioning as well as retirement. Everything looks good.
Thanks.
Jansa backport details:
$ git log -1
commit 8026eaa398a345b10167619a4dfc138fde624013
Author: Greg McCullough <gmccullo@redhat.com>
Date: Wed Mar 18 19:47:29 2020 -0400
Merge pull request #431 from jrafanie/force_client_drb_connection_locally
Explicitly pass a local URI for our client's DRb.start_service
(cherry picked from commit 3b71688b4eff4f6ba2741168372cc14dbd541a98)
While testing some workspace instantiation at home, I was seeing this in the logs:
Note that one side reports the drbunix:///var/folders... unix socket, while the other reports a timeout on the druby://92.242... TCP socket. Weird.
Using Verizon FiOS DNS at home, DRb somehow resolves DRb.start_service to a remote DRb service. I believe this happens in the code linked below: we're not passing a URI, so DRb has no hostname from the URI and tries to resolve the local hostname instead: https://github.com/ruby/ruby/blob/v2_6_5/lib/drb/drb.rb#L879-L884
Note, this is the IP address of verizon's DNS assistance program, as mentioned here: https://askubuntu.com/questions/587895/why-is-the-ip-92-242-140-21-connecting-to-one-of-my-ports-is-it-malware#comment1583793_587954
We can avoid this entirely by explicitly telling DRb to use a local URI for our DRb client:
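As a minimal sketch of the idea (not necessarily the exact change in this PR), the client can pass a drbunix URI to DRb.start_service so that no hostname lookup is involved. The helper name and the temporary directory are assumptions for illustration; automation_client is the socket name mentioned in the discussion.

```ruby
require "drb"
require "drb/unix" # registers the drbunix:// protocol
require "tmpdir"

# Hypothetical helper: start the client-side DRb service on a unix socket
# inside the given directory, so DRb never resolves the local hostname.
def start_local_drb_client(dir)
  socket_path = File.join(dir, "automation_client")
  DRb.start_service("drbunix://#{socket_path}")
end

Dir.mktmpdir("drb") do |dir|
  service = start_local_drb_client(dir)
  puts service.uri # a drbunix: URI instead of druby://<hostname>:<port>
  service.stop_service
end
```

Keeping the socket name short matters here, since the full path has to fit within the platform's unix socket path limit.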
We've seen this weird behavior in the past, where our server was using unix sockets and the client was failing while trying to access a TCP DRb server. The never-merged solution at the time was to remove the TCP socket option from the server, but I believe the problem might actually have been what you see above: the DRb client also starts a DRb server to talk to the unix socket DRb server.
https://github.com/ManageIQ/manageiq-automation_engine/pull/234
Thanks @agrare for helping me debug this.