fusioninventory / fusioninventory-agent

FusionInventory Agent
http://fusioninventory.org/
GNU General Public License v2.0

If the agent cannot communicate with the server, it stops responding and never recovers. #915

Closed masakazuwatanabe closed 2 months ago

masakazuwatanabe commented 3 years ago

Hello. I have installed FusionInventory Agent on Windows 10 and am trying to manage it with GLPI, but I have run into a problem.

When the agent sends information to the GLPI server and the transfer stalls because it cannot communicate, the agent does not seem to recover.

In my case, it seems to occur when the "server" setting contains a host name that DNS cannot resolve. (HTTP stops responding in that state, and it does not recover even after communication is restored.)

[Mon May 17 12:19:45 2021] [info] sending prolog request to server0

When I mapped the IP address and host name in the hosts file, the following was written to the log and the run ended with an error, as expected.

[Mon May 17 12:19:02 2021] [error] [http client] communication error: 500 Can't connect to example.com:443
[Mon May 17 12:19:02 2021] [error] No answer from server at https://example.com/plugins/fusioninventory/

I am currently working from home because of COVID-19, so I connect my home to the office over a VPN.

In my environment, OpenVPN pushes the office DNS server when I connect. The GLPI server that receives the agent data is also resolved through this office DNS.

Since I work from home, my computer often goes to sleep, and the connection drops each time. As a result, this problem occurs every day, and the agent does not recover until it is restarted.

Given the current COVID-19 situation, other people may run into similar problems.

g-bougard commented 3 years ago

Hi @masakazuwatanabe I think you're misunderstanding a few things. The agent is an HTTP client for the GLPI server. If DNS is not working, the agent normally won't be able to reach the server. This is a network issue, not an agent issue. Your log excerpt shows "example.com"; I hope you're not really using this DNS name (and you should have told us you're not). These lines only report that the agent can't reach the server.

Did you even try to open a browser to access your GLPI server? Here we use OpenVPN too and we don't have such issues.

Anyway, after a communication error the agent will retry soon, but it doubles the retry delay after each new communication error, up to a maximum, so don't expect it to retry as soon as your VPN is up. If you close your VPN before the agent tries to contact the server again, the attempt will fail again. This is by design: from the agent's point of view, we have no clue why we can't contact the server.
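
To make the retry behaviour described above concrete, here is a minimal Python sketch of a delay that doubles after each failure and is capped at a maximum. It is an illustration only, not the agent's Perl code: the function names, the 60-second starting delay and the 3600-second cap are all assumptions chosen for the example.

```python
def next_delay(current, initial, maximum):
    """Return the delay to wait after one more failed attempt.

    `current` is the previous delay (0 means "no failure yet").
    All values are in seconds and are assumptions for this sketch.
    """
    if current <= 0:
        return initial
    # Double the previous delay, but never exceed the cap.
    return min(current * 2, maximum)


def backoff_schedule(failures, initial=60.0, maximum=3600.0):
    """List the successive delays for a run of consecutive failures."""
    delays = []
    delay = 0.0
    for _ in range(failures):
        delay = next_delay(delay, initial, maximum)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    # Print the waiting times for eight consecutive failures.
    for attempt, delay in enumerate(backoff_schedule(8), start=1):
        print(f"failure {attempt}: wait {delay:.0f} s before retrying")
```

With a schedule like this, a VPN session that opens and closes between two attempts is simply missed, which is the situation described above.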

To be sure the agent is just missing the window during which the VPN is open, you may want to enable debug level 2 and check that there is no SSL issue.

masakazuwatanabe commented 3 years ago

Sorry. I wrote example.com only as a placeholder. In the real logs, that part contains the actual host name I am using.

I understand that it is a communication problem. However, when the agent cannot reach the DNS server, it does not process anything further: it hangs and never retries.

At that point, the debug2 log output stops in the following state, and no error appears afterwards.

[Mon May 17 16:24:49 2021] [info] sending prolog request to server0
[Mon May 17 16:24:49 2021] [debug2] [http client] sending message:
 <?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
  <DEVICEID>device-2021-05-14-12-42-06</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>

Also, at this time, even if I access "localhost: 62354" with the Windows 10 browser that contains the Agent, no response is returned and nothing is displayed on the screen.

If the agent reported an error and retried, that would be fine, because the update would eventually succeed. Would it be possible to make the agent raise an error in this case?

g-bougard commented 3 years ago

... the agent will not execute any further processing and will be stopped, and it will not be retried.

Do you mean the service is really stopped ? Did you check the service manager ?

Also, at this time, even if I access "localhost: 62354" with the Windows 10 browser that contains the Agent, no response is returned and nothing is displayed on the screen.

This could still be a firewall issue, but it can also happen if the agent has crashed (in that case the service should also show as "stopped" in the service manager) or is blocked for any reason. But honestly, you must give us more information so we can understand what is really happening in your case: you would be the first to encounter such a problem, and a lot of people are using the agent on Windows 10 without it.

One thing could help if your issue makes the agent hit a Perl error. Look for the fusioninventory-win32-service.rc.sample file in the perl\bin subfolder of your agent installation folder and rename it to remove the trailing .sample. Make sure you have a logs subfolder in your agent installation folder, then restart the service. You should see 2 files (stderr.txt and stdout.txt) appear in the logs folder. Then try the process again: while not connected to your VPN, open the localhost:62354 interface and click on "run now", then check all the log files (including stderr.txt and stdout.txt). If everything seems OK, connect to your VPN and try "run now" again. Report any error found in stderr.txt or stdout.txt.

masakazuwatanabe commented 3 years ago

Do you mean the service is really stopped ? Did you check the service manager ?

The status of Fusion Inventory Agent on the service manager is running.

One thing could help in the case your issue makes the agent reaching a perl error case. Check to find the fusioninventory-win32-service.rc.sample file into the perl\bin subfolder of your agent installation folder. Rename this file to just remove the .sample last part. Make sure you have a logs subfolder into your agent installation folder. And then restart the service. You should see appearing 2 files (stderr.txt and stdout.txt) into the logs folder.

I tried to run it.

test

Case not connected to VPN

Then try again the process : while not connected to your vpn, access the localhost:62354 interface and click on "run now", check all the log file (including stderr.txt and stdout.txt), if everything seems ok,

When I am not connected to the VPN, no DNS server is pushed from it, so the name of the destination GLPI server cannot be resolved. This is the case in which the problem occurs.

fusioninventory-agent.log

if it cannot communicate with DNS, the agent will not execute any further processing and will be stopped, and it will not be retried.

[Mon May 17 17:37:37 2021][debug2] [http client] sending message:
<?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
<DEVICEID>device-2021-05-14-12-42-06</DEVICEID>
<QUERY>PROLOG</QUERY>
<TOKEN>12345678</TOKEN>
</REQUEST>

Service Manager

localhost: 62354

Also, at this time, even if I access "localhost: 62354" with the Windows 10 browser that contains the Agent, no response is returned and nothing is displayed on the screen.

stdout.txt

Mon May 17 17:36:19 2021: BEGIN stdout.txt

stderr.txt

There is an error in the place that looks like log output, but I think this is because my environment is Japanese.

Mon May 17 17:36:19 2021: BEGIN stderr.txt
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.

Case connected to VPN

if everything seems ok, then connect to your vpn and try to "run now" again. Report any error found in stderr.txt or stdout.txt files.

When I am connected to the VPN, the name of the GLPI server resolves, so everything works fine.

The following is a case where I connected to the VPN and then restarted the FusionInventory Agent, without any problems.

fusioninventory-agent.log

No problem.

Service Manager

stdout.txt

Mon May 17 17:54:08 2021: BEGIN stdout.txt

stderr.txt

There is an error in the place that looks like log output, but I think this is because my environment is Japanese.

Mon May 17 17:54:08 2021: BEGIN stderr.txt
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.
Wide character in print at C:\Program Files\FusionInventory-Agent/perl/agent/FusionInventory/Agent/Logger/File.pm line 58.
g-bougard commented 3 years ago

Yes, the "Wide character in print" error is not critical and related to your environment. As the service is still running, we don't have a crash issue.

Also, at this time, even if I access "localhost: 62354" with the Windows 10 browser that contains the Agent, no response is returned and nothing is displayed on the screen.

Here I'm scared by "Windows 10 browser": did you try another browser, Firefox or Chrome? I'm also scared by "localhost: 62354", by the space in the middle indeed. Are you really trying this URL: http://localhost:62354/ ? And you didn't say whether you managed to reach this URL at any moment, so nothing tells me you don't have a firewall issue, and I still don't see why you think the agent is "blocked".

masakazuwatanabe commented 3 years ago

Here I'm scared by "Windows 10 browser", did you try another browser FF or Chrome ?

Chrome, FF, Edge are all in the same state.

But I'm also scared by "localhost: 62354"... by the space in the middle indeed. Are you really trying this URL:

yes.

The first time I access it, "Force an Inventory" is displayed and I can press it.

The next screen, after I press it, shows "OK" and a "Back" link.

I pressed "Back" at that point, but the browser does not respond after that.

And you didn't tell if you managed to reach this URL at any moment so nothing is telling me you don't have a firewall issue and so I still don't see why you're thinking the agent is "blocked".

sorry. In this state, the log output was stopped as mentioned above, so I guessed that way.

[Mon May 17 17:37:37 2021][debug2] [http client] sending message:
 <?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
  <DEVICEID>device-2021-05-14-12-42-06</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
masakazuwatanabe commented 3 years ago

It's a Japanese environment. Screenshots for Chrome, FF and Edge were attached here.

masakazuwatanabe commented 3 years ago

There is a display saying CONNECTION REFUSED.

g-bougard commented 3 years ago

Okay, thanks, that's clear enough now ;-) You can't assume the agent is blocked just because no more lines appear in the log. But yes, it does seem to be blocked if the service is running and you get a "connection refused" on the local HTTP interface.
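
A quick way to tell a "connection refused" apart from a request that simply hangs is to attempt a raw TCP connection with a short timeout, without going through a browser. The Python sketch below is only an illustration: `local_port_state` is a name made up for this example, and the default port 62354 is the agent HTTP port mentioned in this thread.

```python
import socket


def local_port_state(host="127.0.0.1", port=62354, timeout=3.0):
    """Classify one TCP connection attempt.

    Returns "open" if something accepted the connection, "refused" if
    the operating system rejected it (nothing is listening), and
    "no-answer" for a timeout or any other socket error.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # socket.timeout is a subclass of OSError, so timeouts land here.
        return "no-answer"


if __name__ == "__main__":
    print(local_port_state())
```

Note that "open" only shows that the port accepts connections; a listener whose worker is stuck can still accept a connection and then never answer the HTTP request.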

Then do you see something in the stderr.txt file when the agent is blocked and you stop the service (don't restart the service before checking the stderr.txt file or you may miss something) ?

masakazuwatanabe commented 3 years ago

stop the service

stderr.txt

Mon May 17 18:52:45 2021: END stderr.txt
Perl exited with active threads:
    1 running and unjoined
    0 finished and unjoined
    1 running and detached

stdout.txt

Mon May 17 18:50:42 2021: BEGIN stdout.txt
Mon May 17 18:52:45 2021: END stdout.txt

fusioninventory-agent.log

There is no additional output.

g-bougard commented 3 years ago

Well, this means the agent thread is probably really blocked in some way. Now I need to reproduce this myself, but reproducing your setup could be really hard. Can you tell me which OpenVPN client and version you're using? Luckily I am still using OpenVPN here.

As a workaround, you could reinstall the agent using a scheduled task setup instead, which can guarantee that it runs at some moment while you're connected to the VPN. For example, running the agent every 30 minutes or every hour can be sufficient. In that case, don't expect to have access to the local HTTP interface: it is only available when the agent runs as a service.

masakazuwatanabe commented 3 years ago

OpenVPN Client

version

https://openvpn.net/community-downloads/
openvpn-install-2.4.9-I601-Win10.exe
g-bougard commented 3 years ago

Hi @masakazuwatanabe I ran a test on a laptop. The agent starts and I connect the VPN to our OpenVPN server with the 2.4.9 client. I force the inventory and everything runs as expected. I let the laptop go into sleep mode and then wake it up. Here the VPN reconnects automatically, and I can still force the inventory and it runs as expected. I also tried different scenarios with the VPN not connected. I see errors because I can't access GLPI, but everything goes as expected after I start the VPN. So I can't reproduce the problem with the information you gave.

One question: which tasks are enabled on your agent? Can you confirm you're testing with FusionInventory Agent v2.6? Did you try a more recent OpenVPN client? 2.4.9 is already one year old; 2.4.11 and even 2.5.1 are available.

masakazuwatanabe commented 3 years ago

Hi @g-bougard

Answer

which tasks are enabled on your agent?

Some tasks on the GLPI side are enabled.

  1. Package deploy: checks the registry, copies an exe and a bat file, then runs the bat file. It has already run on the target client, so it now finishes at the registry check.
  2. Network Discovery: I use an agent that resides on another Linux server.
  3. Network Inventory (SNMP): I use an agent that resides on another Linux server.

I tried after disabling all the tasks, but the result is the same.

Can you confirm you're trying with FusionInventory Agent v2.6?

It is v2.6. The GLPI display when the update is successful also shows v2.6.

Useragent FusionInventory-Agent_v2.6

Did you try with more recent OpenVPN client? 2.4.9 is still one year old, 2.4.11 and even 2.5.1 are available.

I uninstalled the existing client, installed OpenVPN-2.5.2-I601-amd64.msi and tried again. The result was the same.

Other

Value pushed by OpenVPN server

Route and DNS are pushed from the VPN server.

2021-06-16 00:00:40 PUSH: Received control message:
  'PUSH_REPLY, route 10.0.0.0 255.0.0.0, route 172.16.0.0 255.255.0.0, route 192.168.0.0 255.255.0.0,
  dhcp-option DNS 192.168.2.11, dhcp-option DNS 192.168.2.12,
  route-gateway 172.16.0.2, ping 10, ping-restart 120, ifconfig 172.16.0.148 255.255.240.0, peer-id 0, cipher AES-256-GCM

Agent settings

These are the registry values of my client agent. (The host name is masked with asterisks. It is actually the correct host name, but one that can only be resolved by the intranet DNS pushed from the VPN server.)

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\FusionInventory-Agent]
"backend-collect-timeout"="180"
"ca-cert-dir"=""
"ca-cert-file"=""
"conf-reload-interval"="0"
"debug"="2"
"delaytime"="3600"
"html"="0"
"httpd-ip"="0.0.0.0"
"httpd-port"="62354"
"httpd-trust"="127.0.0.1/32,10.130.8.138/32"
"local"="C:\\FusionInventory Agent.xml"
"logfile"="C:\\Program Files\\FusionInventory-Agent\\logs\\fusioninventory-agent.log"
"logfile-maxsize"="16"
"logger"="File"
"no-category"=""
"no-httpd"="0"
"no-p2p"="0"
"no-ssl-check"="1"
"no-task"=""
"password"=""
"proxy"=""
"server"="https://glpi01.*.********.**.jp/plugins/fusioninventory/"
"scan-homedirs"="0"
"scan-profiles"="0"
"tag"=""
"tasks"=""
"timeout"="180"
"user"=""

Agent Log

[Mon May 17 17:37:37 2021][debug2] [http client] sending message:
 <?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
  <DEVICEID>device-2021-05-14-12-42-06</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>

When this problem occurs, the agent log stops after the output above.

g-bougard commented 3 years ago

Other

Value pushed by OpenVPN server

Route and DNS are pushed from the VPN server.

2021-06-16 00:00:40 PUSH: Received control message:
  'PUSH_REPLY, route 10.0.0.0 255.0.0.0, route 172.16.0.0 255.255.0.0, route 192.168.0.0 255.255.0.0,
  dhcp-option DNS 192.168.2.11, dhcp-option DNS 192.168.2.12,
  route-gateway 172.16.0.2, ping 10, ping-restart 120, ifconfig 172.16.0.148 255.255.240.0, peer-id 0, cipher AES-256-GCM

I have mostly the same kind of routing and DNS pushing. I see a few differences, but I don't think they change anything:

  • I only have one route, and I'm sure this changes nothing for your problem
  • between route-gateway and ping 10 I have the topology subnet option

Agent Log

[Mon May 17 17:37:37 2021][debug2] [http client] sending message:
 <?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
  <DEVICEID>device-2021-05-14-12-42-06</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>

Does this also happen when you're not connected to the VPN and you just restart the fusioninventory-agent service? (Possibly forcing the run via the HTTP interface.)

On my side I have the following sequence:

[Tue Jun 15 15:22:56 2021][info] sending prolog request to server0
[Tue Jun 15 15:22:56 2021][debug2] [http client] sending message:
 <?xml version="1.0" encoding="UTF-8" ?>
<REQUEST>
  <DEVICEID>pc-fusioninventory-2021-06-15-15-07-03</DEVICEID>
  <QUERY>PROLOG</QUERY>
  <TOKEN>12345678</TOKEN>
</REQUEST>
[Tue Jun 15 15:22:56 2021][error] [http client] communication error: 500 Can't connect to xxx.teclib.xxx (Unknown host. )
[Tue Jun 15 15:22:56 2021][error] No answer from server at http://xxx.teclib.xxx/plugins/fusioninventory

Then when not connected to your vpn, can you also try from an administrative console and from the agent installation folder to run:

fusioninventory-agent.bat --debug --debug --logger=stderr --server=https://glpi01.*.********.**.jp/plugins/fusioninventory/

The inventory must immediately fail with an unknown host error.
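
To check, independently of the agent, whether name resolution itself fails fast or hangs, the lookup can be timed on its own. The following Python sketch is an illustration under assumptions: `time_resolution` is a hypothetical helper, and `glpi.example.invalid` is a placeholder host name to be replaced with the real server name.

```python
import socket
import time


def time_resolution(hostname, resolver=socket.getaddrinfo):
    """Resolve `hostname` once and report the outcome and elapsed time.

    `resolver` defaults to the system resolver; it is a parameter only
    so the function can be exercised without network access.
    """
    start = time.monotonic()
    try:
        resolver(hostname, 443)
        outcome = "resolved"
    except socket.gaierror as exc:
        # This is the "unknown host" style failure.
        outcome = f"unresolved: {exc}"
    elapsed = time.monotonic() - start
    return outcome, elapsed


if __name__ == "__main__":
    # Placeholder name: replace it with the GLPI server host name.
    outcome, elapsed = time_resolution("glpi.example.invalid")
    print(f"{outcome} after {elapsed:.3f} s")
```

If this returns an "unresolved" result within a few seconds while the agent stays stuck, the blocking point is more likely in the HTTP client layer than in the system resolver.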

masakazuwatanabe commented 3 years ago

Does this also happen when you're not connected to the VPN and you just restart the fusioninventory-agent service ? (Eventually by forcing the run via the http interface)

The state is the same even if the fusioninventory-agent service is restarted.

  1. I trigger the problem. (The browser keeps trying to communicate and the request never completes. Log output also stops in the same state.)
  2. When I restart the fusioninventory-agent service, the browser's pending request ends and the page is displayed.
  3. When I immediately press the reload button in the browser, the problem occurs again: the browser keeps trying to communicate and the request never completes. (Log output also stops in the same state.)

Then when not connected to your vpn, can you also try from an administrative console and from the agent installation folder to run: The inventory must immediately fail with an unknown host error.

It is the same when run from the command line. The console output stops at the same point as the log, and I never reach the unknown host error.

g-bougard commented 3 years ago

Well, what to say? It seems your network stack is blocking the agent for some reason when you're not connected to the VPN. Do you eventually get an error after a while on the command-line run, in case some network timeout occurs? Can you post the ipconfig /all output when not connected to the VPN (obfuscating any sensitive data), to check whether you have any other exotic network configuration? Then, have you tried to reinstall the agent, just in case something went wrong since the installation? Finally, can you give glpi-agent a try from a nightly build, as it ships a slightly more recent Perl build? (Stop the fusioninventory-agent service, or even uninstall fusioninventory-agent, before using glpi-agent.)

masakazuwatanabe commented 3 years ago

Sorry for the late reply.

This time I tested after uninstalling both ESET Antivirus and OpenVPN. The state did not change.

Do you finally obtain an error after a while on the command line run if the case some network timeout are occurring?

The state does not change even after 1 hour or more. It does not time out.

Can you report the ipconfig /all output when not connected to the vpn? (obfuscating any sensible data) To check if you have any other exotic network config.

(The host name is masked with an asterisk.)

Active code page: 437

C:\Users\watanabe.DOMAIN.001>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : WATANABE-T480s
   Primary Dns Suffix  . . . . . . . : *****.***.jp
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No
   DNS Suffix Search List. . . . . . : *****.***.jp
                                       flets-east.jp
                                       iptvf.jp

Ethernet adapter イーサネット:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Intel(R) Ethernet Connection (4) I219-V
   Physical Address. . . . . . . . . : E8-6A-64-49-83-24
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Wireless LAN adapter ローカル エリア接続* 3:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter #3
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E4
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Wireless LAN adapter ローカル エリア接続* 12:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter #4
   Physical Address. . . . . . . . . : 52-76-AF-46-C1-E3
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes

Wireless LAN adapter Wi-Fi:

   Connection-specific DNS Suffix  . : flets-east.jp
   Description . . . . . . . . . . . : Intel(R) Dual Band Wireless-AC 8265
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E3
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv6 Address. . . . . . . . . . . : 2408:210:ae26:b200:71e1:988f:3eff:332b(Preferred)
   Temporary IPv6 Address. . . . . . : 2408:210:ae26:b200:2cd4:24e3:20a2:42e1(Preferred)
   Link-local IPv6 Address . . . . . : fe80::71e1:988f:3eff:332b%5(Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.11.3(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : 2021年6月21日 10:13:56
   Lease Expires . . . . . . . . . . : 2021年6月23日 10:13:56
   Default Gateway . . . . . . . . . : fe80::10ff:fe04:2060%5
                                       192.168.11.1
   DHCP Server . . . . . . . . . . . : 192.168.11.1
   DHCPv6 IAID . . . . . . . . . . . : 72382127
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-23-B7-41-C2-E8-6A-64-49-83-24
   DNS Servers . . . . . . . . . . . : 2404:1a8:7f01:b::3
                                       2404:1a8:7f01:a::3
                                       192.168.11.1
   NetBIOS over Tcpip. . . . . . . . : Enabled
   Connection-specific DNS Suffix Search List :
                                       flets-east.jp
                                       iptvf.jp

Ethernet adapter Bluetooth ネットワーク接続:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Bluetooth Device (Personal Area Network)
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E7
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Then have you tried to reinstall the agent, just in case something goes wrong since the installation?

I uninstalled the FusionInventory Agent and installed it again, but the state did not change.

Finally, can you give glpi-agent a try from a nightly build as it provides a little more recent perl build? (Stop fusioninventory-agent service or even uninstall fusioninventory-agent before using glpi-agent)

I tried using the following modules but the status did not change. https://nightly.glpi-project.org/glpi-agent/#windows-1-0-git3148a19b

g-bougard commented 3 years ago

Hi @masakazuwatanabe

Thank you for this complete feedback.

Honestly I still don't see how I can reproduce your problem.

Looking at your ipconfig output, I only see 2 things that really differ from my own config:

  1. you also have 2 IPv6 DNS servers on your Wi-Fi connection, where I only have 1 IPv4 one
  2. you have a "Primary Dns Suffix" and a "DNS Suffix Search List", where both are empty for me

These 2 differences will have an impact on DNS resolution, but I'm not convinced this is really linked to this issue, as I'm sure you can use an internet browser with these settings. But who knows, this may be an issue for a Perl module the agent uses.

Just to be accurate, does the problem also occur just after a reboot and before you connect to the VPN? I mean, can you reboot, making sure you don't connect to the VPN automatically, and run the command-line version to see if it gets stuck on the prolog query? Thank you.

masakazuwatanabe commented 3 years ago
  1. you also have 2 IPv6 DNS servers in your wifi connection where I only have 1 IPv4 one

It seems that IPv6 is enabled by default in Windows. I don't need IPv6, so I turned it off in the adapter settings before trying it. The display related to IPv6 has disappeared from ipconfig /all. But the status did not change.

2. you have a "Primary Dns Suffix" and a "DNS Suffix Search List" where the two are empty for me. These 2 differences will have an impact on dns resolution but I'm not convince this could really be linked to this issue as I'm sure you can use an internet browser with this settings. But who knows, this may be an issue for a perl module the agent uses.

"Dns Suffix" etc. are set because my Windows machine is joined to Active Directory. Just in case, I tried again after leaving Active Directory, but the state did not change.

Just to be accurate, does the problem also occurs even just after a reboot and before being connected to the vpn? I mean can you reboot being sure you didn't connect to the vpn automatically and run the commandline version to see if it becomes stuck on the prolog query? Thank you.

My OpenVPN settings do not connect automatically at startup by default. I manually make an OpenVPN connection after booting.

Therefore, FusionInventory Agent runs before the OpenVPN connection is made, and every time it hits the problem and gets stuck.

Below is the ipconfig /all output after uninstalling ESET Antivirus, uninstalling OpenVPN, and leaving Active Directory. (The state did not change.)

Active code page: 437

C:\Users\watanabe>ipconfig /all

Windows IP Configuration

   Host Name . . . . . . . . . . . . : WATANABE-T480s
   Primary Dns Suffix  . . . . . . . :
   Node Type . . . . . . . . . . . . : Hybrid
   IP Routing Enabled. . . . . . . . : No
   WINS Proxy Enabled. . . . . . . . : No

Wireless LAN adapter ローカル エリア接続* 3:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter #3
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E4
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Wireless LAN adapter ローカル エリア接続* 12:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Microsoft Wi-Fi Direct Virtual Adapter #4
   Physical Address. . . . . . . . . : 52-76-AF-46-C1-E3
   DHCP Enabled. . . . . . . . . . . : No
   Autoconfiguration Enabled . . . . : Yes

Ethernet adapter イーサネット:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Intel(R) Ethernet Connection (4) I219-V
   Physical Address. . . . . . . . . : E8-6A-64-49-83-24
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes

Wireless LAN adapter Wi-Fi:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Intel(R) Dual Band Wireless-AC 8265
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E3
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 192.168.11.3(Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : 2021年6月21日 19:20:44
   Lease Expires . . . . . . . . . . : 2021年6月23日 19:20:44
   Default Gateway . . . . . . . . . : 192.168.11.1
   DHCP Server . . . . . . . . . . . : 192.168.11.1
   DNS Servers . . . . . . . . . . . : 192.168.11.1
   NetBIOS over Tcpip. . . . . . . . : Enabled

Ethernet adapter Bluetooth ネットワーク接続:

   Media State . . . . . . . . . . . : Media disconnected
   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Bluetooth Device (Personal Area Network)
   Physical Address. . . . . . . . . : 50-76-AF-46-C1-E7
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
g-bougard commented 3 years ago

Thank you for these other tests but they just confirm what I suspected, the problem is not around IPv6 support.

Using the command-line run, can you try your URL with SSL disabled, that is, using the URL without the "s"? --server=http://glpi01.*.********.**.jp/plugins/fusioninventory/

This is to test whether the problem could be related to the bundled OpenSSL library: is the command run still stuck on the prolog request in that case?

masakazuwatanabe commented 3 years ago

Using the command line run, can you try your url disabling SSL attempt by using the url without the "s" ?

Yes, the command run always gets stuck on the prolog request.

fusioninventory-agent.bat --debug --debug --logger=stderr --server=http://glpi01.*.********.**.jp/plugins/fusioninventory/
g-bougard commented 3 years ago

So something is definitely blocking the agent in your context. Do other users in your domain have the same problem? If yes, are there some people without the problem? And if so, what could be different between your environment and theirs?

The idea is still to find a way to reproduce your problem. I know Windows platforms can sometimes become unstable when some software interactions cause weird side effects. It's clear to me that something goes wrong, but it's still not clear whether the agent is at fault. And honestly, I won't investigate further if I don't see how to reproduce it. Maybe you can set up a VirtualBox VM, or even a test PC, with the latest development Win10 ISO in US English and as little installed software as possible, and run the same tests.

masakazuwatanabe commented 3 years ago

Currently only I use it. (We plan to introduce it to a large number of other people in the future. I have confirmed it as a preliminary step.)

It may be a problem only in my environment, so I will try various methods to find the cause. If I can identify the cause, I will contact you.

Thank you for your advice. And thank you for spending your time for me.

masakazuwatanabe commented 3 years ago

I experimented by trial and error while reading the source code in my environment. The issue occurs in the code below.

https://github.com/fusioninventory/fusioninventory-agent/blob/e4192a24c2ef2f9f1d3bb354ebe6f20af230b8e6/lib/FusionInventory/Agent/HTTP/Client.pm#L89-L96

When I investigated this process, I found the following articles. https://www.perlmonks.org/?node_id=54528 https://stackoverflow.com/questions/73308/true-timeout-on-lwpuseragent-request-method (Similar articles can be found by searching Google for "LWP::UserAgent timeout alarm win32" and the like.)

This appears to be a long-standing issue with the combination of Perl's LWP::UserAgent and Win32.

  1. LWP::UserAgent cannot time out in some cases
  2. On Linux and similar systems, the usual remedy is to use alarm
  3. But on Win32, Perl's alarm doesn't work or is unstable

It looks like win32 needs a different approach to handling the target code.
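
For reference, the alarm-based workaround from point 2 can be sketched as below. This is only an illustration: the helper name with_timeout is hypothetical and not part of the agent, and a select-based sleep stands in for a blocking network call.

```perl
use strict;
use warnings;

# Hypothetical helper (not agent code): run $code, but give up after $seconds.
# Returns (1, $result) on success and (0, 'timeout') if the alarm fired.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;    # finished in time: cancel the pending alarm
        1;
    };
    my $err = $@;
    alarm 0;        # make sure no alarm is left pending if $code died
    return (1, $result) if $ok;
    return (0, 'timeout') if $err eq "timeout\n";
    die $err;       # re-throw errors unrelated to the timeout
}

# Example: a call that blocks for 3 seconds under a 1 second limit.
my ($done, $value) = with_timeout(1, sub { select(undef, undef, undef, 3); 'finished' });
print $done ? "completed: $value\n" : "gave up: $value\n";
```

On Linux the wrapper gives up after about one second. According to perlport, alarm on Win32 is emulated with timers that cannot interrupt blocking system calls, so the same wrapper is not expected to help there.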

g-bougard commented 3 years ago

Hi @masakazuwatanabe, I appreciate your investigation, but the pages you're pointing to are from 2001 and 2008. The first was probably related to perl 5.5 and the second maybe to perl 5.8, and LWP::UserAgent itself has changed a lot since then.

We can easily test for such a problem. Here is a test I just ran on nodns.fake.com, which doesn't resolve:

Case 1: no dns resolving

C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET http://nodns.fake.com/
$VAR1 = bless( {
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'GET',
                                        '_uri' => bless( do{\(my $o = 'http://nodns.fake.com/')}, 'URI::http' ),
                                        '_content' => ''
                                      }, 'HTTP::Request' ),
                 '_msg' => 'Can\'t connect to nodns.fake.com:80 (Hôte inconnu. )',
                 '_headers' => bless( {
                                        '::std_case' => {
                                                          'client-date' => 'Client-Date',
                                                          'client-warning' => 'Client-Warning'
                                                        },
                                        'client-warning' => 'Internal response',
                                        'client-date' => 'Tue, 22 Jun 2021 12:43:21 GMT',
                                        'content-type' => 'text/plain'
                                      }, 'HTTP::Headers' ),
                 '_rc' => 500,
                 '_content' => 'Can\'t connect to nodns.fake.com:80 (Hôte inconnu. )

Hôte inconnu.  at C:/Program Files/FusionInventory-Agent/perl/vendor/lib/LWP/Protocol/http.pm line 50.
'
               }, 'HTTP::Response' );
TIME: 0.130s
C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe --version

This is perl 5, version 32, subversion 0 (v5.32.0) built for MSWin32-x64-multi-thread

Copyright 1987-2020, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.

Just to explain this output:

Case 2: same case but I edited C:\Windows\system32\drivers\etc\hosts so nodns.fake.com is resolved to an ip but it can't be reached. For me I defined a line 192.168.2.144 nodns.fake.com, so it resolves to a faked private ip which can't be routed

C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET http://nodns.fake.com/
$VAR1 = bless( {
                 '_rc' => 500,
                 '_headers' => bless( {
                                        'content-type' => 'text/plain',
                                        'client-date' => 'Tue, 22 Jun 2021 13:04:30 GMT',
                                        '::std_case' => {
                                                          'client-warning' => 'Client-Warning',
                                                          'client-date' => 'Client-Date'
                                                        },
                                        'client-warning' => 'Internal response'
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'Can\'t connect to nodns.fake.com:80 (Une tentative de connexion a échoué car le parti connecté n'a pas répondu convenablement au-delà d'une certaine durée ou une connexion établie a échoué car l'hôte de connexion n'a pas répondu.)',
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_uri' => bless( do{\(my $o = 'http://nodns.fake.com/')}, 'URI::http' ),
                                        '_content' => '',
                                        '_method' => 'GET'
                                      }, 'HTTP::Request' ),
                 '_content' => 'Can\'t connect to nodns.fake.com:80 (Une tentative de connexion a échoué car le parti connecté n'a pas répondu convenablement au-delà d'une certaine durée ou une connexion établie a échoué car l'hôte de connexion n'a pas répondu.)

Une tentative de connexion a échoué car le parti connecté n'a pas répondu convenablement au-delà d'une certaine durée ou une connexion établie a échoué car l'hôte de connexion n'a pas répondu. at C:/Program Files/FusionInventory-Agent/perl/vendor/lib/LWP/Protocol/http.pm line 50.
'
               }, 'HTTP::Response' );
TIME: 5.104s

As you can see here, I got a timeout after 5s, which is what I set with LWP::UserAgent->new(timeout => 5). This is what I expected, and it shows that LWP::UserAgent in perl 5.32 should handle timeouts properly on Windows.

Case 3: replacing http://nodns.test.com/ with https://github.com, I'm getting an answer in 2.554s as I could expect.

Do you have the same results in your environment ?

masakazuwatanabe commented 3 years ago

Hi g-bougard. Thank you for your reply. I tried the same thing.

Case 1: no dns resolving (http://nodns.fake.com/)

(The result is the same with and without the VPN connected.) The cause of the error and the line of code raising it seem to differ from your example.

C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET http://nodns.fake.com/
$VAR1 = bless( {
                 '_content' => 'write failed: ソ\ケットが接続されていないか、sendto 呼び出しを使ってデータグラム ソ\ケットで送信するときにアドレスが指定され ていないため、データの送受信を要求することは禁じられています。 at C:/Program Files/FusionInventory-Agent/perl/vendor/lib/LWP/Protocol/http.pm line 295.
',
                 '_request' => bless( {
                                        '_uri' => bless( do{\(my $o = 'http://nodns.fake.com/')}, 'URI::http' ),
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_method' => 'GET',
                                        '_content' => ''
                                      }, 'HTTP::Request' ),
                 '_headers' => bless( {
                                        'content-type' => 'text/plain',
                                        'client-warning' => 'Internal response',
                                        'client-date' => 'Tue, 22 Jun 2021 14:31:30 GMT',
                                        '::std_case' => {
                                                          'client-warning' => 'Client-Warning',
                                                          'client-date' => 'Client-Date'
                                                        }
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'write failed: ソ\ケットが接続されていないか、sendto 呼び出しを使ってデータグラム ソ\ケットで送信するときにアドレスが指定されてい ないため、データの送受信を要求することは禁じられています。',
                 '_rc' => 500
               }, 'HTTP::Response' );
TIME: 0.126s

_msg is output in Japanese because my environment is Japanese. In an English environment, you should get the following error message.

A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied.

Case 2: Resolved in the hosts file to a faked IP

same case but I edited C:\Windows\system32\drivers\etc\hosts so nodns.fake.com is resolved to an ip but it can't be reached. For me I defined a line 192.168.2.144 nodns.fake.com, so it resolves to a faked private ip which can't be routed

Same as your Case 2 example.

# hosts
192.168.2.144 nodns.fake.com
C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET http://nodns.fake.com/
$VAR1 = bless( {
                 '_rc' => 500,
                 '_request' => bless( {
                                        '_method' => 'GET',
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_uri' => bless( do{\(my $o = 'http://nodns.fake.com/')}, 'URI::http' ),
                                        '_content' => ''
                                      }, 'HTTP::Request' ),
                 '_content' => 'Can\'t connect to nodns.fake.com:80 (接続済みの呼び出し先が一定の時間を過ぎても正しく応 答しなかったため、接続できませんでした。または接続済みのホストが応答しなかったため、確立された接続は失敗しました。)

接続済みの呼び出し先が一定の時間を過ぎても正しく応答しなかったため、接続できませんでした。または接続済みのホストが応答しなかったため、確立された接続は失敗しました。 at C:/Program Files/FusionInventory-Agent/perl/vendor/lib/LWP/Protocol/http.pm line 50.
',
                 '_msg' => 'Can\'t connect to nodns.fake.com:80 (接続済みの呼び出し先が一定の時間を過ぎても正しく応答し なかったため、接続できませんでした。または接続済みのホストが応答しなかったため、確立された接続は失敗しました。)',
                 '_headers' => bless( {
                                        'client-date' => 'Tue, 22 Jun 2021 14:50:21 GMT',
                                        'content-type' => 'text/plain',
                                        'client-warning' => 'Internal response',
                                        '::std_case' => {
                                                          'client-date' => 'Client-Date',
                                                          'client-warning' => 'Client-Warning'
                                                        }
                                      }, 'HTTP::Headers' )
               }, 'HTTP::Response' );
TIME: 5.171s

_msg is output in Japanese because my environment is Japanese. The content is the same as your Case 2 example.

Case 3:

replacing http://nodns.test.com/ with https://github.com, I'm getting an answer in 2.554s as I could expect.

Case 3: http://nodns.test.com/

I get an answer from http://nodns.test.com/. But is there a load error in an SSL-related module when connecting to the https redirect target?

C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET http://nodns.test.com/
$VAR1 = bless( {
                 '_content' => 'LWP will support https URLs if the LWP::Protocol::https module
is installed.
',
                 '_rc' => 501,
                 '_headers' => bless( {
                                        'client-warning' => 'Internal response',
                                        'content-type' => 'text/plain',
                                        '::std_case' => {
                                                          'client-date' => 'Client-Date',
                                                          'client-warning' => 'Client-Warning'
                                                        },
                                        'client-date' => 'Tue, 22 Jun 2021 15:11:54 GMT'
                                      }, 'HTTP::Headers' ),
                 '_request' => bless( {
                                        '_protocol' => undef,
                                        '_content' => '',
                                        '_uri' => bless( do{\(my $o = 'https://www.test.com')}, 'URI::https' ),
                                        '_method' => 'GET',
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' )
                                      }, 'HTTP::Request' ),
                 '_msg' => 'Can\'t load \'C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay/SSLeay.xs.dll\' for module Net::SSLeay: load_file:指定されたモジュールが見つかりません。 (LWP::Protocol::https not installed)',
                 '_previous' => bless( {
                                         '_rc' => '302',
                                         '_headers' => bless( {
                                                                'title' => '302 Found',
                                                                'location' => 'https://www.test.com',
                                                                'client-response-num' => 1,
                                                                'date' => 'Tue, 22 Jun 2021 15:11:54 GMT',
                                                                'client-date' => 'Tue, 22 Jun 2021 15:11:54 GMT',
                                                                'connection' => 'close',
                                                                'server' => 'nginx/1.18.0',
                                                                'x-dis-request-id' => '7a10691aa4346c5eaff524d435bc5d5c',
                                                                'client-transfer-encoding' => [
                                                                                                'chunked'
                                                                                              ],
                                                                '::std_case' => {
                                                                                  'base' => 'Base',
                                                                                  'client-peer' => 'Client-Peer',
                                                                                  'x-dis-request-id' => 'X-DIS-Request-ID',
                                                                                  'title' => 'Title',
                                                                                  'client-transfer-encoding' => 'Client-Transfer-Encoding',
                                                                                  'content-base' => 'Content-Base',
                                                                                  'client-date' => 'Client-Date',
                                                                                  'client-response-num' => 'Client-Response-Num'
                                                                                },
                                                                'content-type' => 'text/html; charset=UTF-8',
                                                                'client-peer' => '69.172.200.109:80'
                                                              }, 'HTTP::Headers' ),
                                         '_msg' => 'Moved Temporarily',
                                         '_request' => bless( {
                                                                '_content' => '',
                                                                '_uri_canonical' => bless( do{\(my $o = 'http://nodns.test.com/')}, 'URI::http' ),
                                                                '_uri' => $VAR1->{'_previous'}{'_request'}{'_uri_canonical'},
                                                                '_method' => 'GET',
                                                                '_headers' => bless( {
                                                                                       'user-agent' => 'libwww-perl/6.46'
                                                                                     }, 'HTTP::Headers' )
                                                              }, 'HTTP::Request' ),
                                         '_protocol' => 'HTTP/1.1',
                                         '_content' => '<html><head><title>302 Found</title></head><body bgcolor=\'white\'><center><h1>302 Found</h1><h2>Object moved to <a href=\'https://www.test.com\'>here</a>.</h2></center><hr><center>DOSarrest Internet Security</center></body></html>
'
                                       }, 'HTTP::Response' )
               }, 'HTTP::Response' );
TIME: 0.701s

Case 3: https://github.com

Is there a load error for ssl related modules when connecting to https?

C:\Program Files\FusionInventory-Agent>perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET https://github.com
$VAR1 = bless( {
                 '_headers' => bless( {
                                        'content-type' => 'text/plain',
                                        'client-date' => 'Tue, 22 Jun 2021 15:14:54 GMT',
                                        'client-warning' => 'Internal response',
                                        '::std_case' => {
                                                          'client-warning' => 'Client-Warning',
                                                          'client-date' => 'Client-Date'
                                                        }
                                      }, 'HTTP::Headers' ),
                 '_msg' => 'Can\'t load \'C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay/SSLeay.xs.dll\' for module Net::SSLeay: load_file:指定されたモジュールが見つかりません。 (LWP::Protocol::https not installed)',
                 '_content' => 'LWP will support https URLs if the LWP::Protocol::https module
is installed.
',
                 '_rc' => 501,
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_uri' => bless( do{\(my $o = 'https://github.com')}, 'URI::https' ),
                                        '_method' => 'GET',
                                        '_content' => ''
                                      }, 'HTTP::Request' )
               }, 'HTTP::Response' );
TIME: 0.077s

g-bougard commented 3 years ago

Hi @masakazuwatanabe, sorry, I missed your comment. It seems "SSLeay.xs.dll" has disappeared or is blocked by something, probably an antivirus. In the end this may be the cause of your problem, as your URL is https and this file is required.

Can you check your antivirus activity or quarantine after the agent installation? Can you try reinstalling the agent with the antivirus disabled? Then check that C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay/SSLeay.xs.dll is there before re-enabling the antivirus, and run the test both before and after enabling it.

masakazuwatanabe commented 3 years ago

Hi g-bougard.

  1. Initial state
    • dir "C:\Program Files\FusionInventory-Agent\perl\vendor\lib\auto\Net\SSLeay\SSLeay.xs.dll"
    • SSLeay.xs.dll exists in place
  2. Run virus scan
    • No problem
  3. Uninstall FusionInventoryAgent
  4. Uninstall AntiVirus (ESET)
  5. Disable Microsoft Defender
  6. Reboot
  7. FusionInventoryAgent installation
  8. FusionInventoryAgent service stopped
  9. run cmd.exe (Administrator)
    • dir "C:\Program Files\FusionInventory-Agent\perl\vendor\lib\auto\Net\SSLeay\SSLeay.xs.dll"
    • SSLeay.xs.dll exists in place
  10. run
cd C:\Program Files\FusionInventory-Agent
perl\bin\fusioninventory-agent.exe -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET https://github.com

But the result did not change.

'_msg' => 'Can\'t load \'C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay/SSLeay.xs.dll\' for module Net::SSLeay: load_file:指定されたモジュールが見つかりません。 (LWP::Protocol::https not installed)',

Am I missing an entry in the PATH environment variable? Is there a path that is set when the agent starts as a service, or one that is set in your development environment?

g-bougard commented 3 years ago

Indeed, the problem may be in loading another library that SSLeay.xs.dll depends on.

Can you check if you have the following libraries under C:\Program Files\FusionInventory-Agent\perl\bin:

  1. libcrypto-1_1-x64__.dll
  2. libssl-1_1-x64__.dll

Can you run the following perl oneliner and give us the result ?

"C:\\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -e "use Net::SSLeay; print Net::SSLeay::SSLeay_version(0),' (', sprintf('0x%x',Net::SSLeay::SSLeay()),') installed with perl ',$^V;"

If you found the libcrypto (1) & libssl (2) DLLs, can you copy them manually to C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay and retry the previous test ? Does this fix the network problem ?

Finally you can also try the x86 (32 bits) version.

masakazuwatanabe commented 3 years ago

Can you check if you have the following libraries under C:\Program Files\FusionInventory-Agent\perl\bin:

  1. libcrypto-1_1-x64__.dll
  2. libssl-1_1-x64__.dll

I have both libcrypto-1_1-x64__.dll and libssl-1_1-x64__.dll.

Can you run the following perl oneliner and give us the result ?

C:\Program Files\FusionInventory-Agent>"C:\\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -e "use Net::SSLeay; print Net::SSLeay::SSLeay_version(0),' (', sprintf('0x%x',Net::SSLeay::SSLeay()),') installed with perl ',$^V;"
Can't load 'C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay/SSLeay.xs.dll' for module Net::SSLeay: load_file:指定されたモジュールが見つかりません。 at C:/Program Files/FusionInventory-Agent/perl/lib/DynaLoader.pm line 193.
  at -e line 1.
Compilation failed in require at -e line 1.
BEGIN failed--compilation aborted at -e line 1.

If you found the libcrypto (1) & libssl (2) DLLs, can you copy them manually to C:/Program Files/FusionInventory-Agent/perl/vendor/lib/auto/Net/SSLeay and retry the previous test ? Does this fix the network problem ?

copy /b "C:\Program Files\FusionInventory-Agent\perl\bin\libcrypto-1_1-x64__.dll" "C:\Program Files\FusionInventory-Agent\perl\vendor\lib\auto\Net\SSLeay\"
copy /b "C:\Program Files\FusionInventory-Agent\perl\bin\libssl-1_1-x64__.dll" "C:\Program Files\FusionInventory-Agent\perl\vendor\lib\auto\Net\SSLeay\"

The error and the output are unchanged. This fixes neither the previous test nor the network problem.

Finally you can also try the x86 (32 bits) version.

I uninstalled the 64-bit version and installed the 32-bit version. The error and the output are unchanged. This fixes neither the previous test nor the network problem.

g-bougard commented 3 years ago

Can you cd into C:\Program Files\FusionInventory-Agent\perl\bin and run the OpenSSL oneliner? Can you also report your PATH variable? I now suspect you have other software providing a libssl DLL that conflicts with the one the agent expects. Maybe you have an old Strawberry Perl installed? Your PATH variable will help to check that, and if we find what is installed, it will even tell us how to reproduce the problem.

masakazuwatanabe commented 3 years ago

Can you cd into C:\Program Files\FusionInventory-Agent\perl\bin and run the OpenSSL oneliner?

Is this what you meant? openssl does not seem to run.

Active code page: 437
C:\WINDOWS\system32>cd C:\Program Files\FusionInventory-Agent
C:\Program Files\FusionInventory-Agent>openssl
'openssl' is not recognized as an internal or external command,
operable program or batch file.
C:\Program Files\FusionInventory-Agent>cd perl\bin
C:\Program Files\FusionInventory-Agent\perl\bin>openssl
'openssl' is not recognized as an internal or external command,
operable program or batch file.

Can you also report your PATH variable? I now suspect you have other software providing a libssl DLL that conflicts with the one the agent expects. Maybe you have an old Strawberry Perl installed? Your PATH variable will help to check that, and if we find what is installed, it will even tell us how to reproduce the problem.

I don't have Strawberry Perl installed. I searched for perl.

c:\>dir /b /s perl.*
c:\Program Files\FusionInventory-Agent\perl
c:\Program Files\FusionInventory-Agent\perl\bin\perl.exe
c:\Program Files\FusionInventory-Agent\perl\lib\TAP\Parser\SourceHandler\Perl.pm
c:\Program Files\FusionInventory-Agent\perl\lib\unicore\lib\Perl
c:\Program Files\FusionInventory-Agent\perl\vendor\lib\Authen\SASL\Perl
c:\Program Files\FusionInventory-Agent\perl\vendor\lib\Authen\SASL\Perl.pm
c:\Program Files\FusionInventory-Agent\perl\vendor\lib\Digest\Perl

Output of PATH.

c:\>echo %path:;=&echo.%
C:\Program Files (x86)\Common Files\Oracle\Java\javapath
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\WINDOWS\System32\OpenSSH\
C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL
C:\Program Files\Intel\Intel(R) Management Engine Components\DAL
C:\Program Files\Intel\WiFi\bin\
C:\Program Files\Common Files\Intel\WirelessCommon\
C:\Users\watanabe.DOMAIN.001\AppData\Local\Microsoft\WindowsApps

I searched for libssl.

c:\>dir /b /s libssl*
c:\Program Files\FusionInventory-Agent\perl\bin\libssl-1_1-x64__.dll
c:\Program Files\FusionInventory-Agent\perl\vendor\lib\auto\Net\SSLeay\libssl-1_1-x64__.dll
c:\Program Files\Intel\Intel(R) Management Engine Components\iCLS\libssl-1_1-x64.dll
c:\Program Files\OpenVPN\bin\libssl-1_1-x64.dll
c:\Program Files (x86)\Intel\Intel(R) Management Engine Components\iCLS\libssl-1_1.dll
c:\Program Files (x86)\Microsoft Office\root\Office16\ODBC Drivers\Salesforce\lib\LibCurl32.DllA\OpenSSL32.DllA\libssl-1_1.dll
c:\Program Files (x86)\Microsoft Office\root\Office16\ODBC Drivers\Salesforce\lib\OpenSSL32.DllA\libssl-1_1.dll
c:\Users\Administrator.DOMAIN\AppData\Roaming\Zoom\bin\libssl-1_1.dll
c:\Users\watanabe\AppData\Local\Microsoft\OneDrive\21.099.0516.0003\libssl-1_1.dll
c:\Users\watanabe.DOMAIN.001\AppData\Local\Microsoft\OneDrive\21.099.0516.0003\libssl-1_1.dll
c:\Users\watanabe.DOMAIN.001\AppData\Roaming\Zoom\bin\libssl-1_1-x64.dll
c:\Windows\System32\DriverStore\FileRepository\iclsclient.inf_amd64_75ffca5eec865b4b\lib\libssl-1_1-x64.dll
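
The PATH listing and the libssl search can also be cross-checked with a short script. The sketch below is an illustration only: the helper name find_dlls_on_path and the file-name pattern are assumptions, and it looks only at PATH directories, not at every location Windows may search for a DLL.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list files in the given directories (by default the
# PATH directories) whose names match $pattern. The default pattern matches
# libssl and libcrypto DLLs, ignoring case.
sub find_dlls_on_path {
    my ($pattern, @dirs) = @_;
    $pattern ||= qr/^lib(?:ssl|crypto).*\.dll$/i;
    @dirs = File::Spec->path unless @dirs;
    my @found;
    for my $dir (@dirs) {
        opendir(my $dh, $dir) or next;    # skip missing or unreadable entries
        push @found, map { File::Spec->catfile($dir, $_) }
                     grep { $_ =~ $pattern } sort readdir($dh);
        closedir $dh;
    }
    return @found;
}

print "$_\n" for find_dlls_on_path();
```

Going by the two listings above, none of the libssl locations is itself a PATH directory, so for libssl this would be expected to report nothing from PATH.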

g-bougard commented 3 years ago

Intel(R) Management Engine Components

g-bougard commented 3 years ago

Can you cd into C:\Program Files\FusionInventory-Agent\perl\bin and run the OpenSSL oneliner?

Is this what you meant? openssl does not seem to run.

Sorry, I was not clear. I meant running the previous perl oneliner again to check the detected OpenSSL version:

"C:\\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -e "use Net::SSLeay; print Net::SSLeay::SSLeay_version(0),' (', sprintf('0x%x',Net::SSLeay::SSLeay()),') installed with perl ',$^V;"

masakazuwatanabe commented 3 years ago

I was able to run it!

C:\WINDOWS\system32>cd C:\Program Files\FusionInventory-Agent\perl\bin
C:\Program Files\FusionInventory-Agent\perl\bin>"C:\\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -e "use Net::SSLeay; print Net::SSLeay::SSLeay_version(0),' (', sprintf('0x%x',Net::SSLeay::SSLeay()),') installed with perl ',$^V;"
OpenSSL 1.1.1g  21 Apr 2020 (0x1010107f) installed with perl v5.32.0

Case 3: https://github.com

C:\Program Files\FusionInventory-Agent\perl\bin>"C:\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req); print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET https://github.com
$VAR1 = bless( {
                 '_request' => bless( {
                                        '_headers' => bless( {
                                                               'user-agent' => 'libwww-perl/6.46'
                                                             }, 'HTTP::Headers' ),
                                        '_uri' => bless( do{\(my $o = 'https://github.com')}, 'URI::https' ),
                                        '_method' => 'GET',
                                        '_content' => ''
                                      }, 'HTTP::Request' ),
                 '_msg' => 'Can\'t connect to github.com:443 (Bad file descriptor)',
                 '_headers' => bless( {
                                        'client-date' => 'Fri, 25 Jun 2021 09:20:16 GMT',
                                        'content-type' => 'text/plain',
                                        '::std_case' => {
                                                          'client-date' => 'Client-Date',
                                                          'client-warning' => 'Client-Warning'
                                                        },
                                        'client-warning' => 'Internal response'
                                      }, 'HTTP::Headers' ),
                 '_content' => 'Can\'t connect to github.com:443 (Bad file descriptor)

Bad file descriptor at C:/Program Files/FusionInventory-Agent/perl/vendor/lib/LWP/Protocol/http.pm line 50.
',
                 '_rc' => 500
               }, 'HTTP::Response' );
TIME: 0.236s

masakazuwatanabe commented 3 years ago

After adding "$ua->ssl_opts( verify_hostname => 0 );", I got a response from github.com.

"C:\Program Files\FusionInventory-Agent\perl\bin\fusioninventory-agent.exe" -MData::Dumper -MLWP::UserAgent -MHTTP::Request -e "use Time::HiRes qw(time); BEGIN { $t=time;} $ua = LWP::UserAgent->new(timeout => 5); $ua->ssl_opts( verify_hostname => 0 ); $req = HTTP::Request->new(@ARGV); $m = $ua->request($req);  print Dumper($m); END { print STDERR sprintf('TIME: %.3fs',time-$t);}" GET https://github.com
g-bougard commented 3 years ago

While googling for ways to list loaded DLLs, I found that we could try the ListDLLs tool from MS Sysinternals. Can you run it against the process ID of the blocked fusioninventory-agent when it is run from the command line?

masakazuwatanabe commented 3 years ago
  1. Enable the Fusion Inventory Agent service
  2. Access http://localhost:62354/ and click "Force an Inventory".
  3. fusioninventory-agent is blocked.
  4. Find out the process ID and run Listdlls64.exe
C:\ListDlls>Listdlls64.exe -r -v 13456 > output.txt

output.txt

g-bougard commented 3 years ago

Well, that seems good... The only thing I see suspicious is that:

0x0000000022990000  0x3b000   C:\Program Files\ESET\ESET Security\eamsi.dll
    Verified:   ESET
    Publisher:  ESET
    Description:    ESET Antimalware Scan Interface DLL
    Product:    ESET Security
    Version:    8.0.2026.0
    File version:   10.17.30.0
    Create time:    Thu Nov 05 23:09:49 2020

As this is the only DLL not related to the agent or Microsoft.

Do you think you can disable the antimalware scan interface during the tests, checking that this DLL doesn't appear in the listdlls output?

g-bougard commented 3 years ago

I'm testing ESET Internet Security to see whether I get some bad interactions, but the current version here is 14.2.10.0! Don't you need to upgrade your version?

masakazuwatanabe commented 3 years ago

As this is the only DLL not related to the agent or Microsoft. Do you think you can disable the antimalware scan interface during the tests, checking that this DLL doesn't appear in the listdlls output?

  1. Uninstall ESET
  2. Uninstall FusionInventoryAgent
  3. reboot
  4. Install FusionInventoryAgent
  5. Access http://localhost:62354/ and click "Force an Inventory".
  6. fusioninventory-agent is blocked.
  7. Find out the process ID and run Listdlls64.exe
    c:\ListDlls>Listdlls64.exe -r -v 8952 > output02.txt

    See output02.txt.

output02.txt

I'm testing ESET Internet Security to see whether I get some bad interactions, but the current version here is 14.2.10.0! Don't you need to upgrade your version?

I use "ESET Endpoint Antivirus" for business. I checked the distribution source, but the latest version is 8.0.2028.1, so I think there is no problem. (The installer shows 8.0.2028.1)

masakazuwatanabe commented 3 years ago

The DLL was still loaded when I only disabled it in the ESET settings, so I uninstalled ESET itself for proper testing.

g-bougard commented 3 years ago

The next step is probably to try ProcessMonitor to track agent activity before it blocks.

masakazuwatanabe commented 3 years ago

I used ProcessMonitor and found nothing special. In addition, it was difficult to understand the contents because there was a large amount of displayed information (most of which was information on normal OS behavior that was meaningless for investigating the cause).

masakazuwatanabe commented 3 years ago

I analyzed the source code, and it seems to behave as follows. In conclusion, I think it looks like a bug in LWP::UserAgent (on Win32).

masakazuwatanabe commented 3 years ago

Introduction

Normal operation (in my case, if OpenVPN is connected)

Blocking operation (when the name of the connection destination cannot be resolved, e.g. nodns.fake.com and others)

In this case the socket was created successfully, but no connection was made and nothing was written. If select is entered in that state, no event ever occurs.

LWP::UserAgent may not anticipate such a case. Or it may wrongly assume that connect is executed non-blocking even when the name cannot be resolved (perhaps on the assumption that a failed non-blocking connect shows up as an error in select?).

The behavior of the low-level communication layer seems to differ depending on the OS. On Linux, alarm can be used, so there may be no problem in the end. On Win32, alarm doesn't work as expected, so this may be a problem.
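
The point about select and alarm can be illustrated with a minimal sketch. It is not the code used by LWP::UserAgent or by the agent: the helper name wait_readable_until and the loopback sockets are assumptions made only for this illustration. The idea is to bound the wait with select's own timeout plus an overall deadline, so that "no event" is reported as a timeout instead of leaving the caller blocked, without relying on alarm. Whether this avoids the hang in the Win32 case described above has not been verified here.

```perl
use strict;
use warnings;
use IO::Socket::IP;
use Time::HiRes qw(time);

# Hypothetical helper (not part of LWP or the agent): wait until $sock is
# readable, but give up once $max_seconds have elapsed.
# Returns 'readable', 'error' or 'timeout'.
sub wait_readable_until {
    my ($sock, $max_seconds) = @_;
    my $deadline = time() + $max_seconds;
    my $fbits = '';
    vec($fbits, fileno($sock), 1) = 1;

    while ((my $remaining = $deadline - time()) > 0) {
        my ($rbits, $ebits) = ($fbits, $fbits);
        # select never waits longer than the time left before the deadline
        my $nfound = select($rbits, undef, $ebits, $remaining);
        return 'error' if $nfound < 0;
        next if $nfound == 0;    # no event: loop and re-check the deadline
        return vec($ebits, fileno($sock), 1) ? 'error' : 'readable';
    }
    return 'timeout';
}

# Demonstration on the loopback interface only (no DNS involved).
my $listener = IO::Socket::IP->new(
    LocalHost => '127.0.0.1',
    LocalPort => 0,              # let the OS pick a free port
    Listen    => 1,
    Proto     => 'tcp',
) or die "cannot listen: $@";

my $client = IO::Socket::IP->new(
    PeerHost => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "cannot connect: $@";

my $server_side = $listener->accept or die "accept failed: $!";

# The peer is silent, so the wait ends at the deadline.
print "silent peer: ", wait_readable_until($client, 0.5), "\n";

# Once the peer has written something, the socket is reported as readable.
$server_side->syswrite("hello");
print "after write: ", wait_readable_until($client, 0.5), "\n";
```

A caller could treat 'timeout' the same way as a connection error and return to its normal retry schedule instead of waiting indefinitely.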

masakazuwatanabe commented 3 years ago

To simplify the source code analysis, I verified this with a short script that uses IO::Socket::IP directly, as below.

cd C:\Program Files\FusionInventory-Agent\perl\bin
fusioninventory-agent.exe testcode.pl
use strict;
use warnings;
use Data::Dumper;
use IO::Socket::IP;

my $host = "nodns.fake.com";
my $port = 443;
my $timeout = 3;

my $sock = IO::Socket::IP->new(
    PeerHost => $host ,
    PeerPort => $port ,
    Type => SOCK_STREAM,
    Proto  => "tcp",
    Timeout => $timeout,
) || die "error: create socket.\n";
$sock->blocking(0);

print Dumper($sock);
print "is_connected:" . $sock->connected() . "\n";

my $fbits = '';
vec($fbits, fileno($sock), 1) = 1;
while(1){
    print "------------------------------\n";
    my $rbits = $fbits;
    my $wbits = $fbits;
    my $ebits = $fbits;

    print "BEGIN SELECT.\n";
    my $nfound = select($rbits, $wbits, $ebits, $timeout);
    print "nfound: $nfound\n";
    print "END   SELECT.\n";

    # A non-NUL byte in a bit vector means select flagged the socket.
    if(defined($rbits) && $rbits =~ /[^\0]/){
        print "readable.\n";
    }
    if(defined($wbits) && $wbits =~ /[^\0]/){
        print "writeable.\n";
    }
    if(defined($ebits) && $ebits =~ /[^\0]/){
        print "error.\n";
    }
    sleep 1;
}
masakazuwatanabe commented 3 years ago

A supplementary note: in my environment, getaddrinfo does not return an error code even when resolving nodns.fake.com.

nodns.fake.com

C:\Program Files\FusionInventory-Agent\perl\bin>fusioninventory-agent.exe -e "use Data::Dumper;use Socket qw(getaddrinfo getnameinfo);print Dumper(getaddrinfo('nodns.fake.com', undef, undef))";
$VAR1 = '';

nodns.test.com

C:\Program Files\FusionInventory-Agent\perl\bin>fusioninventory-agent.exe -e "use Data::Dumper;use Socket qw(getaddrinfo getnameinfo);print Dumper(getaddrinfo('nodns.test.com', undef, undef))";
$VAR1 = '';
$VAR2 = {
          'protocol' => 0,
          'family' => 2,
          'socktype' => 0,
          'canonname' => undef,
          'addr' => '   Eャネm        '
        };
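
For reading the two dumps above: Socket's getaddrinfo returns a list whose first element is the error value (empty when no error is reported), followed by one hash per address. In the nodns.fake.com dump only the empty error value is present, so the call reports neither an error nor an address. A caller that wants to catch that case can check both, as in the sketch below. The helper name resolve_or_die is hypothetical, and the example resolves only the numeric address 127.0.0.1 so that it does not depend on any DNS server.

```perl
use strict;
use warnings;
use Socket qw(getaddrinfo getnameinfo SOCK_STREAM NI_NUMERICHOST NIx_NOSERV);

# Hypothetical helper: resolve $host and return its numeric addresses.
# Dies if getaddrinfo reports an error OR returns no address at all.
sub resolve_or_die {
    my ($host) = @_;
    my ($err, @res) = getaddrinfo($host, undef, { socktype => SOCK_STREAM });
    die "cannot resolve $host: $err\n" if $err;
    die "cannot resolve $host: no error reported, but no address returned\n"
        unless @res;

    my @addrs;
    for my $ai (@res) {
        # Convert the packed sockaddr into a printable numeric address.
        my ($nerr, $ip) = getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
        push @addrs, $ip unless $nerr;
    }
    return @addrs;
}

print join(', ', resolve_or_die('127.0.0.1')), "\n";
```

The second die covers the situation seen with nodns.fake.com, where the error value is empty but the result list is empty too.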
g-bougard commented 3 years ago

Hi @masakazuwatanabe

For me, since the SSL-check Perl one-liner wasn't working when run from C:\Program Files\FusionInventory-Agent but does work when run from C:\Program Files\FusionInventory-Agent\perl\bin, something is misbehaving while the required DLLs are being loaded. So the problem could lie with a DLL required by a module handling the socket layer, that is, a DLL other than the libssl one.

As this seems to be related to Perl, and we are using a dedicated Perl installation based on Strawberry Perl, you may try installing the latest Strawberry Perl and running your tests in that environment.

After you have installed Strawberry Perl and rebooted your computer, I would also be interested to know whether the agent stops blocking on DNS resolution when attempting to send its prolog request. The DLLs installed in the Strawberry Perl environment could end up being used by the agent instead of the ones from its own environment, if the Strawberry ones are found in place of any failing one...