glpi-project / glpi-inventory-plugin

GLPI Inventory plugin
GNU Affero General Public License v3.0

Actors keep disappearing from Tasks #538

Closed GuidoWilden closed 2 months ago

GuidoWilden commented 2 months ago

Describe the bug

We set the target as an IP range for both the network discovery and the network inventory task. As the Actor, a single glpi-agent runs on the server and handles remote inventory as well as the network discovery and network inventory tasks. This worked reliably for months; however, the Actor now keeps disappearing from the Task. We have not yet figured out what triggers the disappearance.

To reproduce

  1. Create a Task
  2. Set a Target as an IP range
  3. Set Actor to the glpi-agent
  4. Double-check the Task a day or so later
  5. Actor has vanished

Expected behavior

Once the Task is defined the parameters should remain unchanged.

Operating system

Linux

GLPI Agent version

Nightly build (git version in additional context below)

GLPI version

Other (See additional context below)

GLPIInventory plugin

1.3.5

Additional context

GLPI version 10.0.16, GLPI agent nightly build v1.11-git7b8a5f0b

stonebuzz commented 2 months ago

Hi @GuidoWilden

We've already come across this case. It turned out that the workstation had both the FusionInventory agent and the GLPI inventory agent.

When the FusionInventory agent ran the inventory, it deleted the GLPI agent connected to this workstation (and vice versa).

So if a task was prepared with the GLPI agent, the actor was deleted.

perhaps this is the case for you?

GuidoWilden commented 2 months ago

Hi @stonebuzz, thank you for your response, but no, this is not my case.

The server is fairly new and never had the fusion inventory plugin installed. Sorry ...

stonebuzz commented 2 months ago

And does the computer in question have only one agent installed?

Can you check on the agent (History tab in GLPI) whether it keeps changing PCs?

GuidoWilden commented 2 months ago

That is indeed a little confusing. If I list it on command line I get:

apt list --installed | grep glpi-agent

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

glpi-agent-task-esx/now 1:1.11-git7b8a5f0b all [installed,local]
glpi-agent-task-network/now 1:1.11-git7b8a5f0b all [installed,local]
glpi-agent/now 1:1.11-git7b8a5f0b all [installed,local]

However, in the GUI it appears to flip-flop, or am I missing something?

Screenshot 2024-08-30 at 15 16 04
stonebuzz commented 2 months ago

Here you are on a computer

can you click here

image

and then look here

image

GuidoWilden commented 2 months ago

OK is this what you are after?

Screenshot 2024-08-30 at 15 22 46
stonebuzz commented 2 months ago

If the linked PC does not change to another one, then this is not the case here, and I can't see what would cause the agent to be removed from the task.

stonebuzz commented 2 months ago

Is the task entity the same as the computer entity?

image

GuidoWilden commented 2 months ago

It is, yes, and currently the Actor is also listed. Not sure if this will still be true on Monday.

Screenshot 2024-08-30 at 15 35 56
GuidoWilden commented 2 months ago

Just like I thought: I just checked the Tasks and the Actors have vanished again.

stonebuzz commented 2 months ago

Could the agent in question be getting deleted and then recreated at some point?

GuidoWilden commented 2 months ago

That's not the behaviour I would expect to see. There is only one Agent on our whole estate and it's running on the server, or am I missing something?

g-bougard commented 2 months ago

Hi @GuidoWilden

Did you try to install glpi-agent with AppImage or snap before using the rpm package? If such an installation still exists on the computer, it may also produce this behaviour.

You can still check whether another glpi-agent process is running.

GuidoWilden commented 2 months ago

The server runs Ubuntu so no rpm. We always install via

perl <PACKAGE_NAME> --install --verbose

I can not see any other agent process running

ps -ef | grep agent
root       94086       1  0 Aug29 ?        00:16:36 glpi-agent (tag ldnaz): waiting
root      294118   94086 12 10:17 ?        00:02:18 glpi-agent (tag ldnaz): processing OOB VLAN network scan request
root      298109    3191  0 10:35 pts/3    00:00:00 grep --color=auto agent
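
In the listing above, the second glpi-agent line has the first one (PID 94086) as its parent, so it is a worker forked by the same service rather than a second agent. A minimal sketch of that check, assuming the `ps -ef` column layout shown above; `independent_agents` is a hypothetical helper name, not part of glpi-agent:

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints only the
# glpi-agent processes whose parent is NOT itself a glpi-agent process.
# Assumes the columns UID PID PPID C STIME TTY TIME CMD, as in the listing above.
independent_agents() {
  awk '
    $8 ~ /^glpi-agent/ { parent[$2] = $3; line[$2] = $0 }
    END {
      for (p in parent)
        if (!(parent[p] in parent)) print line[p]
    }
  '
}

# Usage: ps -ef | independent_agents
```

More than one line of output would point to a second, independently started agent.
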
stonebuzz commented 2 months ago

I think the agent is deleted and then recreated. Going back to your screenshot, we can see that it was created on the 29th.

image

Can you show me the agent's history?

GuidoWilden commented 2 months ago

Is that not the moment I installed the latest update?

It looks like this now

Screenshot 2024-09-02 at 10 39 50
stonebuzz commented 2 months ago

So yes, the agent was deleted and then recreated.

Can you run the agent to see if it switches to GLPI?

GuidoWilden commented 2 months ago

Sorry, I don't think I follow. Do you want me to run the agent on the command line? And then observe what exactly?

g-bougard commented 2 months ago

Weird, your first screenshot shows a different deviceid than the second; the timestamp differs: ldnaz-wiki01-2024-04-09-11-46-05 vs ldnaz-wiki01-2024-06-19-09-30-16

This means you actually have 2 agents, one installed in April and one in June, which are claiming the same computer.

Are you using VMs or containers? If yes, are you using a clone which may have the same S/N and/or UUID?
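
The two deviceids quoted above share the hostname and differ only in the trailing timestamp. A small sketch for comparing them, assuming the deviceid has the form <hostname>-YYYY-MM-DD-HH-MM-SS as in these two values; the function names are hypothetical:

```shell
# Hypothetical helpers. Assumption: deviceid = <hostname>-YYYY-MM-DD-HH-MM-SS.
# Strip the trailing timestamp to get the hostname part.
deviceid_host() {
  printf '%s\n' "${1%-????-??-??-??-??-??}"
}

# Strip the hostname part (and the joining dash) to get the timestamp.
deviceid_stamp() {
  host=$(deviceid_host "$1")
  printf '%s\n' "${1#"$host"-}"
}

a="ldnaz-wiki01-2024-04-09-11-46-05"
b="ldnaz-wiki01-2024-06-19-09-30-16"

# Same hostname but different full deviceid: two agent identities, one host.
if [ "$(deviceid_host "$a")" = "$(deviceid_host "$b")" ] && [ "$a" != "$b" ]; then
  echo "same host, different deviceid: $(deviceid_stamp "$a") vs $(deviceid_stamp "$b")"
fi
```
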

stonebuzz commented 2 months ago

Do you want me to run the agent on command line?

Yes

And then observe what exactly?

Whether the Agent ID changes

image

stonebuzz commented 2 months ago

And whether you get errors in the GLPI log files (/glpi/files/_log/php-errors.log or sql-errors.log).

stonebuzz commented 2 months ago

What is the result of the following commands:

sudo updatedb 
locate agent.cgf

Could you have cloned the server?
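
The file name in the command above is presumably meant to be agent.cfg. On systems where locate and updatedb are not installed, find can do the same search. A minimal sketch under those assumptions; `find_agent_configs` is a hypothetical helper and the directories in the usage line are only examples:

```shell
# Hypothetical helper: list every file named agent.cfg below the given
# directories. Assumption: agent.cfg is the intended config file name.
# -xdev keeps find on the same filesystem; errors (e.g. permission denied)
# are discarded.
find_agent_configs() {
  find "$@" -xdev -type f -name 'agent.cfg' 2>/dev/null
}

# Usage (as root, example directories): find_agent_configs /etc /opt /usr/local
```

More than one hit would suggest a leftover installation alongside the packaged agent.
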

GuidoWilden commented 2 months ago

  1. OK, it's a Hyper-V VM running Ubuntu, so no container.

  2. Commands:

sudo updatedb 
sudo: updatedb: command not found

locate agent.cgf returns nothing.

  3. When you say cloned server, I'm not too sure what you mean. The server has been restored from a snapshot when I broke things. Not sure if this is what you mean.

  4. When I run the agent on the command line, the ID does indeed change:

Screenshot 2024-09-02 at 11 23 02

What could be the cause for this?

stonebuzz commented 2 months ago

Either there are actually 2 agents on the same machine, or there are 2 cloned machines with the same S/N and/or UUID.

GuidoWilden commented 2 months ago

I can see only one Agent:

root@ldnaz-wiki01:/var/log/glpi# apt list --installed | grep agent

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

glpi-agent-task-esx/now 1:1.11-git7b8a5f0b all [installed,local]
glpi-agent-task-network/now 1:1.11-git7b8a5f0b all [installed,local]
glpi-agent/now 1:1.11-git7b8a5f0b all [installed,local]
gpg-agent/jammy-updates,jammy-security,now 2.2.27-3ubuntu2.1 amd64 [installed,automatic]
libpolkit-agent-1-0/jammy,now 0.105-33 amd64 [installed,automatic]
lxd-agent-loader/jammy,now 0.5 all [installed,automatic]

A search by serial number finds only one machine, and from what I can see I cannot search by UUID.

trasher commented 2 months ago

[...] This means you actually have 2 agents, one installed in April and one in June, which are claiming the same computer.

This does not seem to be a bug, so I am closing the issue.

GuidoWilden commented 2 months ago

Today I restored a backup/snapshot from July 27th that was seemingly not affected by the problem. However, about four hours later I am faced with the exact same problem. I have been singing the praises of GLPI for quite some time now, but I am close to giving up on the product. It's just too fragile and temperamental to be used in production as it stands.

stonebuzz commented 2 months ago

This sounds like a fairly complex problem to solve. Perhaps you should consider a professional subscription, which would allow us (under contract) to access your GLPI instance, infrastructure, etc. and help you under better conditions.

stonebuzz commented 2 months ago

Do you have errors in the GLPI log files (/glpi/files/_log/php-errors.log or sql-errors.log) when the agent pushes the inventory?

GuidoWilden commented 2 months ago

I have been searching for clones with UUID and serial no. but even if I use only snippets of the two I can't find anything. There is guaranteed only one agent installed.

I tried tracing the logs but nothing sticks out to me. Filtered logs only for today attached. Archive.zip

trasher commented 2 months ago

Today I restored a backup/snapshot from July 27th that was seemingly not affected by the problem. However, about four hours later I am faced with the exact same problem. I have been singing the praises of GLPI for quite some time now, but I am close to giving up on the product. It's just too fragile and temperamental to be used in production as it stands.

If the issue is caused by a dual agent installation on one or several computers, restoring the GLPI database won't change anything.

From the information you provided, there is no bug on the plugin or inventory side, but rather something wrong in your infrastructure setup. There are a lot of errors in your logs, but nothing that seems directly related to your initial issue.

Also, you state there is no duplicate on your side, but @g-bougard wrote:

Weird, your first screenshot shows a different deviceid than the second; the timestamp differs: ldnaz-wiki01-2024-04-09-11-46-05 vs ldnaz-wiki01-2024-06-19-09-30-16

So it seems there is something wrong/changed on your side.