The top-level client.db holds the client identity, among other things. You should not copy it when cloning.
When a client is linked to an account, it will pick up the user, team, passkey, and cause.
You should not copy the work directory either.
Cloning is a bad idea in general.
I believe the mechanism for partially configuring new clients in a mass deployment is to craft and pre-install a shared config.xml. Nothing else is safe.
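For the Linux case, a rough sketch of prepping a machine before taking the image might look like this (the /var/lib/fah-client path is the one mentioned later in this thread; the fah-client service name and the work/ directory name are assumptions, so adjust to your install):
sudo systemctl stop fah-client                 # assumed systemd unit name
sudo rm -f /var/lib/fah-client/client.db       # per-machine identity; never clone this
sudo rm -rf /var/lib/fah-client/work           # in-progress work units (assumed directory name); never clone these
# leave a shared config.xml in place if you want clones to pick up common settings
Then take the image; each new machine should generate its own identity on first start.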
In future I'll just uninstall/reinstall Folding after the clone. That seems to have fixed it this time.
Oddly the previous Beta didn't mind.
The previous beta did not have accounts.
I thought it still would have confused things by asking for a work unit when the server thought it already had one, as it thinks it's the same machine. Or did the server assume I had twice as many CPU cores and GPUs as I did?
One of the two clones is now not showing up on the other machines' list on web control. Not sure why. I'll try reinstalling and making sure the items you said above have been deleted. Even when I told the uninstaller to delete "data", it remembered my account; or was that the browser?
I don't understand why the reinstall worked then failed by the next day.
The browser stores your login.
I don’t know how the ID was used previously. Or by servers.
I just rebooted both machines, and now they both appear. I don't like intermittent problems.
Agreed.
There has been a problem with clients disconnecting from the node. It seems to mostly be on Windows around zero hours UTC. The cause has not been tracked down yet. A client restart is sufficient to reconnect.
So it's just coincidence it was one of the cloned machines? I have 6 other machines which aren't cloned, and they haven't disconnected.
I did try restarting both Folding clients; that didn't fix it, and I had to restart Windows.
Interesting. Joseph might know if it's more than coincidence.
And last night two different machines did it, while the cloned pair stayed on ok. Looks like it's nothing to do with the type of machine.
To get them back on, logging out and in on the machine which had disappeared didn't help; I had to exit and relaunch Folding.
Cloning is a bad idea in general.
It's a common thing for those of us with many machines. Folding should be able to spot there are two identical machines, and get them to regenerate an ID.
I was speaking of just cloning the fah data directory.
For a client to know it has a duplicate id, it would probably need to also use some hardware id. I don't think @jcoffland wants to do that.
Some people have the data directory on a flash drive, and use it on different machines, one at a time. Any duplicate detection scheme would have to account for that. Ideas are welcome.
I think the detection should come from the server when two machines connect at once claiming to be the same one.
Ah, so just a collision detection? Ask duplicate clients to regenerate their id?
Looks like the client id is sent when requesting work. I don't know if the work upload client id needs to match.
Yes, collision detection. The current alternative of uninstalling and reinstalling destroys the current job anyway.
Collision detection of machine name might also be useful.
A client's F@H ID is computed from its RSA key pair, specifically from its public key. If you copy the client DB then it will have the same ID and appear to be the same machine on the F@H network.
This could be a problem for others, since cloning VMs or Docker instances is quite common. One solution is to make sure to delete /var/lib/fah-client/client.db before cloning a F@H instance. Another possibility would be for the client to try to detect the machine ID and regenerate its ID if it detects a change in the ID of the machine it's on.
The difficulty with the latter solution is that Linux, Windows and macOS all have different methods for acquiring a unique machine ID.
Linux: We could use /etc/machine-id. However, if you clone a Linux machine you could also copy this file, but then maybe it's the user's fault.
Windows: Maybe HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid.
macOS: Maybe ioreg -ad2 -c IOPlatformExpertDevice | plutil -extract IORegistryEntryChildren.0.IOPlatformUUID raw -
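For reference, reading each of those from a shell looks roughly like this; these are just the standard OS tools named above, nothing F@H-specific:
Linux:
cat /etc/machine-id
Windows (cmd or PowerShell):
reg query "HKLM\SOFTWARE\Microsoft\Cryptography" /v MachineGuid
macOS:
ioreg -ad2 -c IOPlatformExpertDevice | plutil -extract IORegistryEntryChildren.0.IOPlatformUUID raw -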
Different machines with the same machine-id are in error:
https://man7.org/linux/man-pages/man5/machine-id.5.html
For operating system images which are created once and used on multiple machines, for example for containers or in the cloud, /etc/machine-id should be either missing or an empty file in the generic file system image (the difference between the two options is described under "First Boot Semantics" below). An ID will be generated during boot and saved to this file if possible. [...]
Before booting into the cloned system, truncate /etc/machine-id (truncate -s 0 …) and a new one will be automatically generated next time.
I assume cloning software knows this and removes the file? Mind you, it doesn't bother removing the Windows network name, so maybe not. Until you run both at once, Windows doesn't do anything about it. Handling both of these should be a basic part of the cloning software's job when making the clone.
I assume cloning software knows this and removes the file?
I have no idea.
Running this once the cloned system has booted will probably work too:
sudo truncate -s 0 /etc/machine-id
sudo reboot
(I recommend truncating rather than removing it to preserve extended attributes like SELinux context)
Sudo :-) Don't we all log in as root? Makes things so much easier.
Agreed.
There has been a problem with clients disconnecting from the node. It seems to mostly be on Windows around zero hours UTC. The cause has not been tracked down yet. A client restart is sufficient to reconnect.
Just saw it happen at midnight UTC. Everything disconnected, then all but one reconnected. Some kinda reset is going on.
-- never mind -- I think the client.db key is only used to link the machine to the account via the node server. No big deal if it is lost, I think.
I think the new 8.3.6 code might have a surprise effect that causes a work unit to be lost/dumped.
I'm assuming that turning in a work unit requires having access to the private key used to request it.
If someone uses 8.3.6, gets a new work unit, and then uses that same client.db on another 8.3.6, then the old RSA key used to turn in the work unit will be deleted.
If you cause the client to change keys then it will have a different ID and will not be able to continue any preexisting WUs. This is intentional. Say, for example, you copy a fah-client install to a new machine. The client will detect that it's on a new machine and generate a new key. Then it will discard any WUs that were from the old client ID. The original copy can continue to run as normal.
I suppose the downside is if you want to move a client to a new machine and finish its WUs on the new machine, but this is just not a scenario we support. If you really wanted to do it, you could modify the machine-id in the client.db to match the new machine. This would prevent the client from generating a new key.
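As an aside, this isn't the actual F@H protocol, but a generic openssl sketch of the underlying idea that results requested under one key pair can't be turned in under a different one (all file names here are made up):
openssl genrsa -out client_key.pem 2048                                    # stand-in for the client's identity key
openssl rsa -in client_key.pem -pubout -out client_pub.pem
openssl dgst -sha256 -sign client_key.pem -out results.sig results.dat     # "upload" signed with the private key
openssl dgst -sha256 -verify client_pub.pem -signature results.sig results.dat
A regenerated key pair, i.e. a new client ID, would fail that last check; if the assumption above about needing the original private key is right, that's why WUs from the old ID can't be continued after a key change.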
Should be fixed at least as of v8.3.16.
I have two almost identical machines, I cloned one from the other. With the old Beta it was ok, but the new Beta thinks they're the same machine. Where is this info stored?