While investigating https://github.com/dotmesh-io/ds-runner/issues/31, we found a situation where a pull hung forever.
A previous attempt to pull the dot had left it created on disk but absent from the KV store.
During startup, Dotmesh found the dot on disk and created an fsmachine for it, but the fsmachine went into the failed state because the dot could not be located in the KV store. The dot wasn't in the registry, so it was invisible to normal observers, but its fsid was claimed by the failed fsmachine.
Later attempts to pull the dot were routed to the failed fsmachine, which never picked up the request because it was in the failed state, so the pull just hung.
Possible things to improve:
Better handling in general for a pull when the fsid already exists.
dm clone remote dot --local-name a ; dm clone remote dot --local-name b
also fails messily. We can detect this case and abort. See https://github.com/dotmesh-io/dotmesh/issues/674
Better handling for things found in ZFS but not in the KV store. The options are: store the KV metadata in ZFS as well, so it can always be recovered; delete things found in ZFS that aren't in the KV store, so they don't linger as zombies (ideal for runners that just cache dots, less so for central hubs); bring them back in a "lost+found" state in the admin namespace, with names based on the fsid; or make the choice between the last two configurable.
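As a rough illustration of the two improvements proposed in this issue (aborting a pull whose fsid already exists locally, and a configurable policy for dots found in ZFS but missing from the KV store), here is a minimal sketch in Go. It is not Dotmesh code: `Registry`, `OrphanPolicy`, `Reconcile`, `CheckPull` and the `admin/lost+found-<fsid>` naming are all hypothetical, and the KV store and ZFS are replaced by in-memory stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// OrphanPolicy selects what to do with a filesystem that exists on disk
// but has no entry in the KV store. Hypothetical type, for illustration only.
type OrphanPolicy int

const (
	DeleteOrphans OrphanPolicy = iota // suits runners that only cache dots
	LostAndFound                      // suits central hubs: keep the data and re-register it
)

// Registry is an in-memory stand-in for the KV store: fsid -> "namespace/name".
type Registry map[string]string

// ErrFsidExists marks a pull that cannot proceed because the fsid is already local.
var ErrFsidExists = errors.New("filesystem id already exists locally")

// Reconcile applies the policy to every on-disk fsid missing from the registry.
// It returns the fsids that should be destroyed; with LostAndFound it instead
// registers each orphan under an assumed "admin/lost+found-<fsid>" name.
func Reconcile(onDisk []string, reg Registry, policy OrphanPolicy) (toDelete []string) {
	for _, fsid := range onDisk {
		if _, known := reg[fsid]; known {
			continue // healthy: present both on disk and in the registry
		}
		switch policy {
		case DeleteOrphans:
			toDelete = append(toDelete, fsid)
		case LostAndFound:
			reg[fsid] = "admin/lost+found-" + fsid
		}
	}
	return toDelete
}

// CheckPull fails fast when the fsid is already present locally, instead of
// handing the request to a state machine that may never pick it up.
func CheckPull(fsid string, reg Registry, onDisk map[string]bool) error {
	if name, ok := reg[fsid]; ok {
		return fmt.Errorf("%w: %s is registered as %s", ErrFsidExists, fsid, name)
	}
	if onDisk[fsid] {
		return fmt.Errorf("%w: %s is on disk but not in the KV store", ErrFsidExists, fsid)
	}
	return nil
}

func main() {
	reg := Registry{"fs-1": "admin/a"}
	disk := map[string]bool{"fs-1": true, "fs-2": true}

	// fs-2 is the zombie from this issue: on disk, unknown to the KV store.
	fmt.Println("pull fs-2:", CheckPull("fs-2", reg, disk))

	// A hub-style policy brings it back under a lost+found name.
	Reconcile([]string{"fs-1", "fs-2"}, reg, LostAndFound)
	fmt.Println("fs-2 now registered as:", reg["fs-2"])
}
```

The point of the sketch is that both checks return an explicit result instead of leaving the request queued on a failed fsmachine. Where such checks would live in Dotmesh is left open here.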